Using Web Crawling to Collect and Output Java Dependency Libraries
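Contents
1 Introduction
1.1 Background
1.2 Comparison with the existing method
2 Implementation
2.1 Configuring the libraries to query in a configuration file
2.2 Getting the latest version number
2.3 Version number parsing algorithm
2.4 Getting dependency information
2.5 Dependency information parsing algorithm
2.6 Outputting the dependency information
3 Operating steps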
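1 Introduction
1.1 Background
Java has a large number of dependency libraries, and their versions are updated constantly. During product development, adopting a new dependency library means updating the version information of the libraries it depends on, and a single change often ripples through the whole project. At present the company does this by hand: someone looks up each library by its artifact on https://mvnrepository.com/ and then writes the results into the Java configuration file. This is slow, laborious, and error-prone, and because versions keep changing, the dependency configuration file has to be updated frequently, which wastes a great deal of time. To solve this problem, the tool uses web crawling to retrieve the version information from the web pages, extracts the dependency information, and automatically outputs the dependencies both in the XML format required by the pom of the Java configuration file and in the format required by the ReadMe.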
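1.2 Comparison with the existing method
Manual lookup is error-prone and time-consuming. Querying with a tool is fast and accurate, and the results are output directly in the format of the Java configuration file, so nothing has to be organized by hand. When versions are updated, the tool needs only a few seconds to complete a task that would take days of manual lookup.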
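2 Implementation
2.1 Configuring the libraries to query in a configuration file
The format is the same as in the Java configuration file: each entry consists of groupId, artifactId, and an optional version. If a version is specified, the tool queries that version of the library; if not, it queries the latest version. The configuration file looks like this: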
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<version>1.5.19.RELEASE</version>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
<dependency>
<groupId>com.github.pagehelper</groupId>
<artifactId>pagehelper-spring-boot-starter</artifactId>
</dependency>
</dependencies>
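As a minimal sketch of how such a configuration file could be read, the following C++ pulls each dependency block out of the text with plain string searches and leaves version empty when the tag is missing. ConfigEntry, TagText, and ParseConfig are hypothetical names used for illustration and are not part of the tool's source; a real implementation might prefer an XML parser.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one <dependency> entry of the configuration file.
struct ConfigEntry {
    std::string groupId;
    std::string artifactId;
    std::string version;  // empty when no <version> tag is configured
};

// Returns the text between <tag> and </tag> inside block, or "" if absent.
static std::string TagText(const std::string& block, const std::string& tag) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    std::size_t begin = block.find(open);
    if (begin == std::string::npos) return "";
    begin += open.size();
    const std::size_t end = block.find(close, begin);
    if (end == std::string::npos) return "";
    return block.substr(begin, end - begin);
}

// Collects every <dependency>...</dependency> block found in xml.
std::vector<ConfigEntry> ParseConfig(const std::string& xml) {
    std::vector<ConfigEntry> entries;
    const std::string open = "<dependency>";
    const std::string close = "</dependency>";
    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::size_t end = xml.find(close, pos);
        if (end == std::string::npos) break;  // unterminated block: stop
        const std::string block = xml.substr(pos, end - pos);
        entries.push_back({TagText(block, "groupId"),
                           TagText(block, "artifactId"),
                           TagText(block, "version")});
        pos = end + close.size();
    }
    return entries;
}
```

An entry whose version comes back empty is the case in which the latest version has to be looked up on the website.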
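2.2 Getting the latest version number
The tool reads the libraries to query from the configuration file together with their version information. For any library without a configured version, it queries https://mvnrepository.com/ for the latest version number.
The request URL has the form https://mvnrepository.com/artifact/ + groupId + / + artifactId, for example https://mvnrepository.com/artifact/org.springframework.boot/spring-boot-starter-web. It returns the artifact's overview page, which lists the available versions.
The request returns the page content as a string. By examining the node names and layout of that string, a parsing algorithm can be designed to extract the latest version. In this example the latest version is 2.1.5.RELEASE.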
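The URL scheme described above can be sketched as a small helper. BuildArtifactUrl is a hypothetical name, not a function from the tool; the optional version argument produces the per-version page that section 2.4 requests.

```cpp
#include <string>

// Builds the page URL for an artifact. When version is empty the URL points
// at the artifact overview page, otherwise at the page of that version.
std::string BuildArtifactUrl(const std::string& groupId,
                             const std::string& artifactId,
                             const std::string& version = "") {
    std::string url = "https://mvnrepository.com/artifact/" + groupId + "/" + artifactId;
    if (!version.empty()) {
        url += "/" + version;
    }
    return url;
}
```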
2.3 Version number parsing algorithm
int MvnRepository::ParseNewestVersion(string strResponse, Dependence& dep)
{
size_t pos = strResponse.find("License</th><td><span class="); // size_t matches the type of string::npos
string strTemp = "";
if (pos!= string::npos)
{
strResponse = strResponse.substr(pos);
strTemp = strResponse.substr(0, 60);
pos = strTemp.find("b lic");
while (pos!= string::npos)
{
strResponse = strResponse.substr(pos + 7);
pos = strResponse.find("<");
if (pos== string::npos)
{
break;
}
strTemp = strResponse.substr(0, pos);
dep.vecLicense.push_back(strTemp);
strTemp = strResponse.substr(0, 60);
pos= strTemp.find("b lic");
}
}
pos = strResponse.find("Categories</th><td>");
if (pos!=string::npos)
{
strResponse = strResponse.substr(pos);
strTemp = strResponse.substr(0, 120);
pos = strTemp.find("b c");
while (pos != string::npos)
{
strResponse = strResponse.substr(pos + 5);
pos = strResponse.find("<");
if (pos == string::npos)
{
break;
}
strTemp = strResponse.substr(0, pos);
dep.vecLicense.push_back(strTemp); // category names share vecLicense; the Readme output lists categories and licenses together
strTemp = strResponse.substr(0, 60);
pos = strTemp.find("b c");
}
}
pos = strResponse.find("vbtn release");
if (pos == string::npos)
{
LOGIC_ERROR("find vbtn release failed");
return HPR_ERROR;
}
strResponse = strResponse.substr(pos + 14);
pos = strResponse.find("<");
if (pos == string::npos)
{
LOGIC_ERROR("find < failed");
return HPR_ERROR;
}
dep.strNewestVersion = strResponse.substr(0, pos);
if (dep.strCurrentVersion=="")
{
dep.strCurrentVersion = dep.strNewestVersion;
}
return HPR_OK;
}
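The function above repeats one pattern: find a marker, skip a fixed number of characters, then cut at the next "<". The offset 14 after "vbtn release" equals the 12 characters of the marker plus the two characters "> that close the tag. A minimal sketch of the same idea without hard-coded offsets follows; ExtractTextAfter is a hypothetical helper, and the HTML fragment in the usage note is an assumed shape inferred from the offsets, not a copy of the real page.

```cpp
#include <string>

// Finds marker, then returns the text between the next '>' and the
// following '<'. Returns false when the marker or a delimiter is missing.
bool ExtractTextAfter(const std::string& html, const std::string& marker,
                      std::string& out) {
    std::size_t pos = html.find(marker);
    if (pos == std::string::npos) return false;
    pos = html.find('>', pos + marker.size());  // end of the opening tag
    if (pos == std::string::npos) return false;
    const std::size_t end = html.find('<', pos + 1);  // start of the closing tag
    if (end == std::string::npos) return false;
    out = html.substr(pos + 1, end - pos - 1);
    return true;
}
```

For an assumed fragment such as <a class="vbtn release">2.1.5.RELEASE</a>, calling ExtractTextAfter with the marker "vbtn release" stores 2.1.5.RELEASE in out.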
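2.4 Getting dependency information
Once the version number is known, a second request fetches the dependency information, for example https://mvnrepository.com/artifact/org.springframework.boot/spring-boot-starter-web/2.1.5.RELEASE. The Compile Dependencies section of the page is parsed from the response string in the same way, and the extracted entries are saved.
2.5 Dependency information parsing algorithm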
int MvnRepository::ParseDependences(string strResponse,map<string, Dependence>& mapDependence)
{
int iReval = HPR_ERROR;
size_t pos = string::npos; // size_t matches the type of string::npos
do
{
if (strResponse=="")
{
break;
}
pos = strResponse.find("Compile Dependencies");
if (pos== string::npos)
{
break;
}
strResponse = strResponse.substr(pos);
pos = strResponse.find("Test Dependencies");
if (pos != string::npos)
{
strResponse=strResponse.substr(0, pos);
}
pos = strResponse.find(" class=\"b ");
if (pos== string::npos)
{
pos= strResponse.find("vbtn release");
}
string strtemp = "";
while (pos!= string::npos)
{
Dependence dep;
pos = strResponse.find(" class=\"b ");
if (pos!= string::npos)
{
strResponse = strResponse.substr(pos);
strtemp = strResponse.substr(0, 60);
while (strtemp.find(" class=\"b ") != string::npos)
{
pos = strtemp.find(" class=\"b ");
strResponse = strResponse.substr(pos);
pos = strResponse.find(">");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find(>) failed");
break;
}
strResponse = strResponse.substr(pos + 1);
pos = strResponse.find("<");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find(<) failed");
break;
}
strtemp = strResponse.substr(0, pos);
dep.vecLicense.push_back(strtemp);
LOGIC_TRACE("vecLicense:%s", strtemp.c_str());
strtemp = strResponse.substr(0, 60);
}
}
pos = strResponse.find("vbtn release");
if (pos== string::npos)
{
LOGIC_ERROR("strResponse.find vbtn release failed!");
break;
}
strResponse = strResponse.substr(pos + 30);
pos = strResponse.find("\">");
if (pos== string::npos)
{
LOGIC_ERROR("strResponse.find \"> failed!");
break;
}
strtemp = strResponse.substr(0, pos);
strResponse = strResponse.substr(pos);
pos = strtemp.find("/");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find / failed!");
break;
}
dep.strGroupid = strtemp.substr(0, pos);
LOGIC_TRACE("strGroupid:%s", dep.strGroupid.c_str());
strtemp = strtemp.substr(pos + 1);
pos = strtemp.find("/");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find / failed!");
break;
}
dep.strArtifact = strtemp.substr(0, pos);
LOGIC_TRACE("strArtifact:%s", dep.strArtifact.c_str());
if (dep.strArtifact=="sqljet")
{
int i = 0;
}
strtemp = strtemp.substr(pos + 1);
dep.strCurrentVersion = strtemp;
LOGIC_TRACE("strCurrentVersion:%s", dep.strCurrentVersion.c_str());
strtemp = strResponse.substr(0,120);
pos = strtemp.find("vbtn release");
/*if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find vbtn release failed!");
break;
}
strResponse = strResponse.substr(pos);*/
//pos = strResponse.find(dep.strArtifact);
if (pos != string::npos)
{
strResponse = strResponse.substr(pos);
pos = strResponse.find("\">");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find \"> failed!");
break;
}
strResponse = strResponse.substr(pos + 2);
pos= strResponse.find("<");
if (pos== string::npos)
{
LOGIC_ERROR("strResponse.find < failed!");
break;
}
strtemp = strResponse.substr(0, pos);
dep.strNewestVersion = strtemp;
LOGIC_TRACE("strNewestVersion:%s", dep.strNewestVersion.c_str());
}
mapDependence[dep.strArtifact + dep.strCurrentVersion]=dep;
pos= strResponse.find(" class=\"b ");
if (pos==string::npos)
{
pos = strResponse.find("vbtn release");
}
}
iReval = HPR_OK;
} while (0);
return iReval;
}
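In the function above, the text between the link marker and the closing "> is treated as groupId/artifactId/version and split at the two slashes. The offset 30 is consistent with skipping the marker plus a following " href="/artifact/ (12 + 18 characters), which is an inference from the code rather than a confirmed copy of the page markup. The splitting step on its own can be sketched as follows; Coordinates and SplitCoordinates are hypothetical names.

```cpp
#include <string>

// Hypothetical holder for the three parts of a dependency path.
struct Coordinates {
    std::string groupId;
    std::string artifactId;
    std::string version;
};

// Splits "groupId/artifactId/version" at its first two slashes.
// Returns false when fewer than two slashes are present.
bool SplitCoordinates(const std::string& path, Coordinates& out) {
    const std::size_t first = path.find('/');
    if (first == std::string::npos) return false;
    const std::size_t second = path.find('/', first + 1);
    if (second == std::string::npos) return false;
    out.groupId = path.substr(0, first);
    out.artifactId = path.substr(first + 1, second - first - 1);
    out.version = path.substr(second + 1);
    return true;
}
```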
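2.6 Outputting the dependency information
After the dependency information has been parsed, it is written to files in the format of the Java configuration file.
1) The pom.xml output has the following format: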
<?xml version="1.0" encoding="UTF-8" ?><output>
<properties>
<cdi-api.version>1.0</cdi-api.version>
<ejb-api.version>3.0</ejb-api.version>
<guava.version>19.0</guava.version>
<javaslang.version>2.0.6</javaslang.version>
<javax.annotation-api.version>1.3</javax.annotation-api.version>
<javax.servlet-api.version>3.0.1</javax.servlet-api.version>
<joda-time.version>2.10.1</joda-time.version>
<json-path.version>2.4.0</json-path.version>
<kotlin-reflect.version>1.2.71</kotlin-reflect.version>
<kotlin-stdlib.version>1.2.71</kotlin-stdlib.version>
<mybatis-spring-boot-starter.version>1.3.2</mybatis-spring-boot-starter.version>
<pagehelper-spring-boot-autoconfigure.version>1.2.10</pagehelper-spring-boot-autoconfigure.version>
<pagehelper-spring-boot-starter.version>1.2.10</pagehelper-spring-boot-starter.version>
<pagehelper.version>5.1.8</pagehelper.version>
<querydsl-apt.version>4.2.1</querydsl-apt.version>
<querydsl-collections.version>4.2.1</querydsl-collections.version>
<querydsl-core.version>4.2.1</querydsl-core.version>
<reactor-core.version>3.2.6.RELEASE</reactor-core.version>
<rxjava-reactive-streams.version>1.2.1</rxjava-reactive-streams.version>
<rxjava.version>1.3.8</rxjava.version>
<rxjava.version>2.2.6</rxjava.version>
<scala-library.version>2.11.7</scala-library.version>
<spring-boot-starter.version>2.0.1.RELEASE</spring-boot-starter.version>
<spring-data-commons.version>2.1.5.RELEASE</spring-data-commons.version>
<spring-hateoas.version>0.25.1.RELEASE</spring-hateoas.version>
<threetenbp.version>1.3.8</threetenbp.version>
<vavr.version>0.9.3</vavr.version>
<xmlprojector.version>1.4.15</xmlprojector.version>
</properties>
<dependencies>
<dependency>
<groupId>javax.enterprise</groupId>
<artifactId>cdi-api</artifactId>
<version>${cdi-api.version}</version>
</dependency>
<dependency>
<groupId>javax.ejb</groupId>
<artifactId>ejb-api</artifactId>
<version>${ejb-api.version}</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${guava.version}</version>
</dependency>
<dependency>
<groupId>io.javaslang</groupId>
<artifactId>javaslang</artifactId>
<version>${javaslang.version}</version>
</dependency>
<dependency>
<groupId>javax.annotation</groupId>
<artifactId>javax.annotation-api</artifactId>
<version>${javax.annotation-api.version}</version>
</dependency>
<dependency>
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
<version>${javax.servlet-api.version}</version>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>${joda-time.version}</version>
</dependency>
<dependency>
<groupId>com.jayway.jsonpath</groupId>
<artifactId>json-path</artifactId>
<version>${json-path.version}</version>
</dependency>
<dependency>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-reflect</artifactId>
<version>${kotlin-reflect.version}</version>
</dependency>
<dependency>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-stdlib</artifactId>
<version>${kotlin-stdlib.version}</version>
</dependency>
<dependency>
<groupId>org.mybatis.spring.boot</groupId>
<artifactId>mybatis-spring-boot-starter</artifactId>
<version>${mybatis-spring-boot-starter.version}</version>
</dependency>
<dependency>
<groupId>com.github.pagehelper</groupId>
<artifactId>pagehelper-spring-boot-autoconfigure</artifactId>
<version>${pagehelper-spring-boot-autoconfigure.version}</version>
</dependency>
<dependency>
<groupId>com.github.pagehelper</groupId>
<artifactId>pagehelper-spring-boot-starter</artifactId>
<version>${pagehelper-spring-boot-starter.version}</version>
</dependency>
<dependency>
<groupId>com.github.pagehelper</groupId>
<artifactId>pagehelper</artifactId>
<version>${pagehelper.version}</version>
</dependency>
<dependency>
<groupId>com.querydsl</groupId>
<artifactId>querydsl-apt</artifactId>
<version>${querydsl-apt.version}</version>
</dependency>
<dependency>
<groupId>com.querydsl</groupId>
<artifactId>querydsl-collections</artifactId>
<version>${querydsl-collections.version}</version>
</dependency>
<dependency>
<groupId>com.querydsl</groupId>
<artifactId>querydsl-core</artifactId>
<version>${querydsl-core.version}</version>
</dependency>
<dependency>
<groupId>io.projectreactor</groupId>
<artifactId>reactor-core</artifactId>
<version>${reactor-core.version}</version>
</dependency>
<dependency>
<groupId>io.reactivex</groupId>
<artifactId>rxjava-reactive-streams</artifactId>
<version>${rxjava-reactive-streams.version}</version>
</dependency>
<dependency>
<groupId>io.reactivex</groupId>
<artifactId>rxjava</artifactId>
<version>${rxjava.version}</version>
</dependency>
<dependency>
<groupId>io.reactivex.rxjava2</groupId>
<artifactId>rxjava</artifactId>
<version>${rxjava.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala-library.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
<version>${spring-boot-starter.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-commons</artifactId>
<version>${spring-data-commons.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.hateoas</groupId>
<artifactId>spring-hateoas</artifactId>
<version>${spring-hateoas.version}</version>
</dependency>
<dependency>
<groupId>org.threeten</groupId>
<artifactId>threetenbp</artifactId>
<version>${threetenbp.version}</version>
</dependency>
<dependency>
<groupId>io.vavr</groupId>
<artifactId>vavr</artifactId>
<version>${vavr.version}</version>
</dependency>
<dependency>
<groupId>org.xmlbeam</groupId>
<artifactId>xmlprojector</artifactId>
<version>${xmlprojector.version}</version>
</dependency>
</dependencies>
</output>
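The output code itself is not shown in this article. As a minimal sketch, assuming the naming rule visible in the sample (a property called artifactId.version that the matching dependency references), the two kinds of lines could be produced like this. PropertyLine and DependencyBlock are hypothetical names.

```cpp
#include <string>

// One line of the <properties> section, e.g.
// <guava.version>19.0</guava.version>
std::string PropertyLine(const std::string& artifactId,
                         const std::string& version) {
    const std::string name = artifactId + ".version";
    return "<" + name + ">" + version + "</" + name + ">";
}

// One <dependency> block whose version refers to the property above.
std::string DependencyBlock(const std::string& groupId,
                            const std::string& artifactId) {
    return "<dependency>\n"
           "<groupId>" + groupId + "</groupId>\n"
           "<artifactId>" + artifactId + "</artifactId>\n"
           "<version>${" + artifactId + ".version}</version>\n"
           "</dependency>\n";
}
```

Note that the sample output above contains two rxjava.version properties (1.3.8 and 2.2.6), because io.reactivex and io.reactivex.rxjava2 both publish an artifact named rxjava. A property name derived from the artifactId alone cannot tell them apart; including the groupId in the name would be one way to avoid the collision.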
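2) The Readme output has the following format. In the tool's output, 引入 means "pulls in" and 最新版 means "latest version"; the value after 最新版 is left empty when the parser finds no newer version on the page.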
## com.github.pagehelper/pagehelper-spring-boot-starter/1.2.10(1.2.10)/MIT
-引入: mybatis-spring-boot-starter (org.mybatis.spring.boot)/ 1.3.2(最新版 2.0.0)/ Apache 2.0
-引入: pagehelper-spring-boot-autoconfigure (com.github.pagehelper)/ 1.2.10(最新版 )/ MIT
-引入: pagehelper (com.github.pagehelper)/ 5.1.8(最新版 )/ MIT
-引入: spring-boot-starter (org.springframework.boot)/ 2.0.1.RELEASE(最新版 2.1.3.RELEASE)/ Apache 2.0
## org.springframework.data/spring-data-commons/2.1.5.RELEASE(2.1.5.RELEASE)/Apache 2.0
-引入: cdi-api (javax.enterprise)/ 1.0(最新版 )/ Dep Injection,Apache 2.0
-引入: ejb-api (javax.ejb)/ 3.0(最新版 )/ Java Spec,CDDL 1.1
-引入: guava (com.google.guava)/ 19.0(最新版 27.1-jre)/ JSON Lib,Apache 2.0
-引入: javaslang (io.javaslang)/ 2.0.6(最新版 0.10.0)/ Functional Programming,Apache 2.0
-引入: javax.annotation-api (javax.annotation)/ 1.3(最新版 1.3.2)/ Java Spec,CDDL,GPL 2.0
-引入: javax.servlet-api (javax.servlet)/ 3.0.1(最新版 4.0.1)/ Java Spec,CDDL,GPL 2.0
-引入: joda-time (joda-time)/ 2.10.1(最新版 )/ Date/Time,Apache 2.0
-引入: json-path (com.jayway.jsonpath)/ 2.4.0(最新版 )/ JSON Lib,Apache 2.0
-引入: kotlin-reflect (org.jetbrains.kotlin)/ 1.2.71(最新版 1.3.21)/ Reflection,Apache 2.0
-引入: kotlin-stdlib (org.jetbrains.kotlin)/ 1.2.71(最新版 1.3.21)/ JVM Languages,Apache 2.0
-引入: querydsl-apt (com.querydsl)/ 4.2.1(最新版 )/ Apache 2.0
-引入: querydsl-collections (com.querydsl)/ 4.2.1(最新版 )/ Apache 2.0
-引入: querydsl-core (com.querydsl)/ 4.2.1(最新版 )/ Apache 2.0
-引入: reactor-core (io.projectreactor)/ 3.2.6.RELEASE(最新版 )/ Apache 2.0
-引入: rxjava-reactive-streams (io.reactivex)/ 1.2.1(最新版 )/ Apache 2.0
-引入: rxjava (io.reactivex)/ 1.3.8(最新版 2.2.7)/ Apache 2.0
-引入: rxjava (io.reactivex.rxjava2)/ 2.2.6(最新版 2.2.7)/ Apache 2.0
-引入: scala-library (org.scala-lang)/ 2.11.7(最新版 2.12.8)/ JVM Languages,Apache 2.0
-引入: spring-hateoas (org.springframework.hateoas)/ 0.25.1.RELEASE(最新版 )/ Core Utils,Apache 2.0
-引入: threetenbp (org.threeten)/ 1.3.8(最新版 )/ BSD 3-clause
-引入: vavr (io.vavr)/ 0.9.3(最新版 0.10.0)/ Functional Programming,Apache 2.0
-引入: xmlprojector (org.xmlbeam)/ 1.4.15(最新版 1.4.16)/ Apache 2.0
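One entry line of this Readme format can be sketched as below. ReadmeLine is a hypothetical name, and the labels are copied from the sample output rather than from the tool's source, which is not shown here.

```cpp
#include <string>
#include <vector>

// Formats one dependency line in the layout of the sample above:
// -引入: artifact (group)/ current(最新版 newest)/ label1,label2
std::string ReadmeLine(const std::string& groupId,
                       const std::string& artifactId,
                       const std::string& current,
                       const std::string& newest,
                       const std::vector<std::string>& labels) {
    std::string joined;
    for (std::size_t i = 0; i < labels.size(); ++i) {
        if (i > 0) joined += ",";  // categories and licenses are comma-separated
        joined += labels[i];
    }
    return "-引入: " + artifactId + " (" + groupId + ")/ " + current +
           "(最新版 " + newest + ")/ " + joined;
}
```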
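3 Operating steps
(1) Enter the libraries to query into the pom.xml file in the root directory, using the format below. Each entry has three fields: groupId, artifactId, and version. If version is specified, the tool looks up that version; otherwise it looks up the latest version on the website.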
<dependencies>
<dependency>
<groupId>com.github.pagehelper</groupId>
<artifactId>pagehelper-spring-boot-starter</artifactId>
<version>1.2.10</version>
</dependency>
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-commons</artifactId>
</dependency>
</dependencies>
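(2) Double-click JavaDependence.exe to start the tool, then click the Read button to load the libraries to query from the configuration file. The dialog shows how many entries were read.
(3) Click the Query button to start querying. Each library takes about 3 seconds, so allow some time. When the run ends, the dialog shows the number of successes and failures. If some queries failed, refresh the web page and click Query again; only the failed entries are queried again. Repeat until every query succeeds.
(4) Once all queries have succeeded, click the Output button. The results are written to files in the required formats, and the dialog reports that the output succeeded. The entries in pom.xml are sorted alphabetically by artifact. Two files then appear in the root directory.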
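The retry behaviour in step (3), where a second click queries only the entries that failed, can be sketched as follows. QueryPending and the query callback are hypothetical; the article does not show how the tool tracks failures.

```cpp
#include <functional>
#include <string>
#include <vector>

// Runs query on every pending item and returns the items that failed,
// so that the next pass can be limited to those.
std::vector<std::string> QueryPending(
        const std::vector<std::string>& pending,
        const std::function<bool(const std::string&)>& query) {
    std::vector<std::string> failed;
    for (const std::string& item : pending) {
        if (!query(item)) {
            failed.push_back(item);
        }
    }
    return failed;
}
```

Calling QueryPending again with the returned list corresponds to clicking the Query button a second time; the process ends when the returned list is empty.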
Parsing function implementation: header file MvnRepository.h
#pragma once
#include "HPR_Singleton.h"
#include <string>
#include <vector>
#include <map>
using namespace std;
struct Dependence
{
string strGroupid;
string strArtifact;
string strCurrentVersion;
string strNewestVersion;
vector<string> vecLicense;
Dependence()
{
strGroupid = "";
strArtifact = "";
strCurrentVersion = "";
strNewestVersion = "";
}
};
class MvnRepository:public singleton<MvnRepository>
{
public:
MvnRepository();
~MvnRepository();
public:
int GetNewestVersion(string artifactid, Dependence& dep);
int GetDependences(string strArtifactid, map<string, Dependence>& mapDependence);
int ParseNewestVersion(string strResponse, Dependence& dep);
int ParseDependences(string strResponse, map<string, Dependence>& mapDependence);
};
Source file MvnRepository.cpp
#include "stdafx.h"
#include "MvnRepository.h"
#include "SimpleHttpClient.h"
#include "hlog1.h"
#include "RestClient.h"
MvnRepository::MvnRepository()
{
}
MvnRepository::~MvnRepository()
{
}
int MvnRepository::GetNewestVersion(string artifactid,Dependence& dep)
{
string strUrl = "https://mvnrepository.com/artifact/";
strUrl = strUrl + artifactid;
//CSimpleHttpClient findresByAuthclient("GET", strUrl.c_str(), 5);
//findresByAuthclient.setHttpHeader("Content-Type", "application/json");
//if (!findresByAuthclient.sendHttpRequest())
//{
// LOGIC_ERROR("send findResourcesByAuth request error,url %s,return %s", strUrl.c_str(), findresByAuthclient.getHttpResponseBody().c_str());
//}
//else
//{
// std::string error_code;
// std::string error_msg;
// std::string strResponsefindResByAuth = findresByAuthclient.getHttpResponseBody();
// //LOGIC_TRACE("strResponsefindResByAuth1: %s", strResponsefindResByAuth.c_str());
//
// strVersion=ParseNewestVersion(strResponsefindResByAuth);
//}
string strResponsefindResByAuth = "";
if (CHttpClient::instance()->Gets(strUrl, strResponsefindResByAuth)==HPR_ERROR)
{
LOGIC_ERROR("Gets failed");
return HPR_ERROR;
}
return ParseNewestVersion(strResponsefindResByAuth, dep);
}
int MvnRepository::GetDependences(string strArtifactid, map<string, Dependence>& mapDependence)
{
int iReval = HPR_ERROR;
do
{
string strVersion = "";
string strUrl = "https://mvnrepository.com/artifact/";
strUrl = strUrl + strArtifactid;
CSimpleHttpClient findresByAuthclient("GET", strUrl.c_str(), 5);
findresByAuthclient.setHttpHeader("Content-Type", "application/json");
if (!findresByAuthclient.sendHttpRequest())
{
LOGIC_ERROR("send findResourcesByAuth request error,url %s,return %s", strUrl.c_str(), findresByAuthclient.getHttpResponseBody().c_str());
break;
}
else
{
std::string error_code;
std::string error_msg;
std::string strResponsefindResByAuth = findresByAuthclient.getHttpResponseBody();
//LOGIC_TRACE("strResponsefindResByAuth1: %s", strResponsefindResByAuth.c_str());
if ( ParseDependences(strResponsefindResByAuth, mapDependence)==HPR_ERROR)
{
LOGIC_ERROR("ParseDependences failed");
break;
}
}
iReval = HPR_OK;
} while (0);
return iReval;
}
int MvnRepository::ParseNewestVersion(string strResponse, Dependence& dep)
{
size_t pos = strResponse.find("License</th><td><span class="); // size_t matches the type of string::npos
string strTemp = "";
if (pos!= string::npos)
{
strResponse = strResponse.substr(pos);
strTemp = strResponse.substr(0, 60);
pos = strTemp.find("b lic");
while (pos!= string::npos)
{
strResponse = strResponse.substr(pos + 7);
pos = strResponse.find("<");
if (pos== string::npos)
{
break;
}
strTemp = strResponse.substr(0, pos);
dep.vecLicense.push_back(strTemp);
strTemp = strResponse.substr(0, 60);
pos= strTemp.find("b lic");
}
}
pos = strResponse.find("Categories</th><td>");
if (pos!=string::npos)
{
strResponse = strResponse.substr(pos);
strTemp = strResponse.substr(0, 120);
pos = strTemp.find("b c");
while (pos != string::npos)
{
strResponse = strResponse.substr(pos + 5);
pos = strResponse.find("<");
if (pos == string::npos)
{
break;
}
strTemp = strResponse.substr(0, pos);
dep.vecLicense.push_back(strTemp);
strTemp = strResponse.substr(0, 60);
pos = strTemp.find("b c");
}
}
pos = strResponse.find("vbtn release");
if (pos == string::npos)
{
LOGIC_ERROR("find vbtn release failed");
return HPR_ERROR;
}
strResponse = strResponse.substr(pos + 14);
pos = strResponse.find("<");
if (pos == string::npos)
{
LOGIC_ERROR("find < failed");
return HPR_ERROR;
}
dep.strNewestVersion = strResponse.substr(0, pos);
if (dep.strCurrentVersion=="")
{
dep.strCurrentVersion = dep.strNewestVersion;
}
return HPR_OK;
}
int MvnRepository::ParseDependences(string strResponse,map<string, Dependence>& mapDependence)
{
int iReval = HPR_ERROR;
size_t pos = string::npos; // size_t matches the type of string::npos
do
{
if (strResponse=="")
{
break;
}
pos = strResponse.find("Compile Dependencies");
if (pos== string::npos)
{
break;
}
strResponse = strResponse.substr(pos);
pos = strResponse.find("Test Dependencies");
if (pos != string::npos)
{
strResponse=strResponse.substr(0, pos);
}
pos = strResponse.find(" class=\"b ");
if (pos== string::npos)
{
pos= strResponse.find("vbtn release");
}
string strtemp = "";
while (pos!= string::npos)
{
Dependence dep;
pos = strResponse.find(" class=\"b ");
if (pos!= string::npos)
{
strResponse = strResponse.substr(pos);
strtemp = strResponse.substr(0, 60);
while (strtemp.find(" class=\"b ") != string::npos)
{
pos = strtemp.find(" class=\"b ");
strResponse = strResponse.substr(pos);
pos = strResponse.find(">");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find(>) failed");
break;
}
strResponse = strResponse.substr(pos + 1);
pos = strResponse.find("<");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find(<) failed");
break;
}
strtemp = strResponse.substr(0, pos);
dep.vecLicense.push_back(strtemp);
LOGIC_TRACE("vecLicense:%s", strtemp.c_str());
strtemp = strResponse.substr(0, 60);
}
}
pos = strResponse.find("vbtn release");
if (pos== string::npos)
{
LOGIC_ERROR("strResponse.find vbtn release failed!");
break;
}
strResponse = strResponse.substr(pos + 30);
pos = strResponse.find("\">");
if (pos== string::npos)
{
LOGIC_ERROR("strResponse.find \"> failed!");
break;
}
strtemp = strResponse.substr(0, pos);
strResponse = strResponse.substr(pos);
pos = strtemp.find("/");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find / failed!");
break;
}
dep.strGroupid = strtemp.substr(0, pos);
LOGIC_TRACE("strGroupid:%s", dep.strGroupid.c_str());
strtemp = strtemp.substr(pos + 1);
pos = strtemp.find("/");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find / failed!");
break;
}
dep.strArtifact = strtemp.substr(0, pos);
LOGIC_TRACE("strArtifact:%s", dep.strArtifact.c_str());
if (dep.strArtifact=="sqljet")
{
int i = 0;
}
strtemp = strtemp.substr(pos + 1);
dep.strCurrentVersion = strtemp;
LOGIC_TRACE("strCurrentVersion:%s", dep.strCurrentVersion.c_str());
strtemp = strResponse.substr(0,120);
pos = strtemp.find("vbtn release");
/*if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find vbtn release failed!");
break;
}
strResponse = strResponse.substr(pos);*/
//pos = strResponse.find(dep.strArtifact);
if (pos != string::npos)
{
strResponse = strResponse.substr(pos);
pos = strResponse.find("\">");
if (pos == string::npos)
{
LOGIC_ERROR("strResponse.find \"> failed!");
break;
}
strResponse = strResponse.substr(pos + 2);
pos= strResponse.find("<");
if (pos== string::npos)
{
LOGIC_ERROR("strResponse.find < failed!");
break;
}
strtemp = strResponse.substr(0, pos);
dep.strNewestVersion = strtemp;
LOGIC_TRACE("strNewestVersion:%s", dep.strNewestVersion.c_str());
}
mapDependence[dep.strArtifact + dep.strCurrentVersion]=dep;
pos= strResponse.find(" class=\"b ");
if (pos==string::npos)
{
pos = strResponse.find("vbtn release");
}
}
iReval = HPR_OK;
} while (0);
return iReval;
}