httpClient download file(爬虫)
package com.opensource.httpclient.bfs;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.http.Header;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
public class DownLoadFile
{
public String getFileNameByUrl(String url, String contentType)
{
url = url.substring(7);
if (contentType.indexOf("html") != -1)
{
url = url.replaceAll("[\\?/:*|<>\"]", "_") + ".html";
return url;
}
else
{
return url.replaceAll("[\\?/:*|<>\"]", "_") + "." + contentType.substring(contentType.lastIndexOf("/") + 1);
}
}
public void saveToLocal(byte[] data, String filePath)
{
try
{
DataOutputStream out = new DataOutputStream(new FileOutputStream(new File(filePath)));
for (int i = 0; i < data.length; i++)
out.write(data[i]);
out.flush();
out.close();
}
catch (IOException e)
{
e.printStackTrace();
}
}
public String downloadFile(String url)
throws ClientProtocolException, IOException
{
String filePath = null;
HttpClient httpClient = new DefaultHttpClient();
HttpGet get = new HttpGet(url);
HttpResponse rsp = httpClient.execute(get);
if (rsp.getStatusLine().getStatusCode() != HttpStatus.SC_OK)
{
System.err.println("Method failed: " + rsp.getStatusLine());
filePath = null;
}
Header[] header = rsp.getHeaders("Content-Type");
filePath = "D:\\" + getFileNameByUrl(url, header[0].getValue());
saveToLocal(rsp.toString().getBytes(), filePath);
return filePath;
}
public static void main(String[] args)
throws ClientProtocolException, IOException
{
DownLoadFile downLoadFile = new DownLoadFile();
String temp = downLoadFile.downloadFile("http://www.huawei.com/cn/");
System.out.println(temp);
}
}
httpClient download file(爬虫)的更多相关文章
- HttpClient的使用-爬虫学习1
HttpClient的使用-爬虫学习(一) Apache真是伟大,为我们提供了HttpClient.jar,这个HttpClient是客户端的http通信实现库,这个类库的作用是接受和发送http报文 ...
- Csharp:WebClient and WebRequest use http download file
//Csharp:WebClient and WebRequest use http download file //20140318 塗聚文收錄 string filePath = "20 ...
- 基于HttpClient实现网络爬虫~以百度新闻为例
转载请注明出处:http://blog.csdn.net/xiaojimanman/article/details/40891791 基于HttpClient4.5实现网络爬虫请訪问这里:http:/ ...
- [Powershell] FTP Download File
# Config $today = Get-Date -UFormat "%Y%m%d" $LogFilePath = "d:\ftpLog_$today.txt&quo ...
- Angular HttpClient upload file with FormData
从sof上找到一个example:https://stackoverflow.com/questions/46206643/asp-net-core-2-0-and-angular-4-3-file- ...
- FTP Download File By Some Order List
@Echo Off REM -- Define File Filter, i.e. files with extension .RBSet FindStrArgs=/E /C:".asp&q ...
- httpclient upload file
用httpclient upload上传文件时,代码如下: HttpPost httpPost = new HttpPost(uploadImg); httpPost.addHeader(" ...
- Download file using libcurl in C/C++
http://stackoverflow.com/questions/1636333/download-file-using-libcurl-in-c-c #include <stdio.h&g ...
- HttpClient的使用-爬虫学习(一)
Apache真是伟大,为我们提供了HttpClient.jar,这个HttpClient是客户端的http通信实现库,这个类库的作用是接受和发送http报文,引进这个类库,我们对于http的操作会变得 ...
随机推荐
- Number Steps
Number Steps Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 65536/32768 K (Java/Others)Tota ...
- 1021 Fibonacci Again (hdoj)
Problem Description There are another kind of Fibonacci numbers: F(0) = 7, F(1) = 11, F(n) = F(n-1) ...
- STL初步
1.stackstack 模板类的定义在<stack>头文件中.stack 模板类需要两个模板参数,一个是元素类型,一个容器类型,但只有元素类型是必要的,在不指定容器类型时,默认的容器类型 ...
- Android 控件属性
TextView 文字属性//文字左右居中android:layout_centerHorizontal="true"//文字垂直居中android:layout_centerVe ...
- C++之单元测试
以前编写程序从没有做过单元测试的工作,所以在后期会花很多时间去纠错,这也就是软件工程中的2:8定律.最近要完成一个项目,要求要对系统中的主类和主函数作出单元测试的保证,才去查找了相关方面的资料,看过后 ...
- android textView 折叠 展开 ExpandableTextView
项目过程中可能会用到可以折叠和展开的TextView , 这里给出一种实现思路,自定义控件. package com.example.expandtextviewdemo; import androi ...
- 生成war的jdk版本高于tomcat使用的jdk版本,导致项目不能正常被访问
记录一个耽误30分钟的一个坑: 生成war的jdk版本高于tomcat使用的jdk版本,导致项目不能正常被访问 报404错误
- uva 10041 Vito's Family_贪心
题意:给你n个房子的距离,问那个房子离别的房子的距离最近,并且输出与别的房子距离的总和 思路:排序一下,中间的房子离别房子距离必然是最少的. #include <iostream> #in ...
- linux shell自定义函数(定义、返回值、变量作用域)介绍
http://www.jb51.net/article/33899.htm linux shell自定义函数(定义.返回值.变量作用域)介绍 linux shell 可以用户定义函数,然后在shell ...
- js算法
最近面试可能会问这些 1,插入排序 function sort(elements){ var res =[elements[0]]; for (var i = 0; i < elements.l ...