Downloading a resource over HTTP (WinINet implementation)

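The flow for downloading a resource over HTTP with WinINet

The sample code at the end of this post calls the functions in this order: InternetCrackUrl, InternetOpen, InternetConnect, HttpOpenRequest, HttpAddRequestHeaders, HttpSendRequest, HttpQueryInfo, InternetReadFile, and finally InternetCloseHandle for each open handle.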

Related functions

InternetCrackUrl: parses a URL into its components

BOOL InternetCrackUrl(
_In_ LPCTSTR lpszUrl, // (1)
_In_ DWORD dwUrlLength, // (2)
_In_ DWORD dwFlags, // (3)
_Inout_ LPURL_COMPONENTS lpUrlComponents // (4)
);

(1) Pointer to a string that contains the canonical URL to be cracked.
(2) Size of the lpszUrl string, in TCHARs, or zero if lpszUrl is an ASCIIZ string.
(3) Controls the operation: ICU_DECODE (converts encoded characters back to their normal form), ICU_ESCAPE (converts all escape sequences (%xx) to their corresponding characters).
(4) Pointer to a URL_COMPONENTS structure that receives the URL components.

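InternetOpen: initializes an application's use of WinINet

InternetOpen is the first WinINet function an application calls. It tells the Internet DLL to initialize its internal data structures and prepare for later calls from the application. When the application no longer needs the Internet functions, it should call InternetCloseHandle to release the handle and its associated resources.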

HINTERNET InternetOpen(
_In_ LPCTSTR lpszAgent, // (1)
_In_ DWORD dwAccessType, // (2)
_In_ LPCTSTR lpszProxyName, // (3)
_In_ LPCTSTR lpszProxyBypass, // (4)
_In_ DWORD dwFlags // (5)
);

(1) Pointer to a null-terminated string that specifies the name of the application or entity calling the WinINet functions. This name is used as the user agent in the HTTP protocol.
(2) Type of access required:

  • INTERNET_OPEN_TYPE_DIRECT: Resolves all host names locally;
  • INTERNET_OPEN_TYPE_PRECONFIG: Retrieves the proxy or direct configuration from the registry;
  • INTERNET_OPEN_TYPE_PRECONFIG_WITH_NO_AUTOPROXY: Retrieves the proxy or direct configuration from the registry and prevents the use of a startup Microsoft JScript or Internet Setup (INS) file;
  • INTERNET_OPEN_TYPE_PROXY: Passes requests to the proxy unless a proxy bypass list is supplied and the name to be resolved bypasses the proxy. In this case, the function uses INTERNET_OPEN_TYPE_DIRECT

(3) Pointer to a null-terminated string that specifies the name of the proxy server(s) to use when proxy access is specified by setting dwAccessType to INTERNET_OPEN_TYPE_PROXY.
(4) Pointer to a null-terminated string that specifies an optional list of host names or IP addresses, or both, that should not be routed through the proxy when dwAccessType is set to INTERNET_OPEN_TYPE_PROXY.
(5) Options:

  • INTERNET_FLAG_ASYNC: Makes only asynchronous requests on handles descended from the handle returned from this function;
  • INTERNET_FLAG_FROM_CACHE: Does not make network requests. All entities are returned from the cache;
  • INTERNET_FLAG_OFFLINE: Identical to INTERNET_FLAG_FROM_CACHE.

InternetConnect: opens a File Transfer Protocol (FTP) or HTTP session for a given site

HINTERNET InternetConnect(
_In_ HINTERNET hInternet, // (1)
_In_ LPCTSTR lpszServerName, // (2)
_In_ INTERNET_PORT nServerPort, // (3)
_In_ LPCTSTR lpszUsername, // (4)
_In_ LPCTSTR lpszPassword, // (5)
_In_ DWORD dwService, // (6)
_In_ DWORD dwFlags, // (7)
_In_ DWORD_PTR dwContext // (8)
);

(1) Handle returned by a previous call to InternetOpen.
(2) Pointer to a null-terminated string that specifies the host name of an Internet server. Alternately, the string can contain the IP number of the site, in ASCII dotted-decimal format (for example, 11.0.1.45).
(3) Transmission Control Protocol/Internet Protocol (TCP/IP) port on the server.
(4) Pointer to a null-terminated string that specifies the name of the user to log on. If this parameter is NULL, the function uses an appropriate default.
(5) Pointer to a null-terminated string that contains the password to use to log on. If both lpszPassword and lpszUsername are NULL, the function uses the default “anonymous” password.
(6) Type of service to access:

  • INTERNET_SERVICE_FTP: FTP service;
  • INTERNET_SERVICE_GOPHER: Gopher service;
  • INTERNET_SERVICE_HTTP: HTTP service

(7) Options specific to the service used.
(8) Pointer to a variable that contains an application-defined value that is used to identify the application context for the returned handle in callbacks.

HttpOpenRequest: creates an HTTP request handle

If the request verb is anything other than "GET" or "POST", HttpOpenRequest automatically sets INTERNET_FLAG_NO_CACHE_WRITE and INTERNET_FLAG_RELOAD for the request.

HINTERNET HttpOpenRequest(
_In_ HINTERNET hConnect, // (1)
_In_ LPCTSTR lpszVerb, // (2)
_In_ LPCTSTR lpszObjectName, // (3)
_In_ LPCTSTR lpszVersion, // (4)
_In_ LPCTSTR lpszReferer, // (5)
_In_ LPCTSTR *lplpszAcceptTypes, // (6)
_In_ DWORD dwFlags, // (7)
_In_ DWORD_PTR dwContext // (8)
);

(1) A handle to an HTTP session returned by InternetConnect.
(2) A pointer to a null-terminated string that contains the HTTP verb to use in the request. If this parameter is NULL, the function uses GET as the HTTP verb.
(3) A pointer to a null-terminated string that contains the name of the target object of the specified HTTP verb. This is generally a file name, an executable module, or a search specifier (that is, the URI of the requested resource).
(4) A pointer to a null-terminated string that contains the HTTP version to use in the request. If this parameter is NULL, the function uses an HTTP version of 1.1 or 1.0, depending on the value of the Internet Explorer settings. (It is usually set to "HTTP/1.0" or "HTTP/1.1".)
(5) A pointer to a null-terminated string that specifies the URL of the document from which the URL in the request (lpszObjectName) was obtained. If this parameter is NULL, no referrer is specified.
(6) A pointer to a null-terminated array of strings that indicates media types accepted by the client. Here is an example.

PCTSTR rgpszAcceptTypes[] = {_T("text/*"), NULL};

(7) Internet options, including INTERNET_FLAG_RELOAD (forces a download of the requested file, object, or directory listing from the origin server, not from the cache) and INTERNET_FLAG_NO_CACHE_WRITE (does not add the returned entity to the cache), among others.
(8) A pointer to a variable that contains the application-defined value that associates this operation with any application data.

HttpAddRequestHeaders: adds header fields to an HTTP request handle

BOOL HttpAddRequestHeaders(
_In_ HINTERNET hRequest, // (1)
_In_ LPCTSTR lpszHeaders, // (2)
_In_ DWORD dwHeadersLength, // (3)
_In_ DWORD dwModifiers // (4)
);

(1) A handle returned by a call to the HttpOpenRequest function.
(2) A pointer to a string variable containing the headers to append to the request. Each header must be terminated by a CR/LF (carriage return/line feed) pair.
(3) The size of lpszHeaders, in TCHARs. If this parameter is -1L, the function assumes that lpszHeaders is zero-terminated (ASCIIZ), and the length is computed.
(4) A set of modifiers that control the semantics of this function:

  • HTTP_ADDREQ_FLAG_ADD: Adds the header if it does not exist. Used with HTTP_ADDREQ_FLAG_REPLACE;
  • HTTP_ADDREQ_FLAG_ADD_IF_NEW: Adds the header only if it does not already exist; otherwise, an error is returned;
  • HTTP_ADDREQ_FLAG_COALESCE: Coalesces (merges) headers of the same name;
  • HTTP_ADDREQ_FLAG_COALESCE_WITH_COMMA: Coalesces headers of the same name with a comma. For example, adding "Accept: text/*" followed by "Accept: audio/*" with this flag results in the formation of the single header "Accept: text/*, audio/*";
  • HTTP_ADDREQ_FLAG_COALESCE_WITH_SEMICOLON: Coalesces headers of the same name using a semicolon;
  • HTTP_ADDREQ_FLAG_REPLACE: Replaces or removes a header. If the header value is empty and the header is found, it is removed. If not empty, the header value is replaced.
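
Since every header passed in lpszHeaders has to end with a CR/LF pair, it can help to build the block in one place. The following is a minimal sketch in portable C++; BuildHeaderBlock is a hypothetical helper name, not part of WinINet, and it assumes the header names and values are already valid.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a WinINet function): joins name/value pairs into
// one block in which every header ends with "\r\n", the form that the
// lpszHeaders parameter expects.
std::wstring BuildHeaderBlock(
    const std::vector<std::pair<std::wstring, std::wstring>>& headers)
{
    std::wstring block;
    for (const auto& header : headers) {
        block += header.first;
        block += L": ";
        block += header.second;
        block += L"\r\n";  // each header must be terminated by CR/LF
    }
    return block;
}
```

In a Unicode build, the result could then be passed as block.c_str() with static_cast<DWORD>(block.length()) as the length in TCHARs, together with HTTP_ADDREQ_FLAG_ADD | HTTP_ADDREQ_FLAG_REPLACE.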

HttpSendRequest: sends the HTTP request

BOOL HttpSendRequest(
_In_ HINTERNET hRequest, // (1)
_In_ LPCTSTR lpszHeaders, // (2)
_In_ DWORD dwHeadersLength, // (3)
_In_ LPVOID lpOptional, // (4)
_In_ DWORD dwOptionalLength // (5)
);

(1) A handle returned by a call to the HttpOpenRequest function.
(2) A pointer to a null-terminated string that contains the additional headers to be appended to the request. This parameter can be NULL if there are no additional headers to be appended.
(3) The size of the additional headers, in TCHARs. If this parameter is -1L and lpszHeaders is not NULL, the function assumes that lpszHeaders is zero-terminated (ASCIIZ), and the length is calculated.
(4) A pointer to a buffer containing any optional data to be sent immediately after the request headers. This parameter is generally used for “POST” and “PUT” operations.
(5) The size of the optional data, in bytes.
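
Note the different units: the header length is counted in TCHARs, while the optional data is counted in bytes. As an illustration only, the sketch below builds a form-encoded body as a byte string such as might be sent with a "POST". PercentEncode and BuildFormBody are hypothetical helpers, and the choice of form encoding is an assumption made for this example, not something WinINet requires.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: percent-encodes every byte outside the RFC 3986
// "unreserved" set (letters, digits, '-', '.', '_', '~').
std::string PercentEncode(const std::string& text)
{
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : text) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += kHex[c >> 4];    // high nibble
            out += kHex[c & 0x0F];  // low nibble
        }
    }
    return out;
}

// Hypothetical helper: builds "name=value&name=value" as a byte string.
std::string BuildFormBody(
    const std::vector<std::pair<std::string, std::string>>& fields)
{
    std::string body;
    for (const auto& field : fields) {
        if (!body.empty()) {
            body += '&';
        }
        body += PercentEncode(field.first);
        body += '=';
        body += PercentEncode(field.second);
    }
    return body;
}
```

With such a body, lpOptional would point at the body's bytes and dwOptionalLength would be body.size(), while a matching header such as L"Content-Type: application/x-www-form-urlencoded\r\n" would be passed in lpszHeaders with its length in TCHARs (or -1L).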

HttpQueryInfo: retrieves information about the response to an HTTP request

Example: Retrieving HTTP Headers

BOOL HttpQueryInfo(
_In_ HINTERNET hRequest, // (1)
_In_ DWORD dwInfoLevel, // (2)
_Inout_ LPVOID lpvBuffer, // (3)
_Inout_ LPDWORD lpdwBufferLength, // (4)
_Inout_ LPDWORD lpdwIndex // (5)
);

(1) A handle returned by a call to the HttpOpenRequest or InternetOpenUrl function.
(2) A combination of an attribute to be retrieved and flags that modify the request. For a list of possible attribute and modifier values, see Query Info Flags.

Commonly used values include:

  • HTTP_QUERY_CONTENT_LENGTH: Retrieves the size of the resource, in bytes;
  • HTTP_QUERY_ACCEPT_RANGES: Retrieves the types of range requests that are accepted for a resource;
  • HTTP_QUERY_CONTENT_RANGE: Retrieves the Content-Range header, which gives the position of a partial body within the full entity and the total size of the full entity;
  • HTTP_QUERY_STATUS_CODE: Receives the status code returned by the server;
  • HTTP_QUERY_FLAG_NUMBER (modifier): Returns the data as a 32-bit number for headers whose value is a number, such as the status code.

(3) A pointer to a buffer to receive the requested information.
(4) A pointer to a variable that contains, on entry, the size in bytes of the buffer pointed to by lpvBuffer. When the function returns successfully, this variable contains the number of bytes of information written to the buffer. In the case of a string, the byte count does not include the string’s terminating null character.
(5) A pointer to a zero-based header index used to enumerate multiple headers with the same name.

InternetReadFile: reads data from a handle opened by InternetOpenUrl, FtpOpenFile, or HttpOpenRequest

To make sure all the data is read, call InternetReadFile in a loop until it returns TRUE and the value written through lpdwNumberOfBytesRead is 0.

BOOL InternetReadFile(
_In_ HINTERNET hFile, // (1)
_Out_ LPVOID lpBuffer, // (2)
_In_ DWORD dwNumberOfBytesToRead, // (3)
_Out_ LPDWORD lpdwNumberOfBytesRead // (4)
);

(1) Handle returned from a previous call to InternetOpenUrl, FtpOpenFile, or HttpOpenRequest.
(2) Pointer to a buffer that receives the data.
(3) Number of bytes to be read.
(4) Pointer to a variable that receives the number of bytes read. InternetReadFile sets this value to zero before doing any work or error checking.
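
The read-until-zero rule can be sketched as follows. To keep the loop runnable without a network, the read call is abstracted behind a callback; ReadFn and ReadAll are hypothetical names, and the callback is assumed to follow the same contract as InternetReadFile (false on failure, and success with a count of 0 at the end of the data).

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Stand-in for InternetReadFile's contract (an assumption for this sketch):
// returns false on failure; on success stores the byte count in *bytesRead,
// and a count of 0 means the end of the data has been reached.
using ReadFn = std::function<bool(void* buffer, unsigned long toRead,
                                  unsigned long* bytesRead)>;

// Appends everything the callback delivers to `out`.
// Returns false as soon as the callback reports a failure.
bool ReadAll(const ReadFn& read, std::string& out)
{
    char chunk[4096];
    for (;;) {
        unsigned long got = 0;
        if (!read(chunk, sizeof(chunk), &got)) {
            return false;        // real code would inspect GetLastError() here
        }
        if (got == 0) {
            return true;         // success with 0 bytes: all data was received
        }
        out.append(chunk, got);  // append only the bytes actually read
    }
}
```

On Windows, the callback could be a lambda that forwards to InternetReadFile(hRequest, buffer, toRead, bytesRead) and compares the result with FALSE; DWORD is an unsigned long there, so the pointer types line up.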

Sample code

#include <iostream>
#include <string>
#include <windows.h>
#include <WinINet.h>

#pragma comment(lib, "WinINet.lib")

using namespace std;

// Note: the wide-string arguments below assume a Unicode build (UNICODE defined).
int main(int argc, char* argv[])
{
    wstring strURL = L"http://blog.csdn.net/yanglingwell/article/details/78258081";

    // Parse the URL into caller-supplied buffers.
    URL_COMPONENTS urlComponents;
    ZeroMemory(&urlComponents, sizeof(urlComponents));
    WCHAR lpszHostName[INTERNET_MAX_HOST_NAME_LENGTH] = {0};
    WCHAR lpszUserName[INTERNET_MAX_USER_NAME_LENGTH] = {0};
    WCHAR lpszPassword[INTERNET_MAX_PASSWORD_LENGTH] = {0};
    WCHAR lpszURLPath[INTERNET_MAX_URL_LENGTH] = {0};
    WCHAR lpszScheme[INTERNET_MAX_SCHEME_LENGTH] = {0};

    urlComponents.dwStructSize = sizeof(URL_COMPONENTS);
    urlComponents.lpszScheme = lpszScheme;
    urlComponents.dwSchemeLength = INTERNET_MAX_SCHEME_LENGTH;
    urlComponents.lpszHostName = lpszHostName;
    urlComponents.dwHostNameLength = INTERNET_MAX_HOST_NAME_LENGTH;
    urlComponents.lpszUserName = lpszUserName;
    urlComponents.dwUserNameLength = INTERNET_MAX_USER_NAME_LENGTH;
    urlComponents.lpszPassword = lpszPassword;
    urlComponents.dwPasswordLength = INTERNET_MAX_PASSWORD_LENGTH;
    urlComponents.lpszUrlPath = lpszURLPath;
    urlComponents.dwUrlPathLength = INTERNET_MAX_URL_LENGTH;

    // dwUrlLength = 0: the URL is null-terminated; dwFlags = 0: no decoding.
    BOOL bSuccess = InternetCrackUrl(strURL.c_str(), 0, 0, &urlComponents);
    if (bSuccess == FALSE)
    {
        wcout << strURL << L" could not be parsed!" << endl;
        return 0;
    }
    else if (urlComponents.nScheme != INTERNET_SCHEME_HTTP)
    {
        wcout << strURL << L" is not an HTTP URL!" << endl;
        return 0;
    }

    HINTERNET hSession = NULL;
    HINTERNET hInternet = NULL;
    HINTERNET hRequest = NULL;

    do
    {
        // Initializes an application's use of the WinINet functions.
        // Returns a valid handle that the application passes to subsequent WinINet functions.
        // If InternetOpen fails, it returns NULL.
        hInternet = InternetOpen(L"yanglingwell", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
        if (hInternet == NULL)
        {
            wcout << L"InternetOpen failed. errCode: " << GetLastError() << endl;
            break;
        }

        // Opens an HTTP session for a given site.
        // Returns a valid handle to the session if the connection is successful, or NULL otherwise.
        // Assign to the outer hSession (do not redeclare it) so the cleanup code can close it.
        hSession = InternetConnect(hInternet, urlComponents.lpszHostName, urlComponents.nPort,
                                   urlComponents.lpszUserName, urlComponents.lpszPassword,
                                   INTERNET_SERVICE_HTTP, 0, 0);
        if (hSession == NULL)
        {
            wcout << L"InternetConnect failed. errCode: " << GetLastError() << endl;
            break;
        }

        // Creates an HTTP request handle.
        // Returns an HTTP request handle if successful, or NULL otherwise.
        hRequest = HttpOpenRequest(hSession, L"GET", urlComponents.lpszUrlPath, NULL, NULL, NULL, 0, 0);
        if (hRequest == NULL)
        {
            wcout << L"HttpOpenRequest failed. errCode: " << GetLastError() << endl;
            break;
        }

        // Build the header fields.
        wstring strHeader;
        // Accepted media types
        strHeader += L"Accept: */*\r\n";
        // Disable caching
        strHeader += L"Pragma: no-cache\r\n";
        strHeader += L"Cache-Control: no-cache\r\n";
        // Add other header fields here...

        // Adds one or more HTTP request headers to the HTTP request handle.
        if (!HttpAddRequestHeaders(hRequest, strHeader.c_str(), static_cast<DWORD>(strHeader.length()),
                                   HTTP_ADDREQ_FLAG_ADD | HTTP_ADDREQ_FLAG_REPLACE))
        {
            wcout << L"HttpAddRequestHeaders failed. errCode: " << GetLastError() << endl;
            break;
        }

        if (!HttpSendRequest(hRequest, NULL, 0, NULL, 0))
        {
            wcout << L"HttpSendRequest failed. errCode: " << GetLastError() << endl;
            break;
        }

        DWORD dwStatusCode = 0;
        DWORD dwSizeDW = sizeof(DWORD);
        if (!HttpQueryInfo(hRequest, HTTP_QUERY_FLAG_NUMBER | HTTP_QUERY_STATUS_CODE,
                           &dwStatusCode, &dwSizeDW, NULL))
        {
            wcout << L"HttpQueryInfo failed. errCode: " << GetLastError() << endl;
            break;
        }
        wcout << L"StatusCode: " << dwStatusCode << endl;

        // The body arrives as raw bytes, so read into a byte buffer.
        char buf[2048];
        DWORD bufRead = 0;
        do
        {
            if (!InternetReadFile(hRequest, buf, sizeof(buf), &bufRead))
            {
                wcout << L"InternetReadFile failed. errCode: " << GetLastError() << endl;
                break;
            }
            wcout << L"read " << bufRead << L" bytes" << endl;
        } while (bufRead != 0);  // TRUE with 0 bytes read means all data has been received
    } while (FALSE);

    // Close the handles, innermost first.
    if (hRequest != NULL)
    {
        InternetCloseHandle(hRequest);
    }
    if (hSession != NULL)
    {
        InternetCloseHandle(hSession);
    }
    if (hInternet != NULL)
    {
        InternetCloseHandle(hInternet);
    }
    return 0;
}
