The Python Standard Library: Reading Notes on the http Module, Part 1
Official documentation: https://docs.python.org/3.5/library/http.html
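To save effort, here is the gist of the documentation's overview (originally shown as a screenshot):
HTTP client programming is normally done with the urllib.request library, whose main job is opening all kinds of URLs "in a complex world": authentication, redirections, cookies and more.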
1. urllib.request: Extensible library for opening URLs
The usage guide explains things in detail alongside code: HOWTO Fetch Internet Resources Using The urllib Package
Functions provided by the module:
urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)
urllib.request.install_opener(opener)
urllib.request.build_opener([handler, ...])
urllib.request.pathname2url(path)
urllib.request.url2pathname(path)
urllib.request.getproxies()
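The three helpers at the end of that list can be tried without any network access. A minimal sketch, assuming a made-up local path; the value returned by getproxies() depends entirely on the environment it runs in:

```python
import urllib.request

# Hypothetical local path, chosen only because it contains a space.
path = "/tmp/my file.txt"

# pathname2url() percent-encodes a local path for use in a URL ...
url_path = urllib.request.pathname2url(path)
# ... and url2pathname() reverses the conversion.
round_trip = urllib.request.url2pathname(url_path)

# getproxies() returns a dict mapping scheme names to proxy URLs,
# collected from the environment (for example the http_proxy variable).
proxies = urllib.request.getproxies()

print(url_path)
print(round_trip)
print(proxies)
```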
Classes provided by the module:
class urllib.request.Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None)
class urllib.request.OpenerDirector
class urllib.request.BaseHandler
class urllib.request.HTTPDefaultErrorHandler
class urllib.request.HTTPRedirectHandler
class urllib.request.HTTPCookieProcessor(cookiejar=None)
class urllib.request.ProxyHandler(proxies=None)
class urllib.request.HTTPPasswordMgr
There are many more; they are not all listed here.
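1.2 Request Objects
The following methods make up the public interface of Request, so all of them may be overridden in subclasses. Request also defines several public attributes that clients can use to inspect the parsed request.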
Request.full_url
Request.type
Request.host
Request.origin_req_host  # without the port number
Request.selector
Request.data
Request.unverifiable
Request.method
Request.get_method()
Request.add_header(key, val)
Request.add_unredirected_header(key, header)
Request.has_header(header)
Request.remove_header(header)
Request.get_full_url()
Request.set_proxy(host, type)
Request.get_header(header_name, default=None)
Request.header_items()
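Most of these attributes and methods can be inspected without sending anything. A minimal sketch, assuming a made-up URL and a made-up X-Demo header; nothing here touches the network:

```python
import urllib.request

# Hypothetical URL and header, used only to show how Request splits things up.
req = urllib.request.Request(
    "http://www.example.com:8080/path?q=1",
    data=b"x=1",
    headers={"X-Demo": "yes"},
)

print(req.full_url)         # the URL as passed in
print(req.type)             # the URL scheme
print(req.host)             # the host, including the port
print(req.origin_req_host)  # the host without the port
print(req.selector)         # the path and query string
print(req.get_method())     # POST, because data is not None

# add_header() stores the key via str.capitalize(), so it has to be
# looked up as "User-agent" rather than "User-Agent".
req.add_header("User-Agent", "demo/0.1")
print(req.has_header("User-agent"))
print(req.get_header("User-agent"))
print(req.header_items())

req.remove_header("User-agent")
print(req.has_header("User-agent"))

# An explicit method overrides the GET/POST default.
put_req = urllib.request.Request("http://www.example.com/", method="PUT")
print(put_req.get_method())
```

The capitalization of header keys is easy to trip over: has_header('User-Agent') is False here even though the header was added under that spelling.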
1.3 OpenerDirector Objects
It has the following methods:
OpenerDirector.add_handler(handler)
OpenerDirector.open(url, data=None[, timeout])
OpenerDirector.error(proto, *args)
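To see how add_handler() and open() work together, here is a minimal sketch built around a made-up "demo" URL scheme. DemoHandler and the demo:// URL are hypothetical and exist only for illustration; the point is that OpenerDirector dispatches to a handler method named after the scheme, here demo_open:

```python
import email.message
import io
import urllib.request
import urllib.response


class DemoHandler(urllib.request.BaseHandler):
    """Hypothetical handler for a made-up "demo" URL scheme."""

    def demo_open(self, req):
        # OpenerDirector finds this method by its name: <scheme>_open.
        body = io.BytesIO(b"hello from " + req.selector.encode("ascii"))
        headers = email.message.Message()
        headers["Content-Type"] = "text/plain; charset=ascii"
        return urllib.response.addinfourl(body, headers, req.full_url, code=200)


# Start from an empty OpenerDirector so only our handler is registered.
opener = urllib.request.OpenerDirector()
opener.add_handler(DemoHandler())

with opener.open("demo://localhost/greeting") as response:
    text = response.read().decode("ascii")

print(text)
```

In everyday use build_opener() is the more convenient entry point, since it also registers the default handlers for http, https, file and so on.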
1.4 BaseHandler Objects
1.5 HTTPRedirectHandler Objects
1.6 HTTPCookieProcessor Objects
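It has only one attribute, HTTPCookieProcessor.cookiejar: the http.cookiejar.CookieJar in which all cookies are stored.
1.x There are too many other classes to cover here; consult the official documentation when you need them.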
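A minimal sketch of wiring a cookie jar into an opener. Nothing is fetched here, so the jar stays empty; cookies set by servers would be collected in it once opener.open() is called on real URLs:

```python
import http.cookiejar
import urllib.request

# Create the jar first so we keep a reference to it.
jar = http.cookiejar.CookieJar()
processor = urllib.request.HTTPCookieProcessor(jar)
opener = urllib.request.build_opener(processor)

# The processor exposes the jar through its cookiejar attribute.
print(processor.cookiejar is jar)
print(len(jar))
```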
Examples
Open a URL and read the data:
- >>> import urllib.request
- >>> with urllib.request.urlopen('http://www.python.org/') as f:
- ... print(f.read(300))
- ...
- b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html
- xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n
- <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n
- <title>Python Programming '
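Note: reading from the response that urlopen returns yields a bytes object, because urlopen cannot automatically determine the encoding of the byte stream it receives. Decode it explicitly: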
- >>> with urllib.request.urlopen('http://www.python.org/') as f:
- ... print(f.read(100).decode('utf-8'))
- ...
- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
- "http://www.w3.org/TR/xhtml1/DTD/xhtm
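When the server declares a charset, it can be read from the response headers rather than hard-coded. A minimal sketch, using a data: URL so that it runs offline (urllib.request has handled the data scheme since Python 3.4); the fallback to UTF-8 is an assumption for cases where no charset is declared:

```python
import urllib.request

# A data: URL keeps the sketch offline; "caf%C3%A9" is the UTF-8
# percent-encoding of "café".
with urllib.request.urlopen("data:text/plain;charset=utf-8,caf%C3%A9") as f:
    raw = f.read()  # always bytes
    # Use the declared charset, falling back to UTF-8 (an assumption).
    charset = f.headers.get_content_charset() or "utf-8"

text = raw.decode(charset)
print(type(raw).__name__, type(text).__name__)
print(text)
```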
Send a data stream to the stdin of a CGI script:
- >>> import urllib.request
- >>> req = urllib.request.Request(url='https://localhost/cgi-bin/test.cgi',
- ... data=b'This data is passed to stdin of the CGI')
- >>> with urllib.request.urlopen(req) as f:
- ... print(f.read().decode('utf-8'))
- ...
- Got Data: "This data is passed to stdin of the CGI"
On the other end, the CGI script receives the data through stdin:
- #!/usr/bin/env python
- import sys
- data = sys.stdin.read()
- print('Content-type: text/plain\n\nGot Data: "%s"' % data)
Use of Basic HTTP Authentication:
- import urllib.request
- # Create an OpenerDirector with support for Basic HTTP Authentication...
- auth_handler = urllib.request.HTTPBasicAuthHandler()
- auth_handler.add_password(realm='PDQ Application',
-                           uri='https://mahler:8092/site-updates.py',
-                           user='klem',
-                           passwd='kadidd!ehopper')
- opener = urllib.request.build_opener(auth_handler)
- # ...and install it globally so it can be used with urlopen.
- urllib.request.install_opener(opener)
- urllib.request.urlopen('http://www.example.com/login.html')
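When the realm is not known in advance, a password manager with a default realm is a common alternative. A minimal sketch, assuming a made-up site and made-up credentials; no request is sent, the lookup is only performed locally to show which URLs the stored credentials apply to:

```python
import urllib.request

# Hypothetical site and credentials, for illustration only.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
# Passing None as the realm makes the entry match any realm.
password_mgr.add_password(None, "https://example.com/private/", "klem", "secret")

auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(auth_handler)

# The credentials apply to every URL below the registered prefix ...
inside = password_mgr.find_user_password("any realm", "https://example.com/private/page.html")
# ... but not to other hosts.
outside = password_mgr.find_user_password("any realm", "https://other.example.org/")

print(inside)
print(outside)
```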
Adding HTTP headers:
- import urllib.request
- req = urllib.request.Request('http://www.example.com/')
- req.add_header('Referer', 'http://www.python.org/')
- # Customize the default User-Agent header value:
- req.add_header('User-Agent', 'urllib-example/0.1 (Contact: . . .)')
- r = urllib.request.urlopen(req)
OpenerDirector automatically adds a User-Agent header to every Request. To change this:
- import urllib.request
- opener = urllib.request.build_opener()
- opener.addheaders = [('User-agent', 'Mozilla/5.0')]
- opener.open('http://www.example.com/')
Also, remember that a few standard headers (Content-Length, Content-Type and Host) are added when the Request is passed to urlopen() (or OpenerDirector.open()).
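These automatic headers can be observed offline. A rough sketch, assuming it is acceptable to call the HTTP handler's http_request pre-processing hook by hand; normally OpenerDirector.open() invokes it, so this is an inspection trick rather than a recommended usage pattern:

```python
import urllib.request

opener = urllib.request.build_opener()
# The default header list holds a single User-agent entry.
print(opener.addheaders)

req = urllib.request.Request("http://www.example.com/", data=b"spam=1")
print(req.header_items())  # empty: nothing has been added yet

# Normally OpenerDirector.open() runs this pre-processing step itself;
# it is called by hand here only to inspect the result without a network.
http_handler = next(
    h for h in opener.handlers if isinstance(h, urllib.request.HTTPHandler)
)
http_handler.http_request(req)

added = dict(req.header_items())
print(added)
```

Headers set explicitly on the Request are left alone by this step; the defaults are only filled in where the request has no header of that name.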
GET:
- >>> import urllib.request
- >>> import urllib.parse
- >>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
- >>> url = "http://www.musi-cal.com/cgi-bin/query?%s" % params
- >>> with urllib.request.urlopen(url) as f:
- ... print(f.read().decode('utf-8'))
POST:
- >>> import urllib.request
- >>> import urllib.parse
- >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
- >>> data = data.encode('ascii')
- >>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
- ... print(f.read().decode('utf-8'))
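The only difference between the GET and POST sessions above is where the encoded parameters go. A minimal sketch that builds both requests without sending them; the URL is a placeholder:

```python
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({"spam": 1, "eggs": 2, "bacon": 0})
print(params)

# GET: the encoded parameters become the query string.
get_req = urllib.request.Request("http://www.example.com/query?" + params)

# POST: the encoded parameters are sent as the body, which must be bytes.
post_req = urllib.request.Request(
    "http://www.example.com/query", data=params.encode("ascii")
)

print(get_req.get_method(), get_req.selector)
print(post_req.get_method(), post_req.data)
```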
The following example uses an explicitly specified HTTP proxy, overriding environment settings:
- >>> import urllib.request
- >>> proxies = {'http': 'http://proxy.example.com:8080/'}
- >>> opener = urllib.request.FancyURLopener(proxies)
- >>> with opener.open("http://www.python.org") as f:
- ... f.read().decode('utf-8')
The following example uses no proxies at all, overriding environment settings:
- >>> import urllib.request
- >>> opener = urllib.request.FancyURLopener({})
- >>> with opener.open("http://www.python.org/") as f:
- ... f.read().decode('utf-8')