Using the urllib library

    # coding=utf-8
    import urllib2
    import urllib

    # Test environment simulated with httpbin
    URL_IP = "http://10.11.0.215:8080"
    URL_GET = "http://10.11.0.215:8080/get"

    def use_simple_urllib2():
        response = urllib2.urlopen(URL_IP)
        print('>>>Response Headers:')
        print(response.info())
        print('>>>Response Body:')
        print(response.read())

    def use_params_urllib2():
        # Build the request parameters
        params = urllib.urlencode({'param1': 'hello', 'param2': 'world'})
        print('Request Params:')
        print(params)
        # Send the request with the query string appended to the URL
        response = urllib2.urlopen('?'.join([URL_GET, params]))
        # Process the response
        print('>>>Response Headers:')
        print(response.info())
        print('>>>Status Code:')
        print(response.getcode())
        print('>>>Response Body:')
        print(response.read())

    if __name__ == '__main__':
        # print('>>>Use simple urllib2')
        # use_simple_urllib2()
        print('>>>Use params urllib2')
        use_params_urllib2()
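The script above targets Python 2, where `urllib2` and `urllib.urlencode` exist. As a minimal sketch, assuming Python 3, the same GET request with parameters could be written with `urllib.request` and `urllib.parse`. The helper names `build_get_url` and `fetch` are hypothetical; the httpbin address is the one used in the listing above.

```python
# Python 3 sketch of the urllib2 example above (helper names are illustrative).
from urllib.parse import urlencode
from urllib.request import urlopen

URL_GET = "http://10.11.0.215:8080/get"  # httpbin test server from the listing above


def build_get_url(base_url, params):
    """Append URL-encoded query parameters to base_url."""
    return base_url + "?" + urlencode(params)


def fetch(url, timeout=10):
    """Send a GET request and return (status code, headers, decoded body)."""
    with urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
        return response.status, response.headers, body


# Building the URL needs no network access; fetch(url) would send the request.
url = build_get_url(URL_GET, {"param1": "hello", "param2": "world"})
print(url)
```

Calling `fetch(url)` only makes sense when the httpbin server is reachable, so it is left out of the sketch's top-level code.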

Basic usage of the requests library

    # coding=utf-8

    import requests

    URL_IP = "http://10.11.0.215:8080/ip"
    URL_GET = "http://10.11.0.215:8080/get"

    def use_simple_requests():
        response = requests.get(URL_IP)
        print(">>>Response Headers:")
        print(response.headers)
        print(">>>Response Code:")
        print(response.status_code)
        print(">>>Response Body:")
        print(response.text)

    def use_params_requests():
        # Pass the query parameters as a dict; requests encodes them into the URL
        params = {'param1': 'hello', 'param2': 'world'}
        response = requests.get(URL_GET, params=params)
        print(">>>Response Headers:")
        print(response.headers)
        print(">>>Response Code:")
        print(response.status_code)
        print(response.reason)
        print(">>>Response Body:")
        print(response.json())

    if __name__ == "__main__":
        # print("simple requests:")
        # use_simple_requests()
        print("params requests:")
        use_params_requests()
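With requests, query parameters can be handed over as a dict through the `params` argument instead of being joined onto the URL by hand. The following sketch, which assumes the same httpbin address and borrows the parameter values from the urllib2 example, prepares a request without sending it so the resulting URL can be inspected offline.

```python
# Sketch: inspect the URL that requests builds from params, without sending anything.
import requests

URL_GET = "http://10.11.0.215:8080/get"  # httpbin test server from the listing above

# Request(...).prepare() encodes the parameters but performs no network I/O.
prepared = requests.Request(
    "GET", URL_GET, params={"param1": "hello", "param2": "world"}
).prepare()

print(prepared.method)
print(prepared.url)
```

The printed URL carries the encoded query string, which is the same URL that `response.url` would report after an actual `requests.get(URL_GET, params=...)` call.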

Interacting with the GitHub API using requests

    # coding=utf-8
    import json
    import requests
    from requests import Request, Session, exceptions

    URL = "https://api.github.com"

    def build_uri(endpoint):
        # Join the base URL and the endpoint into the final API path
        return '/'.join([URL, endpoint])

    def better_print(json_str):
        # Pretty-print JSON; indent=4 indents each level by 4 spaces
        return json.dumps(json.loads(json_str), indent=4)

    def request_method():
        # Fetch user information
        # response = requests.get(build_uri('users/reblue520'))
        # response = requests.get(build_uri('user/emails'), auth=('reblue520', 'reblue520'))
        response = requests.get(build_uri('user/public_emails'), auth=('reblue520', 'reblue520'))
        print(better_print(response.text))

    def params_request():
        response = requests.get(build_uri('users'), params={'since': 11})
        print(better_print(response.text))
        print(response.request.headers)
        print(response.url)

    def json_request():
        # Update user information; the email address must already be verified
        # response = requests.patch(build_uri('user'), auth=('reblue520','reblue520'),json={'name':'hellojack2019','email':'reblue520@163.com'})
        response = requests.post(build_uri('user/emails'), auth=('reblue520', 'Reblue0225520'), json=['hellojack2019@163.com'])
        print(better_print(response.text))
        print(response.request.headers)
        print(response.request.body)
        print(response.status_code)

    def timeout_request():
        # API error handling: timeouts and HTTP error status codes
        try:
            response = requests.get(build_uri('user/emails'), timeout=10)
            response.raise_for_status()
        except exceptions.Timeout as e:
            print(e)
        except exceptions.HTTPError as e:
            print(e)
        else:
            print(response.status_code)
            print(response.text)

    def hard_requests():
        # Build a custom request by hand and send it through a Session
        s = Session()
        headers = {'User-Agent': 'fake1.3.4'}
        req = Request('GET', build_uri('user/emails'), auth=('reblue520', 'Reblue0225520'), headers=headers)
        prepped = req.prepare()
        print(prepped.body)
        print(prepped.headers)

        resp = s.send(prepped, timeout=5)
        print(resp.status_code)
        print(resp.request.headers)
        print(resp.text)

    if __name__ == '__main__':
        # request_method()
        # params_request()
        # json_request()
        # timeout_request()
        hard_requests()
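`better_print` assumes the response body is always valid JSON; `json.loads` raises `ValueError` for an empty or non-JSON body, such as an HTML error page. One possible, more forgiving variant is sketched below under the hypothetical name `safe_pretty`, which falls back to returning the raw text. It uses only the standard library and is an illustration, not part of the original script.

```python
import json


def safe_pretty(body, indent=4):
    """Pretty-print body if it is JSON text; otherwise return it unchanged."""
    try:
        parsed = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return body
    # ensure_ascii=False keeps non-ASCII characters readable in the output
    return json.dumps(parsed, indent=indent, ensure_ascii=False)


print(safe_pretty('{"name": "hellojack2019"}'))
print(safe_pretty("<html>Bad Gateway</html>"))
```

In the script above it could stand in for `better_print(response.text)` wherever a non-JSON reply is possible.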

Common attributes and methods of the Response object

Basic Response API:

    In []: import requests

    In []: response = requests.get("https://api.github.com")

    In []: response.status_code
    Out[]: 200

    In []: response.reason
    Out[]: 'OK'

    In []: response.headers
    Out[]: {'Date': 'Sat, 20 Jul 2019 03:48:51 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Server': 'GitHub.com', 'Status': '200 OK', 'X-RateLimit-Limit': '', 'X-RateLimit-Remaining': '', 'X-RateLimit-Reset': '', 'Cache-Control': 'public, max-age=60, s-maxage=60', 'Vary': 'Accept, Accept-Encoding', 'ETag': 'W/"7dc470913f1fe9bb6c7355b50a0737bc"', 'X-GitHub-Media-Type': 'github.v3; format=json', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': "default-src 'none'", 'Content-Encoding': 'gzip', 'X-GitHub-Request-Id': '33D9:591B:9D084B:CF860E:5D328F23'}

    In []: response.url
    Out[]: 'https://api.github.com/'

    In []: response.history
    Out[]: []

    In []: response = requests.get("http://api.github.com")

    In []: response.history
    Out[]: [<Response []>]

    In []: response = requests.get("https://api.github.com")

    In []: response.elapsed
    Out[]: datetime.timedelta(microseconds=)

    In []: response.request
    Out[]: <PreparedRequest [GET]>

    In []: response.request.headers
    Out[]: {'User-Agent': 'python-requests/2.22.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}

    In []: response.encoding
    Out[]: 'utf-8'

    In []: response.raw.read()
    Out[]: b''

    In []: response.content
    Out[]: b'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'

    In []: response.json()
    Out[]:
    {'current_user_url': 'https://api.github.com/user',
     'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}',
     'authorizations_url': 'https://api.github.com/authorizations',
     'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}',
     'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}',
     'emails_url': 'https://api.github.com/user/emails',
     'emojis_url': 'https://api.github.com/emojis',
     'events_url': 'https://api.github.com/events',
     'feeds_url': 'https://api.github.com/feeds',
     'followers_url': 'https://api.github.com/user/followers',
     'following_url': 'https://api.github.com/user/following{/target}',
     'gists_url': 'https://api.github.com/gists{/gist_id}',
     'hub_url': 'https://api.github.com/hub',
     'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}',
     'issues_url': 'https://api.github.com/issues',
     'keys_url': 'https://api.github.com/user/keys',
     'notifications_url': 'https://api.github.com/notifications',
     'organization_repositories_url': 'https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}',
     'organization_url': 'https://api.github.com/orgs/{org}',
     'public_gists_url': 'https://api.github.com/gists/public',
     'rate_limit_url': 'https://api.github.com/rate_limit',
     'repository_url': 'https://api.github.com/repos/{owner}/{repo}',
     'repository_search_url': 'https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}',
     'current_user_repositories_url': 'https://api.github.com/user/repos{?type,page,per_page,sort}',
     'starred_url': 'https://api.github.com/user/starred{/owner}{/repo}',
     'starred_gists_url': 'https://api.github.com/gists/starred',
     'team_url': 'https://api.github.com/teams',
     'user_url': 'https://api.github.com/users/{user}',
     'user_organizations_url': 'https://api.github.com/user/orgs',
     'user_repositories_url': 'https://api.github.com/users/{user}/repos{?type,page,per_page,sort}',
     'user_search_url': 'https://api.github.com/search/users?q={query}{&page,per_page,sort,order}'}
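The last outputs show the same body in different forms: `response.content` is the raw bytes and `response.json()` is the parsed dict, while `response.raw.read()` returns `b''`, presumably because the body had already been consumed when the content was loaded. Roughly speaking, `response.text` is `content` decoded with `response.encoding`, and `json()` parses that body. The stdlib-only sketch below mirrors this relationship using a shortened copy of the body above; it illustrates the idea and is not how requests is implemented internally.

```python
import json

# Shortened copy of the bytes shown for response.content above
content = b'{"current_user_url":"https://api.github.com/user","emails_url":"https://api.github.com/user/emails"}'
encoding = "utf-8"  # the value shown for response.encoding

text = content.decode(encoding)  # roughly what response.text gives
data = json.loads(text)          # roughly what response.json() gives

print(type(content).__name__, type(text).__name__, type(data).__name__)
print(data["emails_url"])
```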
