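Python's rich standard library makes writing a web crawler very simple, but Python 2 and Python 3 differ in quite a few ways here.

I'm just getting started with crawlers, so I'll begin with one small difference.

Take fetching one of Han Han's blog posts as an example:

In Python 2.7, we would usually write: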

import urllib2

# Build the request, then open it and print the raw response body
request = urllib2.Request("http://blog.sina.com.cn/s/blog_4701280b0102egl0.html")
response = urllib2.urlopen(request)
print response.read()

In Python 3, however, this no longer works. First, Python 3 merged urllib and urllib2 into a single urllib package.

Fetching data over the network now requires the urllib.request module.
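As a rough sketch of where the old names ended up, the snippet below builds a request and a query string without touching the network. The `User-Agent` value and the query parameters are made up for illustration; only the blog URL comes from the example above.

```python
import urllib.error
import urllib.parse
import urllib.request

url = "http://blog.sina.com.cn/s/blog_4701280b0102egl0.html"

# Python 2: urllib2.Request(...)   ->  Python 3: urllib.request.Request(...)
# The User-Agent string here is only a placeholder.
request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# Python 2: urllib.urlencode(...)  ->  Python 3: urllib.parse.urlencode(...)
# These query parameters are hypothetical.
query = urllib.parse.urlencode({"page": 1, "q": "python"})

# Python 2: urllib2.URLError       ->  Python 3: urllib.error.URLError
# HTTPError is still a subclass of URLError.
is_subclass = issubclass(urllib.error.HTTPError, urllib.error.URLError)

print(request.full_url)
print(query)
print(is_subclass)
```

Nothing is downloaded here; the request is only sent once it is passed to `urllib.request.urlopen`.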

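Second, in Python 3 response.read() returns bytes rather than text, so printing it directly shows the Chinese characters as escape sequences such as \xe9\x9f\xa9 instead of readable text. The bytes have to be decoded first, for example by calling str() with an encoding argument.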
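The difference can be seen without any network access. In this small sketch, `raw` is a stand-in for what `response.read()` would return, assuming the page is encoded as UTF-8:

```python
# Stand-in for response.read(): UTF-8 encoded bytes of some Chinese text
raw = "韩寒的博客".encode("utf-8")

# str() with one argument gives the printable form of the bytes object,
# escape sequences included, not the decoded text
as_repr = str(raw)

# str() with an encoding argument decodes the bytes into real text
as_text = str(raw, "utf-8")

print(as_repr)
print(as_text)
print(as_text == raw.decode("utf-8"))  # both spellings decode the same way
```

`str(raw, "utf-8")` and `raw.decode("utf-8")` are interchangeable here; the second form is the one more commonly seen.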

So in Python 3 it needs to be written like this:

import urllib.request

url = 'http://blog.sina.com.cn/s/blog_4701280b0102egl0.html'
response = urllib.request.urlopen(url)
content = response.read()  # bytes, not str
# Decode the bytes as UTF-8 before printing
print(str(content, 'utf-8'))
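Hard-coding 'utf-8' works only when the page really is UTF-8. A slightly more defensive variant, offered here as a sketch rather than the one right way, reads the charset from the response's Content-Type header and falls back to a default. The function name `fetch_text` and its parameters are invented for this example.

```python
import urllib.request


def fetch_text(url, default_charset="utf-8", timeout=10):
    """Fetch a URL and return its body decoded to str (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes in Python 3
        # Use the charset from the Content-Type header when one is given
        charset = response.headers.get_content_charset() or default_charset
    # errors="replace" keeps stray bytes from raising UnicodeDecodeError
    return raw.decode(charset, errors="replace")
```

Calling `fetch_text('http://blog.sina.com.cn/s/blog_4701280b0102egl0.html')` would then return the page as text. Network failures still surface as `urllib.error.URLError`, which the caller can catch.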
