1. Troubleshooting

bsObj = BeautifulSoup(html.read())

This raises a warning:

UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

Fix: pass the parser name explicitly as the second argument:

bsObj = BeautifulSoup(html.read(),"html.parser")
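A minimal offline sketch of the same fix, so it can be tried without a network connection. The sample markup `html_doc` is made up for illustration. Warnings are turned into exceptions inside the `with` block, so the call would fail loudly if BeautifulSoup still had to guess the parser.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for html.read()
html_doc = "<html><body><h1>Hello</h1></body></html>"

with warnings.catch_warnings():
    # Any warning raised inside this block becomes an exception
    warnings.simplefilter("error")
    # The parser is named explicitly, so nothing has to be guessed
    bs_obj = BeautifulSoup(html_doc, "html.parser")

print(bs_obj.h1)
```

"html.parser" ships with the Python standard library, so the script should behave the same on machines where lxml is not installed.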

BeautifulSoup

Overview: BeautifulSoup organizes and formats messy web content by locating HTML tags, and exposes the XML structure as simple Python objects.
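As a rough illustration of "tags as Python objects", the sketch below parses a small made-up page (`html_doc` and its contents are assumptions, not part of the original post) and reads tag names, attributes, and text through `find`, `find_all`, and `get_text`.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page used only for this illustration
html_doc = """
<html><body>
<h1 id="title">Sample page</h1>
<ul>
  <li><a href="/first">First</a></li>
  <li><a href="/second">Second</a></li>
</ul>
</body></html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Locate one tag by name and attribute
title = soup.find("h1", id="title")

# Collect the text and href of every link on the page
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title.name, title["id"], title.get_text())
print(links)
```

Each tag behaves like an ordinary object: `name` gives the tag name, square brackets read an attribute, and `get_text()` returns the text inside the tag.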

On Python 3, install version 4 of the library: BeautifulSoup4 (BS4).

Example:

#!/usr/bin/env python3
# encoding: utf-8
"""
@author: 侠之大者kamil
@file: beautifulsoup.py
@time: 2016/4/19 16:36
"""
from bs4 import BeautifulSoup
from urllib.request import urlopen
html = urlopen('http://www.cnblogs.com/kamil/')
print(type(html))
bsObj = BeautifulSoup(html.read(), "html.parser")  # html.read() returns the page content, which is passed to the BeautifulSoup object.
print(type(bsObj))
print(bsObj.h1)

Note line 12: the "html.parser" argument must be included.

Output:

ssh://kamil@xzdz.hk:22/usr/bin/python3 -u /home/kamil/windows_python3/python3/Day11/day12/beautifulsoup.py
<class 'http.client.HTTPResponse'>
<class 'bs4.BeautifulSoup'>
<h1><a class="headermaintitle" href="http://www.cnblogs.com/kamil/" id="Header1_HeaderTitle">侠之大者kamil</a></h1>

Process finished with exit code 0
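The result can also be explored offline. The sketch below assumes only the `<h1>` markup printed in the output; it does not fetch the live page. It reads the link inside the heading and shows that asking for a tag that is not present (here `h2`, chosen arbitrarily) gives `None` rather than raising an error.

```python
from bs4 import BeautifulSoup

# The <h1> element copied from the output, used here as a stand-in for the full page
html_doc = (
    '<h1><a class="headermaintitle" href="http://www.cnblogs.com/kamil/" '
    'id="Header1_HeaderTitle">侠之大者kamil</a></h1>'
)

bs_obj = BeautifulSoup(html_doc, "html.parser")

heading = bs_obj.h1
# Guard against a missing tag before descending into it
link_text = heading.a.get_text() if heading is not None else None
link_href = heading.a["href"] if heading is not None else None

# A tag that does not occur in the document yields None
missing = bs_obj.h2

print(link_text, link_href)
print(missing)
```

Checking for `None` before chaining attribute access avoids an `AttributeError` when a page turns out not to contain the expected tag.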

Official documentation
