Use BeautifulSoup and Python to scrap a website

Lib:

  • urllib
  • Parsing HTML Data

Web scraping script

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup quotes_page = "https://bluelimelearning.github.io/my-fav-quotes/"
uClient = uReq(quotes_page)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
quotes = page_soup.findAll("div", {"class":"quotes"}) for quote in quotes:
fav_quote = quote.findAll("p", {"class":"aquote"})
aquote = fav_quote[0].text.strip() fav_authors = quote.findAll("p",{"class":"author"})
author = fav_authors[0].text.strip() print(aquote)
print(author)

Run this script successfully

Following is the whole result of this scraping.

I hear and i forget. I see and i remember. I do and i understand.
Confucious
Feeling gratitude and not expressing it is like wrapping a present and not giving it.
William Arthur Ward
Our greatest glory is not in never falling but in rising every time we fall.
Confucious
The secret of getting aheadis getting started.
Mark Twain
Believe you can and you're halfway there.
Theodore Roosevelt
Resentment is like drinking Poison and waiting for your enemies to die.
Nelson Mandela
Silence is a true friend who never betrays.
Confucius
The best way to find yourself is to lose yourself in the service of others.
Mahatma Gandhi
Never succumb to the temptation of bitterness.
Martin Luther King Jnr
The journey of a thousand miles begins with one step.
Lao Tzu
It is health that is real wealth and not pieces of gold and silver.
Mahatma Gandhi
Yesterday is not ours to recover but tomorrow is ours to win or lose.
Lyndon B Johnson
It's not what happens to you but how you react to it that matters .
Epictetus
Beware of what you become in pursuit of what you want.
Jim Rohn
The best revenge is massive success.
Frank Sinatra
Do not take life too seriously You will never get out of it alive.
Elbert Hubbard
Don't judge each day by the harvest you reap but by the seeds that yiu plant.
Robert Loius Stevenson
Your attitude and not your aptitude will determine your altitude
Zig Ziglar
Imagination is more important than knowledge.
Albert Einstein

.

Web Scraping using Python Scrapy_BS4 - using BeautifulSoup and Python的更多相关文章

  1. Web Scraping using Python Scrapy_BS4 - using Scrapy and Python(2)

    Scrapy Architecture Creating a Spider. Spiders are classes that you define that Scrapy uses to scrap ...

  2. Web Scraping using Python Scrapy_BS4 - using Scrapy and Python(1)

    Create a new Scrapy project first. scrapy startproject projectName . Open this project in Visual Stu ...

  3. Web Scraping using Python Scrapy_BS4 - Software

    Install the following software before web scraping. Visual Studio Code Python and Pip pip install vi ...

  4. Web Scraping using Python Scrapy_BS4 - Introduction

    What is Web Scraping This is also referred to as web harvesting and web data extraction. This is the ...

  5. Web Scraping with Python读书笔记及思考

    Web Scraping with Python读书笔记 标签(空格分隔): web scraping ,python 做数据抓取一定一定要明确:抓取\解析数据不是目的,目的是对数据的利用 一般的数据 ...

  6. <Web Scraping with Python>:Chapter 1 & 2

    <Web Scraping with Python> Chapter 1 & 2: Your First Web Scraper & Advanced HTML Parsi ...

  7. 《Web Scraping With Python》Chapter 2的学习笔记

    You Don't Always Need a Hammer When Michelangelo was asked how he could sculpt a work of art as mast ...

  8. 阅读OReilly.Web.Scraping.with.Python.2015.6笔记---Crawl

    阅读OReilly.Web.Scraping.with.Python.2015.6笔记---Crawl 1.函数调用它自身,这样就形成了一个循环,一环套一环: from urllib.request ...

  9. 阅读OReilly.Web.Scraping.with.Python.2015.6笔记---找出网页中所有的href

    阅读OReilly.Web.Scraping.with.Python.2015.6笔记---找出网页中所有的href 1.查找以<a>开头的所有文本,然后判断href是否在<a> ...

随机推荐

  1. Oracle调优之看懂Oracle执行计划

    @ 目录 1.文章写作前言简介 2.什么是执行计划? 3.怎么查看执行计划? 4.查看真实执行计划 5.看懂Oracle执行计划 5.1 查看explain 5.2 explain执行顺序 5.3 访 ...

  2. Beta冲刺--冲刺总结

    这个作业属于哪个课程 软件工程 (福州大学至诚学院 - 计算机工程系) 这个作业要求在哪里 Beta 冲刺 这个作业的目标 Beta冲刺--冲刺总结 作业正文 如下 其他参考文献 ... Beta冲刺 ...

  3. c++运算符重及其调用

    本文参考自:https://blog.csdn.net/lisemi/article/details/93618161 运算符重载就是赋予运算符新功能,其本质是一个函数. 运算符重载时要遵循以下规则: ...

  4. 图解 Git 基本命令 merge 和 rebase

    Git 基本命令 merge 和 rebase,你真的了解吗? 前言 Git 中的分支合并是一个常见的使用场景. 仓库的 bugfix 分支修复完 bug 之后,要回合到主干分支,这时候两个分支需要合 ...

  5. leetcode1028 从先序遍历还原二叉树 python 100%内存 一次遍历

    1028. 从先序遍历还原二叉树 python 100%内存 一次遍历     题目 我们从二叉树的根节点 root 开始进行深度优先搜索. 在遍历中的每个节点处,我们输出 D 条短划线(其中 D 是 ...

  6. 跟着whatwg看一遍事件循环

    前言 对于单线程来说,事件循环可以说是重中之重了,它为任务分配不同的优先级,井然有序的调度.让js解析,用户交互,页面渲染等互不冲突,各司其职. 我们书写的代码无时无刻都在和事件循环打交道,要想写出更 ...

  7. Python三大器之迭代器

    Python三大器之迭代器 迭代器协议 迭代器协议规定:对象内部必须提供一个__next__方法,对其执行该方法要么返回迭代器中的下一项(可以暂时理解为下一个元素),要么就引起一个Stopiterat ...

  8. CentOS 7 下安装 MySQL 8.0

    前言 本篇文章主要介绍在 CentOS 7 环境下安装 MySQL 8.0. 正文 1. 配置yum源 首先在 https://dev.mysql.com/downloads/repo/yum/ 找到 ...

  9. 使用随机函数random来实现课堂点名

    如何使用函数random来实现课堂随机点名 1.最初的样子如下 2.点击开始点名,上面一行的文字变成名字,名字在不停的变化,开始点名变成停止点名,如下 3.点击停止点名,上面名字不动,停止点名变成开始 ...

  10. 小白入门NAS—快速搭建私有云教程系列(一)

    什么是NAS 在日常的工作生活中,我们有大量的资料.文件需要存储在电脑或者其他终端设备中,但是这种方式需要电脑配备高容量的硬盘,而且需要随时随地的带着,这样是不是很麻烦? 那么,今天,我来介绍一种家庭 ...