Scrapy wiki resource roundup
Getting started
If you're new to Scrapy, start by reading Scrapy at a glance.
Google Summer of Code
Articles & blog posts
These are guides contributed by the Scrapy community. If you know of a guide that is not included here, please feel free to add it.
- Building a web crawler with Scrapy
- Scrapy after the tutorials
- How to do basic web scraping using Scrapy on a Windows Azure virtual machine
- Scraping iTunes Charts Using Scrapy
- SearchHub: Indexing web sites in Solr with Scrapy
- Using Parsley extraction language with Scrapy
- Running Scrapy on Amazon EC2
- How to automatically search and download torrents with Python and Scrapy
- Scraping Craigslist with Scrapy (includes video) - Nov 5, 2012
- How to Install Scrapy 0.14 in a 64 bit Windows 7 Environment
- Using Scrapy with different/many proxies
- Scrape multi-pages content with Scrapy
- Calling Scrapy from a Python script
- Scrapy and Django (1)
- Scrapy and Django (2)
- Scrapy and Django (3)
- Scraping Google Scholar with Scrapy and MongoDB
- Recursively scraping a blog with Scrapy
- Setup Macports Python and Scrapy successfully
- Crawl a website with Scrapy
- How to use Scrapy with TOR (scrapy-users message)
- Convert relative paths to absolute paths
- How to use Scrapy, Tor with multiple user agents
- (Russian, 2011) Собираем данные с помощью Scrapy
- How to Run Scrapy Spiders on Cloud Using Heroku and Redis
- Web Scraping With Scrapy and MongoDB
- Scrapy: it GETs the web - PyCon US 2013 talk
- Installing Scrapy on Windows (video tutorial)
- Recursively scraping Craigslist (includes video) - Nov 8, 2012
- Scraping the Web with Scrapy
- Karthik Ananth: Scrapy Workshop
- Scrapy / Python playlist on Youtube channel
English slides:
- Scrapy - a flexible crawler to power your search - given by Shane Evans at the Feb 2013 Cambridge Search Meetup
- Web Crawling & Metadata Extraction in Python
- Crawling the web for fun and profit
- Scrapy for dummies
- Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)
- Collecting web information with open source tools
- When big data meet python @ COSCUP 2012
- How to scrape any website's content using Scrapy
Spanish slides:
Chinese slides:
Portuguese slides:
Projects, tools and libraries using Scrapy
- Django Dynamic Scraper - a web application (written in Django) for running and controlling Scrapy spiders
- Slybot - A supervised learning crawler based on Scrapely
- scrapy-sentry - Logs Scrapy exceptions into Sentry
- ScrapyGraphite - Outputs Scrapy statistics to Carbon/Graphite
- scrapy-mongo - A pipeline to store Scrapy items in a MongoDB database
- scrapy-boilerplate - A small set of utilities to simplify writing low-complexity spiders
- scrapy-inline-requests - Provides a decorator for writing spider callbacks that perform multiple requests without needing a separate callback for each request
- scrapy-redis - Provides Redis-backed components for Scrapy
- scrapyz - Create simple spiders easily.
- Scrapy-related libraries on PyPI
- Scrapy_cn - Provides a demo for solving encoding problems (UTF-8).
- elite-proxies-scrapy-middleware - get new proxies from your EliteProxies account
- scrapydo - Crochet-based blocking API for Scrapy.
Companies using Scrapy
Release Notes
- See the release notes in the official documentation
Developer documentation
Scrapy Enhancement Proposals
- SEPs are available in scrapy/sep.
