这次爬取亚马逊网站,用到了scrapy,代理池,和中间件: spiders里面: # -*- coding: utf-8 -*- import scrapy from scrapy.http.request import Request from urllib.parse import urlencode from ..items import AmazonItem class SpiderGoodsSpider(scrapy.Spider): name = 'spider_goods' all
用scrapy爬取链家全国以上房源分类的信息: 路径: items.py # -*- coding: utf-8 -*- # Define here the models for your scraped items # # See documentation in: # https://doc.scrapy.org/en/latest/topics/items.html import scrapy class LianItem(scrapy.Item): # define the fields