http://phantomjs.org/ PhantomJS is an optimal solution for: Page automation: access webpages and extract information using the standard DOM API or common libraries such as jQuery. Screen capture: programmatically capture web contents, including SVG a…
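I consulted the DotNetSpider example but found DotNetSpider too heavyweight; it is a fairly complete crawler framework. After comparing the headless browsers listed below, I settled on PuppeteerSharp + AngleSharp to write a sample crawler. As in the blog post above, the Autohome (汽车之家) page https://store.mall.autohome.com.cn/83106681.html is used as the data-collection example. Headless Browsers: A list of (almost) all headless web browsers in exi…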