Using HtmlAgilityPack with ScrapySharp or HtmlAgilityPack.CssSelectors
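The Html Agility Pack source contains roughly 28 classes, so it is not a particularly complex library, yet it is far from weak: it provides powerful enough support for parsing the DOM that it can rival jQuery's DOM manipulation :) The commonly used core classes are few. For DOM parsing there are really only two, HtmlDocument and HtmlNode, plus the HtmlNodeCollection collection class.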
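As a rough illustration of how these three classes fit together, here is a minimal sketch. The helper name `ListItemTexts`, the sample markup, and the XPath expression are assumptions made for this example and do not come from the original post; only `HtmlDocument.LoadHtml`, `DocumentNode`, `SelectNodes`, and `InnerText` are Html Agility Pack members.

```csharp
using System;
using HtmlAgilityPack;

static class HapBasics
{
    // Hypothetical helper: returns the trimmed text of every <li>
    // directly under the <ul> with the given id.
    public static string[] ListItemTexts(string html, string listId)
    {
        var doc = new HtmlDocument();          // the parsed document
        doc.LoadHtml(html);

        // SelectNodes returns an HtmlNodeCollection, or null when nothing matches.
        HtmlNodeCollection items =
            doc.DocumentNode.SelectNodes("//ul[@id='" + listId + "']/li");
        if (items == null)
        {
            return Array.Empty<string>();
        }

        var texts = new string[items.Count];
        for (int i = 0; i < items.Count; i++)
        {
            HtmlNode item = items[i];          // a single DOM node
            texts[i] = item.InnerText.Trim();
        }
        return texts;
    }

    static void Main()
    {
        const string html =
            "<html><body><ul id='langs'><li>C#</li><li>F#</li></ul></body></html>";
        foreach (var text in ListItemTexts(html, "langs"))
        {
            Console.WriteLine(text);
        }
    }
}
```

The null check matters in practice: `SelectNodes` does not return an empty collection when the XPath matches nothing, so iterating the result directly can throw.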
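1. ScrapySharp
Working with HTML Agility Pack directly is still fairly cumbersome. The component introduced here, ScrapySharp, wraps Html Agility Pack in two ways that make parsing HTML pages no longer painful; the happiness index shoots straight up to 90 points.
First, ScrapySharp provides a wrapper class that simulates a real browser (handling the Referer header, cookies, and so on). Second, it supports jQuery-like CSS selectors together with LINQ syntax, which makes it very pleasant to use. Its code is hosted at https://bitbucket.org/rflechner/scrapysharp, and it can also be added through NuGet.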
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using HtmlAgilityPack;
    using ScrapySharp.Extensions;
    using ScrapySharp.Network;

    namespace HTMLAgilityDemo
    {
        class Program
        {
            static void Main(string[] args)
            {
                var uri = new Uri("http://www.cnblogs.com/shanyou/archive/2012/05/20/2509435.html");

                // Download the page with ScrapySharp's browser wrapper.
                var browser1 = new ScrapingBrowser();
                var html1 = browser1.DownloadString(uri);

                // Parse the markup with Html Agility Pack.
                var htmlDocument = new HtmlDocument();
                htmlDocument.LoadHtml(html1);
                var html = htmlDocument.DocumentNode;

                // Select by tag name.
                var title = html.CssSelect("title");
                foreach (var htmlNode in title)
                {
                    Console.WriteLine(htmlNode.InnerHtml);
                }

                // Select by tag name and CSS class.
                var divs = html.CssSelect("div.postBody");
                foreach (var htmlNode in divs)
                {
                    Console.WriteLine(htmlNode.InnerHtml);
                }

                // Select by id.
                divs = html.CssSelect("#cnblogs_post_body");
                foreach (var htmlNode in divs)
                {
                    Console.WriteLine(htmlNode.InnerHtml);
                }
            }
        }
    }

Basic examples of CssSelect usage:

    var divs = html.CssSelect("div"); // all div elements
    var nodes = html.CssSelect("div.content"); // all div elements with CSS class 'content'
    var nodes = html.CssSelect("div.widget.monthlist"); // all div elements with both CSS classes
    var nodes = html.CssSelect("#postPaging"); // all HTML elements with the id postPaging
    var nodes = html.CssSelect("div#postPaging.testClass"); // all div elements with the id postPaging and CSS class testClass
    var nodes = html.CssSelect("div.content > p.para"); // p elements that are direct children of div elements with CSS class 'content'
    var nodes = html.CssSelect("input[type=text].login"); // text boxes with CSS class login

We can also select ancestors of elements:

    var nodes = html.CssSelect("p.para").CssSelectAncestors("div.content > div.widget");
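Because CssSelect returns a sequence of HtmlNode objects, it can be chained with ordinary LINQ operators. The sketch below shows one way that might look; the helper name `Hrefs`, the sample markup, and the selector are assumptions for illustration, and it relies only on the `CssSelect` extension from ScrapySharp.Extensions and `GetAttributeValue` from Html Agility Pack.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using ScrapySharp.Extensions;

static class LinkExtractor
{
    // Hypothetical helper: collects the non-empty href values of all
    // elements matched by the given CSS selector.
    public static List<string> Hrefs(string html, string selector)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
                  .CssSelect(selector)                                    // ScrapySharp CSS selection
                  .Select(a => a.GetAttributeValue("href", string.Empty)) // default to "" when the attribute is missing
                  .Where(href => href.Length > 0)
                  .ToList();
    }

    static void Main()
    {
        const string html =
            "<div class='content'><a href='/first'>First</a><a>No link</a></div>" +
            "<a href='/outside'>Outside</a>";

        foreach (var href in Hrefs(html, "div.content > a"))
        {
            Console.WriteLine(href);
        }
    }
}
```

Keeping the selection in a CSS selector and the filtering in LINQ keeps each step short and readable, which is the combination the section above describes.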
2. Using HtmlAgilityPack.CssSelectors (this package has a bug: a class name that contains an underscore _ throws an exception)

    var postItems = htmlDocument.QuerySelectorAll(".post-item");
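If a class name does contain an underscore, one possible way around the problem is to skip the CSS selector for that query and use Html Agility Pack's built-in XPath support instead. The following is only a sketch of that idea, not something from the original post: the helper `ByClass` and the sample class name `post_item` are hypothetical, and the helper assumes the class name contains no quote characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class ClassLookup
{
    // Hypothetical helper: finds every element in the document whose class
    // attribute contains className as a whole, space-separated token.
    public static IEnumerable<HtmlNode> ByClass(HtmlNode root, string className)
    {
        // Pad the normalized class list with spaces so that "post_item"
        // does not also match "post_itemx".
        string xpath =
            "//*[contains(concat(' ', normalize-space(@class), ' '), ' " + className + " ')]";

        HtmlNodeCollection found = root.SelectNodes(xpath);
        if (found == null)
        {
            // SelectNodes returns null when nothing matches.
            return Enumerable.Empty<HtmlNode>();
        }
        return found;
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<div class='post_item first'>one</div><div class='post_itemx'>two</div>");

        foreach (var node in ByClass(doc.DocumentNode, "post_item"))
        {
            Console.WriteLine(node.InnerText);
        }
    }
}
```

Note that an XPath starting with `//` searches the whole document, even when it is evaluated from a nested node.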
References: http://www.cnblogs.com/shanyou/archive/2012/05/27/2520603.html
http://www.tools138.com/create/article/20141014/130844875.html