This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you don't actually HAVE to understand XPATH or XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant of "real world"...
Deprecated, as there is a new maintainer for the original HAP project. Please check the new repo at https://github.com/zzzprojects/html-agility-pack.
This is a port of the HtmlAgilityPack library, created by Simon Mourrier and Jeff Klawiter, for the .NET Core platform. This NuGet package can be used with...
Abot is an open source C# web crawler built for speed and flexibility. It takes care of the low-level plumbing (multithreading, HTTP requests, scheduling, link parsing, etc.). You just register for events to process the page data. You can also plug in your own implementations of core interfaces to...
This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you don't actually HAVE to understand XPATH or XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant of "real world"...
A powerful C# web crawler that makes advanced crawling features easy to use. AbotX builds upon the open source Abot C# Web Crawler by providing a powerful set of wrappers and extensions.
.NET Core port of sjdirect/abot. Abot is an open source C# web crawler built for speed and flexibility. It takes care of the low-level plumbing (multithreading, HTTP requests, scheduling, link parsing, etc.). You just register for events to process the page data. You can also plug in your own...
HtmlMonkey is a lightweight HTML/XML parser written in C#. It allows you to parse an HTML or XML string into a hierarchy of node objects, which can then be traversed or queried using jQuery-like selectors. The library also supports creating node objects from code and producing HTML or XML from those...
A Plugin Manager plugin that generates a robots.txt file based on DenySpider attributes on classes or methods within controllers. If UserSessionMiddleware.Plugin is also installed, it checks whether a bot is trying to access a page it has been denied and, if so, returns a 403 Forbidden result.
Aspose.HTML for .NET is a cross-platform class library that works as a headless browser that you can seamlessly integrate within your own .NET, C#, VB.NET, and ASP.NET applications. Aspose.HTML for .NET helps you create, modify, extract, copy, delete and replace HTML document content, extract CSS...
It helps you use HAP in an easier and more meaningful way via reflection.
It works somewhat like Entity Framework. See the wiki on the GitHub page for a tutorial:
https://github.com/parsalotfy/HtmlAgilityPack_Helper/wiki