jsoup
HTML parser
A Java library for parsing and manipulating HTML and XML, with CSS selector and XPath support
jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
11k stars
395 watching
2k forks
Language: Java
last commit: 17 days ago
Topics: css, css-selectors, dom, html, java, java-html-parser, jsoup, parser, xml, xpath
Related projects:
Repository | Description | Stars |
---|---|---|
zhegexiaohuozi/jsoupxpath | An HTML parser implementing W3C XPath 1.0 syntax for Java. | 452 |
fcannizzaro/jsoup-annotations | A Java library that provides annotations to simplify HTML scraping and processing with jsoup. | 239 |
egonschiele/handsomesoup | A Haskell library that simplifies HTML parsing by providing CSS selectors and attribute extraction functions. | 124 |
tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable | 343 |
cheeriojs/cheerio | A fast and flexible HTML parser and DOM manipulator with jQuery-like API | 28,692 |
jsdom/jsdom | A pure-JavaScript implementation of various web standards for use with Node.js | 20,560 |
ericchiang/pup | A command line tool for parsing and manipulating HTML | 8,116 |
ndmitchell/tagsoup | A Haskell library for parsing and extracting information from HTML/XML documents | 233 |
imangazaliev/didom | A fast and simple HTML parser with support for CSS selectors and XPath expressions. | 2,200 |
fb55/htmlparser2 | A fast and forgiving HTML parser with a focus on minimal allocations | 4,451 |
lexborisov/myhtml | A fast HTML parsing library written in C | 1,655 |
snjyor/htmlpageparser | An HTML parsing library that converts web pages to structured data and then generates Markdown content from that data | 1 |
js-devtools/rehype-url-inspector | A plugin to inspect and manipulate URLs in HTML documents | 19 |
javve/list.js | A JavaScript library for adding search, sort, filters and flexibility to tables and lists in HTML elements. | 11,204 |
aantron/lambdasoup | A functional HTML scraping and manipulation library | 383 |