html5lib-python
HTML parser
A standards-compliant Python library for parsing and serializing HTML documents and fragments.
1k stars
50 watching
284 forks
Language: Python
Last commit: 9 months ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
kovidgoyal/html5-parser | A fast HTML parser written in C, optimized for performance. | 682 |
scrapy/scrapely | A pure-Python library for extracting structured data from HTML pages. | 1,863 |
bupt1987/html-parser | A fast and efficient HTML parser for PHP. | 525 |
servo/html5ever | A high-performance HTML parser written in Rust. | 2,148 |
lexborisov/myhtml | A fast HTML parsing library written in C. | 1,655 |
snjyor/htmlpageparser | An HTML parsing library that converts web pages to structured data and then generates Markdown content from that data. | 1 |
rotatef/cl-html5-parser | An HTML5 parser for Common Lisp. | 55 |
kennethreitz/requests-html | A Pythonic HTML parsing library providing intuitive and asynchronous web scraping capabilities. | 303 |
cclib/cclib | A Python library for parsing and analyzing output files from computational chemistry packages. | 336 |
imangazaliev/didom | A fast and simple HTML parser with support for CSS selectors and XPath expressions. | 2,200 |
qmlweb/qmlweb-parser | A JavaScript library that parses QML and JavaScript files at runtime. | 27 |
iabudiab/htmlkit | An Objective-C framework for parsing and serializing HTML documents. | 240 |
r1chardj0n3s/parse | A library that parses strings using a specification based on the Python format() syntax. | 1,713 |
ndmitchell/tagsoup | A Haskell library for parsing and extracting information from HTML/XML documents. | 233 |
thephpleague/uri | A PHP library for manipulating and parsing URIs according to various standards. | 1,034 |