scrapely
HTML parser
A pure-Python library for extracting structured data from HTML pages.
A pure-Python HTML screen-scraping library
2k stars
123 watching
271 forks
Language: HTML
Last commit: over 2 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,961 |
html5lib/html5lib-python | A standards-compliant Python library for parsing and serializing HTML documents and fragments. | 1,138 |
snjyor/htmlpageparser | An HTML parsing library that converts web pages to structured data and then generates Markdown from that data. | 1 |
utkarshkukreti/select.rs | A Rust library for extracting useful data from HTML documents | 974 |
kennethreitz/requests-html | A Pythonic HTML parsing library providing intuitive and asynchronous web scraping capabilities. | 304 |
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
servo/html5ever | An HTML parser designed to meet the standards of modern web browsers. | 2,171 |
plainas/tq | A tool that extracts content from HTML documents based on CSS selectors. | 236 |
bupt1987/html-parser | A fast and efficient HTML parser for PHP. | 525 |
holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,155 |
ruippeixotog/scala-scraper | A Scala library providing a DSL for loading and extracting content from HTML pages. | 717 |
jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages. | 36 |
zhuyingda/webster | A framework for automating web scraping and crawling tasks using Node.js. | 518 |
spider-rs/spider | A tool for web data extraction and processing using Rust. | 1,234 |
jturner314/py_literal | A Rust crate for parsing and formatting Python literals. | 16 |