scrapely

HTML parser

A pure-Python library for extracting structured data from HTML pages.

A pure-Python HTML screen-scraping library.

2k stars

123 watching

271 forks

Language: HTML

last commit: over 4 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

brucedone/awesome-crawler

Related projects:

Repository	Description	Stars
rust-scraper/scraper	A Rust library for parsing and querying HTML documents using CSS selectors.	1,961
html5lib/html5lib-python	A standards-compliant Python library for parsing and serializing HTML documents and fragments.	1,138
snjyor/htmlpageparser	An HTML parsing library that converts web pages to structured data and then generates Markdown content from that data.	1
utkarshkukreti/select.rs	A Rust library for extracting useful data from HTML documents.	974
kennethreitz/requests-html	A Pythonic HTML parsing library providing intuitive and asynchronous web scraping capabilities.	304
miyagawa/web-scraper	A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface.	104
servo/html5ever	An HTML parser designed to meet the standards of modern web browsers.	2,171
plainas/tq	A tool that extracts content from HTML documents based on CSS selectors.	236
bupt1987/html-parser	A fast and efficient HTML parser for PHP.	525
holgerd77/django-dynamic-scraper	An app that allows you to manage Scrapy spiders through a Django admin interface.	1,155
ruippeixotog/scala-scraper	A Scala library providing a DSL for loading and extracting content from HTML pages.	717
jakopako/goskyr	A tool to simplify web scraping of list-like structured data from web pages.	36
zhuyingda/webster	A framework for automating web scraping and crawling tasks using Node.js.	518
spider-rs/spider	A tool for web data extraction and processing using Rust.	1,234
jturner314/py_literal	A Rust crate for parsing and formatting Python literals.	16