scrapely

HTML parser

A pure-Python library for extracting structured data from HTML pages.

A pure-Python HTML screen-scraping library

GitHub

2k stars
123 watching
271 forks
Language: HTML
Last commit: over 2 years ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,961
html5lib/html5lib-python | A standards-compliant Python library for parsing and serializing HTML documents and fragments. | 1,138
snjyor/htmlpageparser | An HTML parsing library that converts web pages to structured data and then generates Markdown content from that data | 1
utkarshkukreti/select.rs | A Rust library for extracting useful data from HTML documents | 974
kennethreitz/requests-html | A Pythonic HTML parsing library providing intuitive and asynchronous web scraping capabilities. | 304
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104
servo/html5ever | An HTML parser designed to meet the standards of modern web browsers | 2,171
plainas/tq | Tool that extracts content from HTML documents based on CSS selectors | 236
bupt1987/html-parser | A fast and efficient HTML parser for PHP. | 525
holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,155
ruippeixotog/scala-scraper | A Scala library providing a DSL for loading and extracting content from HTML pages | 717
jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages | 36
zhuyingda/webster | A framework for automating web scraping and crawling tasks using Node.js | 518
spider-rs/spider | A tool for web data extraction and processing using Rust | 1,234
jturner314/py_literal | A Rust crate for parsing and formatting Python literals. | 16