tq

HTML extractor

Tool that extracts content from HTML documents based on CSS selectors

Performs a lookup by CSS selector on HTML input.
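To make the idea of a selector lookup concrete, here is a minimal sketch using only the Python standard library. It is not tq's implementation or its command-line interface; the class `SimpleSelectorExtractor` and the helper `select` are hypothetical names introduced for illustration, and the sketch assumes only three simple selector forms (`tag`, `.class`, `#id`) rather than full CSS selector syntax.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SimpleSelectorExtractor(HTMLParser):
    """Collect the text of elements matching one simple selector.

    Hypothetical helper: supports only "tag", ".class" and "#id".
    When matching elements are nested, only the outermost one is reported.
    """

    def __init__(self, selector):
        super().__init__()
        self.selector = selector
        self.depth = 0       # nesting depth inside the current match (0 = outside)
        self.results = []    # text content of each matched element
        self._buffer = []    # text pieces of the match being collected

    def _matches(self, tag, attrs):
        attrs = dict(attrs)
        if self.selector.startswith("."):
            return self.selector[1:] in (attrs.get("class") or "").split()
        if self.selector.startswith("#"):
            return attrs.get("id") == self.selector[1:]
        return tag == self.selector

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif self._matches(tag, attrs):
            self.depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


def select(html, selector):
    """Return the text of every element in `html` that matches `selector`."""
    parser = SimpleSelectorExtractor(selector)
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    page = '<ul><li class="item">One</li><li class="item"><b>Two</b></li></ul>'
    print(select(page, ".item"))
```

A real tool would typically delegate parsing and selector matching to a dedicated HTML library; the hand-rolled matcher here only shows the shape of the operation: HTML in, selector applied, matching content out.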

GitHub

236 stars

6 watching

5 forks

Language: Python

Last commit: over 3 years ago

Tags: command-line, command-line-tool, css-selector, json, python

Related projects:

Repository	Description	Stars
feichao93/temme	A lightweight, CSS-based selector for extracting structured data from HTML documents.	273
tjatse/node-readability	Automates web page scraping and text extraction to make any webpage readable.	343
danburzo/hred	Extracts data from HTML or XML documents to JSON using a CSS selector-like query language.	70
scrapy/scrapely	A pure-Python library for extracting structured data from HTML pages.	1,865
syntax-tree/hast-util-to-text	Utility function to extract plain text from HTML-like data structures.	19
mischov/meeseeks	A parser and extractor for HTML and XML data with CSS or XPath selectors.	316
alir3z4/html2text	Converts HTML to plain text that can be easily read and formatted as Markdown.	1,862
miyagawa/web-scraper	A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface.	104
anthonygore/html-critical-webpack-plugin	A Webpack plugin that extracts critical CSS from HTML files and inlines it into the page.	448
rust-scraper/scraper	A Rust library for parsing and querying HTML documents using CSS selectors.	1,961
utkarshkukreti/select.rs	A Rust library for extracting useful data from HTML documents.	974
dejan/auto_html	Transforms plain text into HTML code using a pipeline of filters.	786
dwisiswant0/galer	A tool to extract URLs from HTML attributes by crawling pages and evaluating JavaScript.	255
slotix/dataflowkit	A framework for extracting structured data from web pages using CSS selectors.	667
s0rg/crawley	A utility for systematically extracting URLs from web pages and printing them to the console.	268