tq
HTML extractor
Tool that extracts content from HTML documents based on CSS selectors
Perform a lookup by CSS selector on an HTML input
236 stars
6 watching
5 forks
Language: Python
last commit: almost 2 years ago
Tags: command-line, command-line-tool, css-selector, json, python
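To illustrate what a lookup by CSS selector on an HTML input involves, here is a minimal sketch using only the Python standard library. It is not tq's implementation and does not reflect its command-line interface. The names `SimpleSelector` and `select_text` are hypothetical, and the sketch handles only three simple selector forms (`tag`, `.class`, `#id`), whereas a real tool would support the full selector grammar.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; skipped so depth tracking stays balanced.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SimpleSelector(HTMLParser):
    """Collect the text of elements matching one simple selector.

    Hypothetical helper: supports only `tag`, `.class`, and `#id`.
    A match nested inside another match is folded into the outer result.
    """

    def __init__(self, selector):
        super().__init__()
        self.selector = selector
        self.depth = 0      # greater than 0 while inside a matched element
        self.results = []   # text of each matched element, in document order
        self._buf = []      # text fragments of the element being collected

    def _matches(self, tag, attrs):
        attrs = dict(attrs)
        sel = self.selector
        if sel.startswith("."):
            return sel[1:] in (attrs.get("class") or "").split()
        if sel.startswith("#"):
            return attrs.get("id") == sel[1:]
        return tag == sel

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self._matches(tag, attrs):
            self.depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def select_text(html, selector):
    """Return the text content of every element matching `selector`."""
    parser = SimpleSelector(selector)
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    page = '<ul><li class="item">One</li><li class="item">Two <b>bold</b></li></ul>'
    print(select_text(page, ".item"))
```

The sketch assumes well-formed input with matching open and close tags. Combinators such as `ul > li` and attribute selectors are out of scope here.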
Related projects:
Repository | Description | Stars |
---|---|---|
feichao93/temme | A lightweight, CSS-based selector for extracting structured data from HTML documents. | 273 |
tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable | 343 |
danburzo/hred | Extracts data from HTML or XML documents to JSON using a CSS selector-like query language | 69 |
scrapy/scrapely | A pure-python library for extracting structured data from HTML pages. | 1,863 |
syntax-tree/hast-util-to-text | Utility function to extract plain text from HTML-like data structures | 19 |
mischov/meeseeks | A parser and extractor for HTML and XML data with CSS or XPath selectors | 316 |
alir3z4/html2text | Converts HTML to plain text that can be easily read and formatted as Markdown. | 1,845 |
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
anthonygore/html-critical-webpack-plugin | A Webpack plugin that extracts critical CSS from HTML files and inlines it into the page. | 447 |
rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,937 |
utkarshkukreti/select.rs | A Rust library for extracting useful data from HTML documents | 974 |
dejan/auto_html | Transforms plain text into HTML code using a pipeline of filters | 786 |
dwisiswant0/galer | A tool to extract URLs from HTML attributes by crawling pages and evaluating JavaScript | 253 |
slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662 |
s0rg/crawley | A utility for systematically extracting URLs from web pages and printing them to the console. | 263 |