tq

HTML extractor

Tool that extracts content from HTML documents based on CSS selectors

Perform a lookup by CSS selector on an HTML input
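As a rough illustration of what a CSS-selector lookup on HTML input involves, here is a minimal sketch using only the Python standard library. It is not tq's implementation. The names `select_text`, `SelectorExtractor`, and `parse_simple_selector` are hypothetical. It assumes well-formed HTML and supports only a single compound selector of the form `tag#id.class`. The JSON output is also an assumption, suggested only by the project's `json` tag.

```python
import json
import re
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def parse_simple_selector(selector):
    """Split a compound selector like 'tag#id.a.b' into (tag, id, classes)."""
    tag_match = re.match(r"[A-Za-z][A-Za-z0-9]*", selector)
    id_match = re.search(r"#([\w-]+)", selector)
    classes = set(re.findall(r"\.([\w-]+)", selector))
    tag = tag_match.group(0).lower() if tag_match else None
    return tag, (id_match.group(1) if id_match else None), classes


class SelectorExtractor(HTMLParser):
    """Collects the text content of every element matching the selector."""

    def __init__(self, selector):
        super().__init__()
        self.tag, self.id, self.classes = parse_simple_selector(selector)
        self.depth = 0
        self.open_matches = []  # (depth, result index, text parts) per open match
        self.results = []       # text of matches, in document order

    def _matches(self, tag, attrs):
        attrs = dict(attrs)
        if self.tag and tag != self.tag:
            return False
        if self.id and attrs.get("id") != self.id:
            return False
        element_classes = set((attrs.get("class") or "").split())
        return self.classes <= element_classes

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.depth += 1
        if self._matches(tag, attrs):
            self.results.append(None)  # reserve a slot to keep document order
            self.open_matches.append((self.depth, len(self.results) - 1, []))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_matches and self.open_matches[-1][0] == self.depth:
            _, index, parts = self.open_matches.pop()
            self.results[index] = "".join(parts).strip()
        self.depth -= 1

    def handle_data(self, data):
        # Text belongs to every matched element that is still open.
        for _, _, parts in self.open_matches:
            parts.append(data)


def select_text(html, selector):
    """Return the text of each element in `html` matching `selector`."""
    parser = SelectorExtractor(selector)
    parser.feed(html)
    parser.close()
    return parser.results


SAMPLE = ('<ul id="menu"><li class="item">Home</li>'
          '<li class="item active">Docs</li><li>About</li></ul>')

if __name__ == "__main__":
    print(json.dumps(select_text(SAMPLE, "li.item")))  # prints ["Home", "Docs"]
```

A real tool would normally rely on a full HTML parser and selector engine that handles combinators, attribute selectors, and malformed markup. The sketch only shows the basic idea of matching elements and collecting their content.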

GitHub

236 stars
6 watching
5 forks
Language: Python
last commit: almost 2 years ago
Tags: command-line, command-line-tool, css-selector, json, python

Related projects:

Repository | Description | Stars
feichao93/temme | A lightweight, CSS-based selector for extracting structured data from HTML documents. | 273
tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable. | 343
danburzo/hred | Extracts data from HTML or XML documents to JSON using a CSS selector-like query language. | 69
scrapy/scrapely | A pure-Python library for extracting structured data from HTML pages. | 1,863
syntax-tree/hast-util-to-text | Utility function to extract plain text from HTML-like data structures. | 19
mischov/meeseeks | A parser and extractor for HTML and XML data with CSS or XPath selectors. | 316
alir3z4/html2text | Converts HTML to plain text that can be easily read and formatted as Markdown. | 1,845
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104
anthonygore/html-critical-webpack-plugin | A Webpack plugin that extracts critical CSS from HTML files and inlines it into the page. | 447
rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,937
utkarshkukreti/select.rs | A Rust library for extracting useful data from HTML documents. | 974
dejan/auto_html | Transforms plain text into HTML code using a pipeline of filters. | 786
dwisiswant0/galer | A tool to extract URLs from HTML attributes by crawling in and evaluating JavaScript. | 253
slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662
s0rg/crawley | A utility for systematically extracting URLs from web pages and printing them to the console. | 263