node-readability
Web scraper
Automates web page scraping and text extraction to make any webpage readable
Scrape or crawl articles from any site automatically. Make any web page readable, whether it is in Chinese or English.
343 stars
11 watching
36 forks
Language: JavaScript
Last commit: over 6 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A framework for automating web scraping and crawling tasks using Node.js | 518 |
| | A Node.js library and CLI for scraping websites using Puppeteer and YAML definitions | 44 |
| | Extracts readable content from web pages using Open Graph and traditional readability rules. | 69 |
| | A plugin to assess text readability using various algorithms | 94 |
| | Tool that extracts content from HTML documents based on CSS selectors | 236 |
| | An object inspection utility that produces human-readable representations of objects across different platforms and environments. | 22 |
| | A tool that extracts and converts Galician Official journal documents to different formats based on input year. | 0 |
| | A Node.js-based web scraping project to extract photos from popular Chinese women's interest websites. | 247 |
| | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315 |
| | Provides a Node.js implementation of the core streams classes for userland development | 1,033 |
| | A Node.js module for crawling web sites and scraping their content | 254 |
| | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
| | Tools for extracting translatable strings from source code written in template languages. | 77 |
| | A userscript project that parses Korean news site and makes the content more readable | 191 |
| | A lightweight search module for Node.js applications using Redis as the backing store. | 890 |