node-readability

Web scraper

Automates web page scraping and text extraction to make any webpage readable

Scrape or crawl articles from any site automatically. Make any web page readable, whether it is in Chinese or English.

GitHub

343 stars

11 watching

36 forks

Language: JavaScript

Last commit: almost 8 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

sindresorhus/awesome-nodejs

Related projects:

Repository	Description	Stars
zhuyingda/webster	A framework for automating web scraping and crawling tasks using Node.js	518
joseconstela/webparsy	A Node.js library and CLI for scraping websites using Puppeteer and YAML definitions	44
philipjkim/goreadability	Extracts readable content from web pages using Open Graph and traditional readability rules.	69
retextjs/retext-readability	A plugin to assess text readability using various algorithms	94
plainas/tq	Tool that extracts content from HTML documents based on CSS selectors	236
chaijs/loupe	An object inspection utility that produces human-readable representations of objects across different platforms and environments.	22
jjelosua/doga_scraper	A tool that extracts Galician Official Journal documents for a given input year and converts them to different formats.	0
litt1e-p/weapp-girls	A Node.js-based web scraping project to extract photos from popular Chinese women's interest websites.	247
felipecsl/wombat	A Ruby-based web crawler and data extraction tool with an elegant DSL.	1,315
nodejs/readable-stream	Provides a Node.js implementation of the core streams classes for userland development	1,033
amoilanen/js-crawler	A Node.js module for crawling web sites and scraping their content	254
miyagawa/web-scraper	A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface.	104
gmarty/xgettext	Tools for extracting translatable strings from source code written in template languages.	77
disjukr/just-news	A userscript project that parses Korean news sites and makes their content more readable	191
tj/reds	A lightweight search module for Node.js applications using Redis as the backing store.	890