webster

Web scraper

A framework for automating web scraping and crawling tasks using Node.js

A reliable high-level web crawling and scraping framework for Node.js.

GitHub

515 stars
33 watching
57 forks
Language: JavaScript
Last commit: 19 days ago
Linked from 1 awesome list

automation-test, automation-ui, chromium, crawler, crawling, headless-chrome, javascript, javascript-framework, nodejs, nodejs-framework, puppeteer, scraping-framework, spider

Related projects:

Repository | Description | Stars
spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140
joseconstela/webparsy | A Node.js library and CLI for scraping websites using Puppeteer and YAML definitions. | 44
yhat/scrape | A collection of utility functions and tools to simplify web scraping in Go. | 1,513
tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable. | 343
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 686
holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,153
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104
joncanning/skyscraper | A framework for building asynchronous web scrapers and crawlers using async/await and Reactive Extensions. | 58
amoilanen/js-crawler | A Node.js module for crawling web sites and scraping their content. | 253
howie6879/ruia | An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling. | 1,752
scrapy/scrapely | A pure-Python library for extracting structured data from HTML pages. | 1,863
dyweb/scrala | A web crawling framework written in Scala that allows users to define the start URL and parse the response from it. | 113
tidyverse/rvest | A package for extracting data from web pages using HTML parsing and CSS/XPath selectors. | 1,492
propublica/upton | A web scraping framework that simplifies the process by handling repetitive tasks and provides options for efficient data retrieval. | 1,613
jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages. | 35