webster
Web scraper
A framework for automating web scraping and crawling tasks using Node.js
A reliable high-level web crawling & scraping framework for Node.js.
515 stars
33 watching
57 forks
Language: JavaScript
Last commit: 19 days ago
Linked from 1 awesome list
automation-test, automation-ui, chromium, crawler, crawling, headless-chrome, javascript, javascript-framework, nodejs, nodejs-framework, puppeteer, scraping-framework, spider
Related projects:
Repository | Description | Stars |
---|---|---|
spider-rs/spider | A web crawler and scraper written in Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140 |
joseconstela/webparsy | A Node.js library and CLI for scraping websites using Puppeteer and YAML definitions | 44 |
yhat/scrape | A collection of utility functions and tools to simplify web scraping in Go. | 1,513 |
tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable | 343 |
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 686 |
holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,153 |
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
joncanning/skyscraper | A framework for building asynchronous web scrapers and crawlers using async/await and Reactive Extensions. | 58 |
amoilanen/js-crawler | A Node.js module for crawling web sites and scraping their content | 253 |
howie6879/ruia | An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling | 1,752 |
scrapy/scrapely | A pure-python library for extracting structured data from HTML pages. | 1,863 |
dyweb/scrala | A web crawling framework written in Scala that lets users define a start URL and parse the response from it | 113 |
tidyverse/rvest | A package for extracting data from web pages using HTML parsing and CSS/XPath selectors. | 1,492 |
propublica/upton | A web scraping framework that simplifies the process by handling repetitive tasks and provides options for efficient data retrieval | 1,613 |
jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages | 35 |