webparsy

Website scraper

A Node.js library and CLI for scraping websites from YAML definitions, with or without Puppeteer

GitHub

44 stars

4 watching

7 forks

Language: JavaScript

Last commit: over 3 years ago

Linked from 1 awesome list

Tags: browser, chrome, headless, nodejs, puppeteer, yaml

Repository: joseconstela/webparsy

www.npmjs.com/package/webparsy

Backlinks from these awesome lists:

angrykoala/awesome-browser-automation

Related projects:

Repository	Description	Stars
amoilanen/js-crawler	A Node.js module for crawling websites and scraping their content	254
zhuyingda/webster	A framework for automating web scraping and crawling tasks using Node.js	518
benibela/xidel	A tool to extract data from web pages using various query languages and selectors.	690
jakopako/goskyr	A tool to simplify web scraping of list-like structured data from web pages	36
tjatse/node-readability	Automates web page scraping and text extraction to make any webpage readable	343
fanyong920/jvppeteer	A Java library that provides a headless Chrome browser solution for automation and testing purposes.	737
spider-rs/spider	A tool for web data extraction and processing using Rust	1,234
felipecsl/wombat	A Ruby-based web crawler and data extraction tool with an elegant DSL.	1,315
spekulatius/phpscraper	A web scraping utility for PHP that simplifies the process of extracting information from websites.	544
miyagawa/web-scraper	A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface.	104
jaimeiniesta/metainspector	A Ruby gem for web scraping and extracting metadata from web pages.	1,038
davemolk/gogetjs	Tools for extracting and analyzing JavaScript files from web pages	41
oscarotero/embed	A PHP library to retrieve metadata and embed code from any web page	2,100
hlaueriksson/puppeteer-sharp-contrib	Extensions to the .NET API for automating Chrome browser tests	82
postmodern/spidr	A Ruby web crawling library that provides flexible and customizable methods to crawl websites	809