node-crawler

Web scraper

A Node.js-based web crawler and spider that extracts data from websites.

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

GitHub

7k stars

255 watching

876 forks

Language: TypeScript

last commit: almost 2 years ago

Linked from 2 awesome lists

Tags: cheerio, crawler, extract-data, javascript, jquery, nodejs, spider

Related projects:

Repository	Description	Stars
apify/crawlee	A tool for building reliable web scraping and browser automation pipelines in Node.js.	16,081
yujiosaka/headless-chrome-crawler	A distributed crawling framework that leverages Headless Chrome to scrape dynamic websites.	5,534
ruipgil/scraperjs	A versatile web scraping module with two scrapers for static and dynamic content extraction.	3,714
rchipka/node-osmosis	A fast and flexible web scraping library using native libxml C bindings.	4,115
npm/cli	A package manager for JavaScript that enables users to manage and install dependencies for web applications.	8,558
sindresorhus/got	A powerful HTTP client library for Node.js that provides a human-friendly and flexible way to make requests.	14,351
axios/axios	An HTTP client library for making requests to web servers using the Promise API.	105,978
veliovgroup/spiderable-middleware	Intercepts requests from web crawlers and proxies them to a prerendering service that renders the HTML.	39
unclecode/crawl4ai	A web crawling tool designed to extract structured data from the web for use in AI applications.	18,541
macbre/phantomas	A tool for collecting and monitoring web performance metrics in a headless Chromium browser environment.	2,257
node-formidable/formidable	A module for parsing multipart form data, especially file uploads in Node.js applications.	7,076
spatie/crawler	A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently.	2,552
code4craft/webmagic	A framework for building scalable web crawlers in Java.	11,456
matthewmueller/x-ray	A flexible web scraping framework for extracting data from websites with customizable selectors and pagination support.	5,883
sjdirect/abot	A C# web crawler framework built for speed and flexibility, allowing developers to easily crawl websites with customizable logic.	2,255