node-crawler

Web scraper

A Node.js-based web crawler and spider that extracts data from websites.

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

GitHub

7k stars
255 watching
876 forks
Language: TypeScript
Last commit: 6 months ago
Linked from 2 awesome lists

cheerio, crawler, extract-data, javascript, jquery, nodejs, spider

Related projects:

Repository | Description | Stars
apify/crawlee | A tool for building reliable web scraping and browser automation pipelines in Node.js. | 16,081
yujiosaka/headless-chrome-crawler | A distributed crawling framework that leverages Headless Chrome to scrape dynamic websites. | 5,534
ruipgil/scraperjs | A versatile web scraping module with two scrapers for static and dynamic content extraction. | 3,714
rchipka/node-osmosis | A fast and flexible web scraping library using native libxml C bindings. | 4,115
npm/cli | A package manager for JavaScript that enables users to manage and install dependencies for web applications. | 8,558
sindresorhus/got | A powerful HTTP client library for Node.js that provides a human-friendly and flexible way to make requests. | 14,351
axios/axios | An HTTP client library for making requests to web servers using the Promise API. | 105,978
veliovgroup/spiderable-middleware | Intercepts requests from web crawlers and proxies them to a prerendering service for rendering HTML. | 39
unclecode/crawl4ai | A web crawling tool designed to extract structured data from the web for use in AI applications. | 18,541
macbre/phantomas | A tool for collecting and monitoring web performance metrics in a headless Chromium browser environment. | 2,257
node-formidable/formidable | A module for parsing multipart form data, especially file uploads in Node.js applications. | 7,076
spatie/crawler | A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently. | 2,552
code4craft/webmagic | A framework for building scalable web crawlers in Java. | 11,456
matthewmueller/x-ray | A flexible web scraping framework for extracting data from websites with customizable selectors and pagination support. | 5,883
sjdirect/abot | A C# web crawler framework built for speed and flexibility, allowing developers to easily crawl websites with customizable logic. | 2,255