node-crawler

Web scraper

A Node.js web crawler and spider that extracts data from websites.

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

GitHub

7k stars
255 watching
875 forks
Language: TypeScript
Last commit: 4 months ago
Linked from 2 awesome lists

cheerio, crawler, extract-data, javascript, jquery, nodejs, spider

Related projects:

Repository | Description | Stars
apify/crawlee | A tool for building reliable web scraping and browser automation pipelines in Node.js. | 15,604
yujiosaka/headless-chrome-crawler | A distributed crawling framework that leverages Headless Chrome to scrape dynamic websites. | 5,527
ruipgil/scraperjs | A versatile web scraping module with two scrapers for static and dynamic content extraction. | 3,710
rchipka/node-osmosis | A fast and flexible web scraping library using native libxml C bindings. | 4,116
npm/cli | A package manager for JavaScript that enables users to manage and install dependencies for web applications. | 8,493
sindresorhus/got | A powerful HTTP client library for Node.js that provides a human-friendly and flexible way to make requests. | 14,301
axios/axios | An HTTP client library for making requests to web servers using the Promise API. | 105,804
veliovgroup/spiderable-middleware | Intercepts requests from web crawlers and proxies them to a prerendering service for rendering HTML. | 38
unclecode/crawl4ai | A tool for web crawling and data extraction, designed to work with large language models. | 16,180
macbre/phantomas | A tool for collecting and monitoring web performance metrics in a headless Chromium browser environment. | 2,258
node-formidable/formidable | A module for parsing multipart form data, especially file uploads in Node.js applications. | 7,055
spatie/crawler | A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently. | 2,537
code4craft/webmagic | A scalable framework for building web crawlers in Java. | 11,432
matthewmueller/x-ray | A flexible web scraping framework for extracting data from websites with customizable selectors and pagination support. | 5,878
sjdirect/abot | A C# web crawler framework built for speed and flexibility, allowing developers to easily crawl websites with customizable logic. | 2,247