spider

Web scraper

A web crawler and scraper written in Rust, designed for flexible, configurable extraction of data from the web.

A web crawler and scraper for Rust

GitHub

1k stars
13 watching
99 forks
Language: Rust
Last commit: 6 days ago
Linked from 1 awesome list

ai-scraping, crawler, headless-chrome, indexer, llm-crawler, rust, spider, web-crawler, web-scraper, web-scraping

Related projects:

Repository | Description | Stars
rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,937
postmodern/spidr | A Ruby web crawling library that provides flexible and customizable methods to crawl websites. | 806
tidyverse/rvest | A package for extracting data from web pages using HTML parsing and CSS/XPath selectors. | 1,492
holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,153
zhuyingda/webster | A framework for automating web scraping and crawling tasks using Node.js. | 515
elixir-crawly/crawly | A framework for extracting structured data from websites. | 987
spekulatius/phpscraper | A web scraping utility for PHP that simplifies the process of extracting information from websites. | 536
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 378
webrecorder/browsertrix-crawler | A containerized browser-based crawler system for capturing web content in a high-fidelity and customizable manner. | 652
3nock/spidersuite | A cross-platform web spider/crawler tool for analyzing and mapping attack surfaces. | 601
skallwar/suckit | A Rust-based web scraping tool that recursively visits and downloads websites to disk. | 747
fredwu/crawler | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945
rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling. | 340
utkarshkukreti/select.rs | A Rust library for extracting useful data from HTML documents. | 974
crypto-crawler/crypto-crawler-rs | A Rust-based library for building and managing cryptocurrency crawlers. | 232