spider

Web scraper

A tool for web data extraction and processing using Rust

A web crawler and scraper for Rust

GitHub

1k stars
14 watching
107 forks
Language: Rust
Last commit: about 1 month ago
Linked from 1 awesome list

Tags: ai-scraping, crawler, headless-chrome, indexer, llm-crawler, rust, spider, web-crawler, web-scraper, web-scraping

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,961 |
| postmodern/spidr | A Ruby web crawling library that provides flexible and customizable methods to crawl websites | 809 |
| tidyverse/rvest | A package for extracting data from web pages using HTML parsing and CSS/XPath selectors. | 1,495 |
| holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,155 |
| zhuyingda/webster | A framework for automating web scraping and crawling tasks using Node.js | 518 |
| elixir-crawly/crawly | A framework for extracting structured data from websites | 994 |
| spekulatius/phpscraper | A web scraping utility for PHP that simplifies the process of extracting information from websites. | 544 |
| brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 380 |
| webrecorder/browsertrix-crawler | A containerized browser-based crawler system for capturing web content in a high-fidelity and customizable manner. | 677 |
| 3nock/spidersuite | A cross-platform web spider/crawler tool for analyzing and mapping attack surfaces | 614 |
| skallwar/suckit | A Rust-based web scraping tool that recursively visits and downloads websites to disk. | 750 |
| fredwu/crawler | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945 |
| rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling | 340 |
| utkarshkukreti/select.rs | A Rust library for extracting useful data from HTML documents | 974 |
| crypto-crawler/crypto-crawler-rs | A Rust-based library for building and managing cryptocurrency crawlers | 235 |