spidr
Crawler library
A Ruby web crawling library that provides flexible and customizable methods to crawl websites
A versatile Ruby web spidering library that can spider a single site, multiple domains, only certain links, or crawl indefinitely. Spidr is designed to be fast and easy to use.
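To make the scoping modes in the description concrete (one site, several hosts, only certain links, and a bound on an otherwise endless crawl), here is a minimal standard-library sketch. It is not Spidr's implementation and does not use Spidr's API. The `PAGES` link graph, the `crawl` method, and its `hosts:`, `ignore_links:` and `limit:` parameters are all hypothetical names chosen for illustration, and the "web" is an in-memory hash so the example runs without network access.

```ruby
require 'uri'
require 'set'

# Hypothetical in-memory "web": each URL maps to the links found on that page.
PAGES = {
  'http://example.com/'       => ['http://example.com/about',
                                  'http://example.com/blog/1',
                                  'http://other.test/'],
  'http://example.com/about'  => ['http://example.com/'],
  'http://example.com/blog/1' => ['http://example.com/blog/2'],
  'http://example.com/blog/2' => [],
  'http://other.test/'        => ['http://other.test/page'],
  'http://other.test/page'    => []
}.freeze

# Breadth-first crawl starting at start_url (illustrative only).
#   hosts:        host names that may be visited; defaults to the start URL's host
#   ignore_links: patterns; any link matching one of them is skipped
#   limit:        maximum number of pages to visit, so the crawl always terminates
def crawl(start_url, hosts: nil, ignore_links: [], limit: 100)
  hosts ||= [URI(start_url).host]
  visited = []
  seen = Set[start_url]
  queue = [start_url]

  until queue.empty? || visited.size >= limit
    url = queue.shift
    visited << url

    PAGES.fetch(url, []).each do |link|
      next if seen.include?(link)                       # already queued or visited
      next unless hosts.include?(URI(link).host)        # outside the allowed hosts
      next if ignore_links.any? { |pattern| pattern.match?(link) }

      seen << link
      queue << link
    end
  end

  visited
end

single_site = crawl('http://example.com/')
two_hosts   = crawl('http://example.com/', hosts: ['example.com', 'other.test'])
no_blog     = crawl('http://example.com/', ignore_links: [%r{/blog/}])
```

In this sketch `single_site` stays on `example.com`, `two_hosts` also follows links into `other.test`, and `no_blog` skips every URL containing `/blog/`. A real crawler would fetch pages over HTTP and parse links from the HTML instead of reading them from a hash.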
808 stars
28 watching
106 forks
Language: Ruby
last commit: 10 months ago
Linked from 3 awesome lists
Tags: crawler, ruby, scraper, spider, spider-links, web, web-crawler, web-scraper, web-scraping, web-spider
Related projects:
Repository | Description | Stars |
---|---|---|
rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling | 341 |
spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,185 |
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315 |
twiny/spidy | Tools to crawl websites and collect domain names with availability status | 150 |
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 380 |
feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23 |
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,828 |
joenorton/rubyretriever | A Ruby-based tool for web crawling and data extraction, aiming to be a replacement for paid software in the SEO space. | 143 |
crypto-crawler/crypto-crawler-rs | A Rust-based library for building and managing cryptocurrency crawlers | 234 |
bplawler/crawler | A Scala-based DSL for programmatically accessing and interacting with web pages | 148 |
dyweb/scrala | A web crawling framework written in Scala that allows users to define the start URL and parse response from it | 113 |
turnersoftware/infinitycrawler | A web crawling library for .NET that allows customizable crawling and throttling of websites. | 248 |
internetarchive/brozzler | A distributed web crawler that fetches and extracts links from websites using a real browser. | 673 |
holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,154 |
jaimeiniesta/metainspector | A Ruby gem for web scraping and extracting metadata from web pages. | 1,037 |