owlcrawler

Crawler

A distributed web crawler that coordinates crawling tasks across multiple worker processes using a message bus.

Crawl the web using nats.io and Go

Stars: 55
Watching: 9
Forks: 4
Language: Go
Last commit: over 9 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,036 |
| fredwu/crawler | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945 |
| hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,827 |
| brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 380 |
| cocrawler/cocrawler | A versatile web crawler built with modern tools and concurrency to handle various crawl tasks. | 188 |
| turnersoftware/infinitycrawler | A web crawling library for .NET that allows customizable crawling and throttling of websites. | 248 |
| wspl/creeper | A framework for building cross-platform web crawlers using Go. | 780 |
| webrecorder/browsertrix-crawler | A containerized browser-based crawler system for capturing web content in a high-fidelity and customizable manner. | 677 |
| antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 261 |
| puerkitobio/fetchbot | A flexible web crawler that follows robots.txt policies and crawl delays. | 787 |
| jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 188 |
| internetarchive/brozzler | A distributed web crawler that fetches and extracts links from websites using a real browser. | 678 |
| apache/incubator-stormcrawler | A scalable and versatile web crawling framework based on Apache Storm. | 895 |
| zhegexiaohuozi/seimicrawler | A distributed crawler framework that simplifies the process of building crawlers using Spring Boot and Redis. | 1,980 |
| mvdbos/php-spider | A flexible PHP web crawler with configurable traversal algorithms and filters. | 1,336 |