owlcrawler

Crawler

A distributed web crawler that coordinates crawling tasks across multiple worker processes using a message bus.

Crawl the web using nats.io and Go
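
The description above (crawl tasks spread across several workers through a message bus) can be illustrated with a minimal sketch. This is not owlcrawler's code. It assumes a Go channel as an in-process stand-in for the bus and goroutines as stand-ins for worker processes, and the names `crawlTask`, `crawlResult`, `runWorkers`, and the `fetch` callback are hypothetical. In NATS terms, the competing-consumer behaviour shown here is roughly what a queue group provides.

```go
package main

import (
	"fmt"
	"sync"
)

// crawlTask is a hypothetical task message; the real project may use a different format.
type crawlTask struct {
	URL string
}

// crawlResult pairs a URL with whatever the fetcher returned for it.
type crawlResult struct {
	URL  string
	Body string
}

// runWorkers publishes one task per URL onto a channel (a stand-in for a
// message-bus subject) and lets the workers compete for tasks, so each task
// is handled by exactly one worker.
func runWorkers(urls []string, workers int, fetch func(string) string) map[string]string {
	if workers < 1 {
		workers = 1 // at least one consumer is needed to drain the tasks
	}
	tasks := make(chan crawlTask)
	results := make(chan crawlResult)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range tasks {
				results <- crawlResult{URL: t.URL, Body: fetch(t.URL)}
			}
		}()
	}

	// Publisher: send every task, then signal that no more work is coming.
	go func() {
		for _, u := range urls {
			tasks <- crawlTask{URL: u}
		}
		close(tasks)
	}()

	// Close the results channel once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	out := make(map[string]string, len(urls))
	for r := range results {
		out[r.URL] = r.Body
	}
	return out
}

func main() {
	// A fake fetcher keeps the sketch offline; a real worker would issue an HTTP request.
	fetch := func(u string) string { return "fetched " + u }
	pages := runWorkers([]string{"https://example.com/a", "https://example.com/b"}, 2, fetch)
	fmt.Println(len(pages), "pages crawled")
}
```

Swapping the channel for a real bus would mainly change how tasks are published and received. The worker loop itself (take a task, fetch, report a result) would stay much the same.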

55 stars
9 watching
4 forks
Language: Go
last commit: about 9 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,038 |
| fredwu/crawler | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945 |
| hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826 |
| brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 378 |
| cocrawler/cocrawler | A versatile web crawler built with modern tools and concurrency to handle various crawl tasks. | 187 |
| turnersoftware/infinitycrawler | A web crawling library for .NET that allows customizable crawling and throttling of websites. | 248 |
| wspl/creeper | A framework for building cross-platform web crawlers using Go. | 780 |
| webrecorder/browsertrix-crawler | A containerized browser-based crawler system for capturing web content in a high-fidelity and customizable manner. | 652 |
| antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 260 |
| puerkitobio/fetchbot | A flexible web crawler that follows robots.txt policies and crawl delays. | 786 |
| jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 186 |
| internetarchive/brozzler | A distributed web crawler that fetches and extracts links from websites using a real browser. | 671 |
| apache/incubator-stormcrawler | A collection of resources for building web crawlers on Apache Storm using Java. | 891 |
| zhegexiaohuozi/seimicrawler | An agile and distributed crawler framework designed to simplify and speed up web scraping with Spring Boot support. | 1,980 |
| mvdbos/php-spider | A flexible PHP web crawler with configurable traversal algorithms and filters. | 1,332 |