crawley

Crawler

A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options.

Pythonic crawling/scraping framework based on non-blocking I/O operations.

GitHub

188 stars
22 watching
33 forks
Language: Python
Last commit: almost 2 years ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
elliotgao2/gain | A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites. | 2,037
stewartmckee/cobweb | A flexible web crawler that can be used to extract data from websites in a scalable and efficient manner. | 226
rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling. | 340
hominee/dyer | A fast and flexible web crawling tool with features like asynchronous I/O and event-driven design. | 135
manning23/mspider | A Python-based tool for web crawling and data collection from various websites. | 348
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,827
puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,036
untwisted/sukhoi | A minimalist web crawler framework built on top of miners and structure-based data extraction. | 879
internetarchive/brozzler | A distributed web crawler that fetches and extracts links from websites using a real browser. | 678
cocrawler/cocrawler | A versatile web crawler built with modern tools and concurrency to handle various crawl tasks. | 188
iamstoxe/urlgrab | A tool to crawl websites by exploring links recursively with support for JavaScript rendering. | 331
xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage. | 1,828
zhegexiaohuozi/seimicrawler | A distributed crawler framework that simplifies the process of building crawlers using Spring Boot and Redis. | 1,980
antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 261
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files. | 1,406