crawley

Crawler

A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options.

Pythonic crawling/scraping framework based on non-blocking I/O operations.

GitHub

188 stars

22 watching

33 forks

Language: Python

Last commit: over 2 years ago

Linked from 1 awesome list

project.crawley-cloud.com

Backlinks from these awesome lists:

brucedone/awesome-crawler

Related projects:

Repository	Description	Stars
elliotgao2/gain	A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites.	2,037
stewartmckee/cobweb	A flexible web crawler for extracting data from websites in a scalable and efficient manner.	226
rivermont/spidy	A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling.	340
hominee/dyer	A fast and flexible web crawling tool with features like asynchronous I/O and event-driven design.	135
manning23/mspider	A Python-based tool for web crawling and data collection from various websites.	348
hu17889/go_spider	A modular, concurrent web crawler framework written in Go.	1,827
puerkitobio/gocrawl	A concurrent web crawler written in Go that allows flexible and polite crawling of websites.	2,036
untwisted/sukhoi	A minimalist web crawler framework built on top of miners and structure-based data extraction	879
internetarchive/brozzler	A distributed web crawler that fetches and extracts links from websites using a real browser.	678
cocrawler/cocrawler	A versatile web crawler built with modern tools and concurrency to handle various crawl tasks.	188
iamstoxe/urlgrab	A tool to crawl websites by exploring links recursively with support for JavaScript rendering.	331
xianhu/pspider	A Python web crawler framework with support for multi-threading and proxy usage.	1,828
zhegexiaohuozi/seimicrawler	A distributed crawler framework that simplifies the process of building crawlers using Spring Boot and Redis.	1,980
antchfx/antch	A framework for building fast and efficient web crawlers and scrapers in Go.	261
archiveteam/grab-site	A web crawler designed to backup websites by recursively crawling and writing WARC files.	1,406