crawley
Category: Crawler
A Pythonic crawling/scraping framework based on non-blocking I/O, built for high-speed web crawlers with flexible data extraction and storage options.
- Stars: 186
- Watching: 22
- Forks: 33
- Language: Python
- Last commit: over 1 year ago
- Linked from: 1 awesome list
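
Crawley projects are defined declaratively as a handful of classes: a persistence model, a scraper, and a crawler. The sketch below follows the style of the example in crawley's documentation; the model fields, XPath expressions, and seed URL are illustrative assumptions, and exact APIs may vary by version.

```python
# Minimal crawley project sketch: a model plus a scraper/crawler pair.
# Field names, the XPath, and the seed URL are illustrative assumptions.
from crawley.crawlers import BaseCrawler
from crawley.scrapers import BaseScraper
from crawley.extractors import XPathExtractor
from crawley.persistance import Entity, TextField


class Package(Entity):
    # Each field becomes a column in the configured storage backend.
    name = TextField()
    description = TextField()


class PackageScraper(BaseScraper):
    # URL patterns this scraper handles ("%" matches any URL).
    matching_urls = ["%"]

    def scrape(self, response):
        # Walk table rows in the parsed HTML and persist them as entities.
        for row in response.html.xpath("//table//tr"):
            cells = row.xpath("./td")
            if len(cells) >= 2:
                Package(name=cells[0].text_content(),
                        description=cells[1].text_content())


class PackageCrawler(BaseCrawler):
    start_urls = ["http://pypi.python.org/pypi"]  # seed URLs to fetch
    scrapers = [PackageScraper]                   # scrapers applied to pages
    max_depth = 0                                 # link-following depth limit
    extractor = XPathExtractor                    # HTML parsing backend
```

In the documented workflow, a project of this shape is scaffolded with `crawley startproject <name>` and executed with `crawley run`.
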
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| elliotgao2/gain | A Python web crawling framework using asyncio and aiohttp for efficient data extraction from websites. | 2,035 |
| stewartmckee/cobweb | A flexible web crawler for extracting data from websites in a scalable, efficient manner. | 226 |
| rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can run in parallel for efficient crawling. | 340 |
| hominee/dyer | A fast, flexible web crawling tool featuring asynchronous I/O and an event-driven design. | 133 |
| manning23/mspider | A Python-based tool for web crawling and data collection from various websites. | 348 |
| hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826 |
| puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,038 |
| untwisted/sukhoi | A minimalist web crawler framework built around miners and structure-based data extraction. | 881 |
| internetarchive/brozzler | A distributed web crawler that fetches and extracts links from websites using a real browser. | 671 |
| cocrawler/cocrawler | A versatile web crawler built with modern tools and concurrency to handle a variety of crawl tasks. | 187 |
| iamstoxe/urlgrab | A tool that crawls websites by exploring links recursively, with support for JavaScript rendering. | 330 |
| xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage. | 1,827 |
| zhegexiaohuozi/seimicrawler | An agile, distributed crawler framework with Spring Boot support, designed to simplify and speed up web scraping. | 1,980 |
| antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 260 |
| archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling them and writing WARC files. | 1,398 |