gain
crawler
A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites.
Web crawling framework based on asyncio.
2k stars
75 watching
208 forks
Language: Python
last commit: over 5 years ago
Linked from 2 awesome lists
aiohttp, asyncio, crawler, python, spider, uvloop
Related projects:
Repository | Description | Stars |
---|---|---|
howie6879/ruia | An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling. | 1,753 |
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 188 |
chenjiandongx/github-spider | A Python-based web crawler for scraping GitHub user and repository data. | 264 |
feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23 |
xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage. | 1,828 |
puerkitobio/fetchbot | A flexible web crawler that follows robots.txt policies and crawl delays. | 787 |
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files. | 1,406 |
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,827 |
a11ywatch/crawler | A high-performance web page crawler. | 51 |
untwisted/sukhoi | A minimalist web crawler framework built on top of miners and structure-based data extraction. | 879 |
fredwu/crawler | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945 |
cocrawler/cocrawler | A versatile web crawler built with modern tools and concurrency to handle various crawl tasks. | 188 |
puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,036 |
rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling. | 340 |
qinxuye/cola | A high-level framework for building distributed data extractors from web pages. | 1,501 |