gain
crawler
A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites.
Web crawling framework based on asyncio.
2k stars
75 watching
208 forks
Language: Python
Last commit: over 5 years ago
Linked from 2 awesome lists
aiohttp, asyncio, crawler, python, spider, uvloop
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling | 1,753 |
| | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 188 |
| | A Python-based web crawler for scraping Github user and repository data. | 264 |
| | A high-level web crawling and scraping framework for Elixir. | 23 |
| | A Python web crawler framework with support for multi-threading and proxy usage. | 1,828 |
| | A flexible web crawler that follows robots.txt policies and crawl delays. | 787 |
| | A web crawler designed to backup websites by recursively crawling and writing WARC files. | 1,406 |
| | A modular, concurrent web crawler framework written in Go. | 1,827 |
| | Performs web page crawling at high performance. | 51 |
| | A minimalist web crawler framework built on top of miners and structure-based data extraction | 879 |
| | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945 |
| | A versatile web crawler built with modern tools and concurrency to handle various crawl tasks | 188 |
| | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,036 |
| | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling | 340 |
| | A high-level framework for building distributed data extractors from web pages | 1,501 |