ruia
Crawler framework
An async Python 3.6+ web scraping micro-framework built on asyncio and aiohttp to simplify URL crawling
2k stars
42 watching
181 forks
Language: Python
Last commit: over 1 year ago
Linked from 3 awesome lists
Tags: aiohttp, asyncio, asyncio-spider, crawler, crawling-framework, middlewares, python, python-ruia, ruia, spider, uvloop
Related projects:
Repository | Description | Stars |
---|---|---|
elliotgao2/gain | A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites. | 2,035 |
dyweb/scrala | A web crawling framework written in Scala that lets users define a start URL and parse the response from it. | 113 |
feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23 |
crawlzone/crawlzone | A PHP framework for asynchronous internet crawling and web scraping | 77 |
elixir-crawly/crawly | A framework for extracting structured data from websites | 987 |
spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140 |
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826 |
xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage. | 1,827 |
zhuyingda/webster | A framework for automating web scraping and crawling tasks using Node.js | 515 |
untwisted/sukhoi | A minimalist web crawler framework built on top of miners and structure-based data extraction | 881 |
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling them and writing WARC files. | 1,402 |
joncanning/skyscraper | A framework for building asynchronous web scrapers and crawlers using async/await and Reactive Extensions. | 58 |
qinxuye/cola | A high-level framework for building distributed data extractors from web pages | 1,500 |
zhegexiaohuozi/seimicrawler | An agile and distributed crawler framework designed to simplify and speed up web scraping with Spring Boot support | 1,980 |
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 186 |