ruia

Crawler framework

An async web scraping micro-framework for Python 3.6+, built on asyncio and aiohttp to simplify URL crawling

GitHub

2k stars
42 watching
181 forks
Language: Python
Last commit: over 1 year ago
Linked from 3 awesome lists

aiohttp, asyncio, asyncio-spider, crawler, crawling-framework, middlewares, python, python-ruia, ruia, spider, uvloop

Related projects:

Repository | Description | Stars
elliotgao2/gain | A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites. | 2,035
dyweb/scrala | A web crawling framework written in Scala that lets users define the start URL and parse the response from it. | 113
feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23
crawlzone/crawlzone | A PHP framework for asynchronous internet crawling and web scraping. | 77
elixir-crawly/crawly | A framework for extracting structured data from websites. | 987
spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826
xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage. | 1,827
zhuyingda/webster | A framework for automating web scraping and crawling tasks using Node.js. | 515
untwisted/sukhoi | A minimalist web crawler framework built on top of miners and structure-based data extraction. | 881
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files. | 1,402
joncanning/skyscraper | A framework for building asynchronous web scrapers and crawlers using async/await and Reactive Extensions. | 58
qinxuye/cola | A high-level framework for building distributed data extractors from web pages. | 1,500
zhegexiaohuozi/seimicrawler | An agile and distributed crawler framework designed to simplify and speed up web scraping, with Spring Boot support. | 1,980
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 186