ruia

Crawler framework

An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling

Async Python 3.6+ web scraping micro-framework based on asyncio

GitHub

2k stars

42 watching

180 forks

Language: Python

Last commit: over 2 years ago

Linked from 3 awesome lists

aiohttp, asyncio, asyncio-spider, crawler, crawling-framework, middlewares, python, python-ruia, ruia, spider, uvloop

www.howie6879.com/ruia/

Related projects:

Repository	Description	Stars
elliotgao2/gain	A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites.	2,037
dyweb/scrala	A web crawling framework written in Scala that lets users define a start URL and parse the response from it.	113
feng19/spider_man	A high-level web crawling and scraping framework for Elixir.	23
crawlzone/crawlzone	A PHP framework for asynchronous internet crawling and web scraping	78
elixir-crawly/crawly	A framework for extracting structured data from websites	994
spider-rs/spider	A tool for web data extraction and processing using Rust	1,234
hu17889/go_spider	A modular, concurrent web crawler framework written in Go.	1,827
xianhu/pspider	A Python web crawler framework with support for multi-threading and proxy usage.	1,828
zhuyingda/webster	A framework for automating web scraping and crawling tasks using Node.js	518
untwisted/sukhoi	A minimalist web crawler framework built on top of miners and structure-based data extraction	879
archiveteam/grab-site	A web crawler designed to backup websites by recursively crawling and writing WARC files.	1,406
joncanning/skyscraper	A framework for building asynchronous web scrapers and crawlers using async/await and Reactive Extensions.	59
qinxuye/cola	A high-level framework for building distributed data extractors from web pages	1,501
zhegexiaohuozi/seimicrawler	A distributed crawler framework that simplifies the process of building crawlers using Spring Boot and Redis	1,980
jmg/crawley	A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options.	188