SeimiCrawler
Crawler framework
An agile and distributed crawler framework designed to simplify and speed up web scraping with Spring Boot support
A simple, agile, distributed Java crawler framework with Spring Boot support.
2k stars
176 watching
682 forks
Language: Java
last commit: over 1 year ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
codesofun/web-bee | A Java framework for building web-based crawlers with features like distributed crawling and proxy support. | 189 |
wspl/creeper | A framework for building cross-platform web crawlers using Go | 780 |
crawlzone/crawlzone | A PHP framework for asynchronous internet crawling and web scraping | 77 |
turnersoftware/infinitycrawler | A web crawling library for .NET that allows customizable crawling and throttling of websites. | 248 |
kiddyuchina/beanbun | A PHP framework for building distributed web crawlers with modular design and extensibility | 1,248 |
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826 |
dyweb/scrala | A web crawling framework written in Scala that lets users define a start URL and parse the response from it | 113 |
apache/incubator-stormcrawler | A collection of resources for building web crawlers on Apache Storm using Java | 891 |
untwisted/sukhoi | A minimalist web crawler framework built on top of miners and structure-based data extraction | 881 |
qinxuye/cola | A high-level framework for building distributed data extractors from web pages | 1,500 |
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 186 |
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 378 |
howie6879/ruia | An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling | 1,752 |
hominee/dyer | A fast and flexible web crawling tool with features like asynchronous I/O and event-driven design. | 133 |
antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 260 |