go_spider
Crawler
A modular, concurrent web crawler framework written in Go.
[Crawler framework (golang)] An awesome concurrent crawler (spider) framework for Go. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.
2k stars
154 watching
471 forks
Language: Go
last commit: about 7 years ago
Linked from 1 awesome list
Tags: crawler, go, pipeline, schedule, spider
Related projects:
Repository | Description | Stars |
---|---|---|
puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,038 |
wspl/creeper | A framework for building cross-platform web crawlers using Go. | 780 |
antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 260 |
feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23 |
chenjiandongx/github-spider | A Python-based web crawler for scraping GitHub user and repository data. | 264 |
iamstoxe/urlgrab | A tool to crawl websites by exploring links recursively with support for JavaScript rendering. | 330 |
fmpwizard/owlcrawler | A distributed web crawler that coordinates crawling tasks across multiple worker processes using a message bus. | 55 |
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 378 |
postmodern/spidr | A Ruby web crawling library that provides flexible and customizable methods to crawl websites. | 806 |
elixir-crawly/crawly | A framework for extracting structured data from websites | 987 |
dyweb/scrala | A web crawling framework written in Scala that lets users define a start URL and parse the response from it. | 113 |
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 186 |
puerkitobio/fetchbot | A flexible web crawler that follows robots.txt policies and crawl delays. | 786 |
3nock/spidersuite | A cross-platform web spider/crawler tool for analyzing and mapping attack surfaces. | 608 |
mvdbos/php-spider | A flexible PHP web crawler with configurable traversal algorithms and filters. | 1,332 |