antch

Web crawler

A framework for building fast and efficient web crawlers and scrapers in Go.

Antch is a fast, powerful, and extensible web crawling and scraping framework for Go.

GitHub

260 stars
16 watching
41 forks
Language: Go
Last commit: over 4 years ago
Linked from 2 awesome lists

crawler, crawling, framework, golang, scraping, web-crawler, web-spider

Related projects:

Repository | Description | Stars
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826
wspl/creeper | A framework for building cross-platform web crawlers using Go. | 780
antchfx/htmlquery | A Golang package for extracting data from HTML documents using XPath expressions. | 738
puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,038
feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23
elixir-crawly/crawly | A framework for extracting structured data from websites. | 987
iamstoxe/urlgrab | A tool to crawl websites by exploring links recursively with support for JavaScript rendering. | 330
fmpwizard/owlcrawler | A distributed web crawler that coordinates crawling tasks across multiple worker processes using a message bus. | 55
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 186
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 378
fredwu/crawler | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945
yhat/scrape | A collection of utility functions and tools to simplify web scraping in Go. | 1,513
antchfx/xpath | A Go package for querying and selecting nodes from various document types using XPath expressions. | 694
zhegexiaohuozi/seimicrawler | An agile and distributed crawler framework designed to simplify and speed up web scraping, with Spring Boot support. | 1,980
slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662