urlgrab

Link crawler

A tool to crawl websites by exploring links recursively with support for JavaScript rendering.

A Go utility that spiders through a website in search of additional links.

GitHub

Stars: 330
Watching: 10
Forks: 60
Language: Go
Last commit: about 4 years ago
Linked from 1 awesome list

spider

Related projects:

Repository | Description | Stars
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options. | 186
antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 260
s0rg/crawley | A utility for systematically extracting URLs from web pages and printing them to the console. | 263
puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,038
amoilanen/js-crawler | A Node.js module for crawling web sites and scraping their content. | 253
dwisiswant0/galer | A tool to extract URLs from HTML attributes by crawling in and evaluating JavaScript. | 253
internetarchive/brozzler | A distributed web crawler that fetches and extracts links from websites using a real browser. | 671
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 378
wspl/creeper | A framework for building cross-platform web crawlers using Go. | 780
stewartmckee/cobweb | A flexible web crawler that can be used to extract data from websites in a scalable and efficient manner. | 226
rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling. | 340
vida-nyu/ache | A web crawler designed to efficiently collect and prioritize relevant content from the web. | 454
twiny/spidy | Tools to crawl websites and collect domain names with availability status. | 149
fmpwizard/owlcrawler | A distributed web crawler that coordinates crawling tasks across multiple worker processes using a message bus. | 55