crawley
The unix-way web crawler: a utility for systematically extracting URLs from web pages and printing them to the console.
265 stars
2 watching
13 forks
Language: Go
last commit: 14 days ago
Linked from 4 awesome lists
Topics: cli, crawler, go, golang, golang-application, pentest, pentest-tool, pentesting, unix-way, web-crawler, web-scraping, web-spider
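For context on what "extracting URLs from web pages and printing them to the console" looks like in practice, below is a minimal Go sketch of that idea: fetch a page, collect `href` attributes, resolve them against the base URL, and print one absolute URL per line. This is an illustrative sketch only, not crawley's actual implementation; the seed URL is a placeholder and the `golang.org/x/net/html` dependency is an assumption of this example.

```go
// Illustrative sketch of unix-way URL extraction: fetch one page,
// walk its DOM, and print every resolvable link to stdout.
// Not crawley's code; the seed URL is a placeholder.
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/url"

	"golang.org/x/net/html"
)

func main() {
	seed := "https://example.com" // placeholder seed URL
	base, err := url.Parse(seed)
	if err != nil {
		log.Fatal(err)
	}

	resp, err := http.Get(seed)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	doc, err := html.Parse(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Recursively walk the parsed HTML tree and print each href
	// resolved against the base URL, one per line.
	var walk func(*html.Node)
	walk = func(n *html.Node) {
		if n.Type == html.ElementNode && n.Data == "a" {
			for _, attr := range n.Attr {
				if attr.Key == "href" {
					if u, err := base.Parse(attr.Val); err == nil {
						fmt.Println(u.String())
					}
				}
			}
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			walk(c)
		}
	}
	walk(doc)
}
```

Printing plain URLs to stdout keeps the output composable with standard tools, e.g. piping into `sort -u` or `grep`, which is the "unix-way" the project name refers to.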
Related projects:
| Repository | Description | Stars |
|---|---|---|
| dwisiswant0/galer | A tool to extract URLs from HTML attributes by crawling pages and evaluating JavaScript. | 253 |
| mvdan/xurls | A tool to extract URLs from text using regular expressions, written in Go. | 1,187 |
| karust/gogetcrawl | A Go tool and package for extracting web archive data from popular sources such as the Wayback Machine and Common Crawl. | 147 |
| 003random/getjs | A tool to efficiently extract JavaScript sources from URLs and web pages. | 712 |
| foolin/pagser | A tool for automatically extracting structured data from HTML pages. | 105 |
| jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages. | 35 |
| eloopwoo/chrome-url-dumper | A tool to extract and dump URLs from Chrome's stored databases. | 34 |
| go-shiori/obelisk | Archives a web page as a single HTML file with embedded resources. | 263 |
| archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files. | 1,402 |
| slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662 |
| iamstoxe/urlgrab | A tool to crawl websites by exploring links recursively, with support for JavaScript rendering. | 330 |
| archiveteam/wpull | Downloads and crawls web pages, allowing for the archiving of websites. | 556 |
| puerkitobio/gocrawl | A concurrent web crawler written in Go that allows flexible and polite crawling of websites. | 2,038 |
| stewartmckee/cobweb | A flexible web crawler for extracting data from websites in a scalable and efficient manner. | 226 |
| rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can run in parallel for efficient crawling. | 340 |