galer
URL extractor
A fast tool to fetch URLs from HTML attributes by crawling pages and evaluating their JavaScript.
255 stars
6 watching
38 forks
Language: Go
Last commit: 13 days ago
Tags: crawler, devtool, extractor, galer, go, golang, spider, url-extractor, url-parser, waybackurls
Related projects:
Repository | Description | Stars |
---|---|---|
s0rg/crawley | A utility for systematically extracting URLs from web pages and printing them to the console. | 268 |
mvdan/xurls | A tool to extract URLs from text using regular expressions in the Go programming language. | 1,193 |
003random/getjs | A tool to extract JavaScript sources from URLs and web pages efficiently. | 732 |
eloopwoo/chrome-url-dumper | A tool to extract and dump URLs from Chrome's stored databases. | 34 |
karust/gogetcrawl | A tool and package for extracting web archive data from popular sources like Wayback Machine and Common Crawl using the Go programming language. | 148 |
coleifer/micawber | A library for extracting metadata and content from URLs. | 635 |
davemolk/gogetjs | Tools for extracting and analyzing JavaScript files from web pages. | 41 |
foolin/pagser | A tool for automatically extracting structured data from HTML pages. | 105 |
iamstoxe/urlgrab | A tool to crawl websites by exploring links recursively with support for JavaScript rendering. | 331 |
offensivedev/urldozer | A tool for analyzing URLs to extract various information such as paths, domains, and parameters. | 29 |
archiveteam/grab-site | A web crawler designed to backup websites by recursively crawling and writing WARC files. | 1,406 |
gamallo/galextra | A multi-language term extractor that uses morphosyntax tagging and filtering to identify multi-word terms from plain text input. | 2 |
limiu82214/gojmapr | A library to extract specific properties from complex JSON structures into Go structs with minimal code changes. | 22 |
plainas/tq | A tool that extracts content from HTML documents based on CSS selectors. | 236 |
patternhelloworld/url-knife | A JavaScript library to extract and decompose URLs in texts with robust patterns, including email addresses. | 341 |