galer

URL extractor

A tool that extracts URLs from HTML attributes by crawling pages and evaluating JavaScript.

A fast tool to fetch URLs from HTML attributes by crawling.

GitHub

255 stars

6 watching

38 forks

Language: Go

last commit: over 1 year ago

Tags: crawler, devtool, extractor, galer, go, golang, spider, url-extractor, url-parser, waybackurls

Related projects:

Repository	Description	Stars
s0rg/crawley	A utility for systematically extracting URLs from web pages and printing them to the console.	268
mvdan/xurls	A tool to extract URLs from text using regular expressions in the Go programming language.	1,193
003random/getjs	A tool to extract JavaScript sources from URLs and web pages efficiently.	732
eloopwoo/chrome-url-dumper	A tool to extract and dump URLs from Chrome's stored databases.	34
karust/gogetcrawl	A tool and package for extracting web archive data from popular sources like Wayback Machine and Common Crawl using the Go programming language.	148
coleifer/micawber	A library for extracting metadata and content from URLs.	635
davemolk/gogetjs	Tools for extracting and analyzing JavaScript files from web pages.	41
foolin/pagser	A tool for automatically extracting structured data from HTML pages.	105
iamstoxe/urlgrab	A tool to crawl websites by exploring links recursively with support for JavaScript rendering.	331
offensivedev/urldozer	A tool for analyzing URLs to extract various information such as paths, domains, and parameters.	29
archiveteam/grab-site	A web crawler designed to backup websites by recursively crawling and writing WARC files.	1,406
gamallo/galextra	A multi-language term extractor that uses morphosyntax tagging and filtering to identify multi-word terms from plain text input.	2
limiu82214/gojmapr	A library to extract specific properties from complex JSON structures into Go structs with minimal code changes.	22
plainas/tq	A tool that extracts content from HTML documents based on CSS selectors.	236
patternhelloworld/url-knife	A JavaScript library to extract and decompose URLs in texts with robust patterns, including email addresses.	341