grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
1k stars
40 watching
134 forks
Language: Python
last commit: 3 months ago
Linked from 1 awesome list
archiving, crawl, crawler, spider, warc