wasapi-downloader
Web archiver
An application to download archives of web archiving projects
Java application to download WARCs from WASAPI
6 stars
22 watching
4 forks
Language: Java
Last commit: 2 months ago
Linked from 1 awesome list
Tags: application, infrastructure, java
Related projects:
Repository | Description | Stars |
---|---|---|
webis-de/wasp | A containerized web archive and search system using Elasticsearch | 27 |
unt-libraries/py-wasapi-client | Downloads WARC files from a WASAPI access point. | 15 |
wabarc/cairn | A tool for archiving web pages as single HTML files | 45 |
ukwa/webarchive-discovery | Tools for indexing and discovering archived web content | 117 |
turicas/crau | A command-line tool for archiving and playing back websites in WARC format | 59 |
helgeho/warcpartitioner | Tool for partitioning and merging Web archive files by MIME type and year | 1 |
richardlehane/webarchive | Provides tools for reading and parsing web archive formats used in digital preservation. | 20 |
machawk1/wail | A graphical user interface layer for preserving and replaying web pages using multiple archiving tools. | 353 |
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files. | 1,406 |
nla/httrack2warc | Converts HTTrack crawls to WARC files by reconstructing requests and responses from logs | 32 |
containerd/runwasi | Facilitates running WebAssembly workloads on a container runtime | 1,114 |
peterk/warcworker | A web archiving tool that archives websites with high-fidelity preservation capabilities. | 57 |
netarchivesuite/solrwayback | A search interface and archival tool for browsing historical web pages | 102 |
internetarchive/warcprox | An HTTP proxy designed to capture and archive web traffic, including encrypted HTTPS connections. | 389 |
webrecorder/archiveweb.page | A high-fidelity web archiving system for storing and replaying interactive web pages in browsers. | 903 |