wasapi-downloader
Web archiver
An application to download archives of web archiving projects
Java application to download WARCs from WASAPI
6 stars
22 watching
4 forks
Language: Java
Last commit: 10 days ago
Linked from 1 awesome list
application infrastructure java
Related projects:
Repository | Description | Stars |
---|---|---|
webis-de/wasp | A containerized web archive and search system using Elasticsearch | 26 |
unt-libraries/py-wasapi-client | Downloads WARC files from a WASAPI access point. | 14 |
wabarc/cairn | A tool for archiving web pages as single HTML files | 43 |
ukwa/webarchive-discovery | Tools for indexing and discovering archived web content | 116 |
turicas/crau | A command-line tool for archiving and playing back websites in WARC format | 57 |
helgeho/warcpartitioner | Tool for partitioning and merging Web archive files by MIME type and year | 1 |
richardlehane/webarchive | Provides tools for reading and parsing web archive formats used in digital preservation. | 20 |
machawk1/wail | A graphical user interface layer for preserving and replaying web pages using multiple archiving tools. | 350 |
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files. | 1,398 |
nla/httrack2warc | Converts HTTrack crawls to WARC files by reconstructing requests and responses from logs | 30 |
containerd/runwasi | Facilitates running WebAssembly workloads managed by containerd | 1,093 |
peterk/warcworker | A web archiving tool that archives websites with high-fidelity preservation capabilities. | 55 |
netarchivesuite/solrwayback | A web-based search interface and Wayback machine for browsing archived web pages using an index of WARC files. | 102 |
internetarchive/warcprox | An HTTP proxy designed to capture and archive web traffic, including encrypted HTTPS connections. | 381 |
webrecorder/archiveweb.page | A high-fidelity web archiving system for storing and replaying interactive web pages in browsers. | 862 |