jwat

Web archive analyzer

A toolkit for analyzing legacy web archives and extracting their data into a structured format suitable for further analysis or reuse.

Java Web Archive Toolkit

GitHub

3 stars
8 watching
2 forks
Language: Java
Last commit: 12 months ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
netarchivesuite/jwat-tools | An extension of utility libraries with command-line tools for archiving and compression tasks. | 5
ukwa/webarchive-discovery | Tools for indexing and discovering archived web content. | 116
archivesunleashed/aut | An open-source toolkit for analyzing web archives using Apache Spark. | 137
richardlehane/webarchive | Provides tools for reading and parsing web archive formats used in digital preservation. | 20
internetarchive/arch | A distributed compute analysis system for web archive collections. | 15
peterk/warcworker | A web archiving tool that archives websites with high-fidelity preservation capabilities. | 55
webis-de/wasp | A containerized web archive and search system using Elasticsearch. | 26
jarofghosts/memento-client | Provides a simple JavaScript interface to access historical web pages via the Wayback Machine. | 14
chatnoir-eu/chatnoir-resiliparse | A toolkit for processing and analyzing web archive data. | 84
netarchivesuite/solrwayback | A web-based search interface and Wayback machine for browsing archived web pages using an index of WARC files. | 102
jameshabben/evolve | A web interface for analyzing memory dumps using the Volatility framework, providing an interactive and collaborative environment for forensic analysis. | 259
jjjake/internetarchive | A command-line and Python interface to access Archive.org's services. | 1,625
wabarc/cairn | A tool for archiving web pages as single HTML files. | 43
machawk1/wail | A graphical user interface layer for preserving and replaying web pages using multiple archiving tools. | 350
webrecorder/pywb | A toolkit for archiving and replaying web content accurately and efficiently. | 1,407