solrwayback

Web archiver

A search interface and archival tool for browsing historical web pages

A search interface and wayback machine for the UKWA Solr-based warc-indexer framework.

GitHub

102 stars
24 watching
21 forks
Language: Java
Last commit: about 1 month ago
Linked from 1 awesome list



Related projects:

Repository | Description | Stars
wabarc/wayback | A tool for capturing and preserving web content and making it accessible in the future | 1,839
ukwa/webarchive-discovery | Tools for indexing and discovering archived web content | 117
akamhy/waybackpy | An API interface and command-line tool for interacting with the Wayback Machine's web archiving service | 489
jarofghosts/memento-client | Provides a simple JavaScript interface to access historical web pages via the Wayback Machine | 14
oduwsdl/ipwb | A system for dispersing and replaying archived web content using peer-to-peer technology | 617
wabarc/playback | Replays archived web pages from the Wayback Machine | 8
nla/outbackcdx | A RocksDB-based server for managing and replicating capture indexes used in web archiving | 33
iipc/openwayback | A Java-based tool for recording and replaying web pages from archives | 487
wabarc/cairn | A tool for archiving web pages as single HTML files | 45
ukwa/shine | A web archive exploration UI built on top of the Solr search engine and warc-discovery indexer | 43
machawk1/wail | A graphical user interface layer for preserving and replaying web pages using multiple archiving tools | 353
p3gleg/pwnback | Generates a sitemap of a website using the Wayback Machine | 225
netarchivesuite/jwat | A toolkit for analyzing and extracting data from legacy web archives in a structured format suitable for further analysis or reuse | 3
ikreymer/webarchive-indexing | Tools for bulk indexing of WARC/ARC files to create a shared URL index | 43
richardlehane/webarchive | Provides tools for reading and parsing web archive formats used in digital preservation | 20