wasp
Web archiver
A containerized web archive and search system using Elasticsearch
26 stars
13 watching
4 forks
Language: Java
Last commit: about 2 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
sul-dlss/wasapi-downloader | An application to download archives of web archiving projects | 6 |
ukwa/webarchive-discovery | Tools for indexing and discovering archived web content | 116 |
webrecorder/archiveweb.page | A high-fidelity web archiving system for storing and replaying interactive web pages in browsers | 862 |
derfenix/webarchive | A web-based archive service that allows users to store and manage web pages in various formats | 112 |
webrecorder/pywb | A toolkit for archiving and replaying web content accurately and efficiently | 1,407 |
vida-nyu/ache | A web crawler designed to efficiently collect and prioritize relevant content from the web | 454 |
oduwsdl/ipwb | A system for dispersing and replaying archived web content using peer-to-peer technology | 617 |
jarofghosts/memento-client | Provides a simple JavaScript interface to access historical web pages via the Wayback Machine | 14 |
internetarchive/arch | A distributed compute analysis system for web archive collections | 15 |
florents-tselai/warcdb | A library for storing and querying web crawl data in a compact, easily sharable format | 394 |
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files | 1,402 |
ikreymer/webarchive-indexing | Tools for bulk indexing of WARC/ARC files to create a shared url index | 42 |
elastic/elasticsearch | A distributed search and analytics engine for scalable data storage and real-time search capabilities | 1,332 |
netarchivesuite/jwat | A toolkit for analyzing and extracting data from legacy web archives in a structured format suitable for further analysis or reuse | 3 |
stevepolitodesign/my_site_archive | A simple Rails application for archiving websites | 27 |