wasp
Web archiver
A containerized web archive and search system using Elasticsearch
27 stars
13 watching
4 forks
Language: Java
last commit: over 2 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An application to download archives of web archiving projects | 6 |
| | Tools for indexing and discovering archived web content | 117 |
| | A high-fidelity web archiving system for storing and replaying interactive web pages in browsers | 903 |
| | A web-based archive service that allows users to store and manage web pages in various formats | 115 |
| | A toolkit for archiving and replaying web content accurately and efficiently | 1,418 |
| | A web crawler designed to efficiently collect and prioritize relevant content from the web | 459 |
| | A system for dispersing and replaying archived web content using peer-to-peer technology | 617 |
| | Provides a simple JavaScript interface to access historical web pages via the Wayback Machine | 14 |
| | A distributed compute analysis system for web archive collections | 15 |
| | A library for storing and querying web crawl data in a compact, easily shareable format | 397 |
| | A web crawler designed to back up websites by recursively crawling and writing WARC files | 1,406 |
| | Tools for bulk indexing of WARC/ARC files to create a shared URL index | 43 |
| | A distributed search and analytics engine for scalable data storage and real-time search capabilities | 71,007 |
| | A toolkit for analyzing and extracting data from legacy web archives in a structured format suitable for further analysis or reuse | 3 |
| | A simple Rails application for archiving websites | 27 |