warcworker
Web archiver
A tool that archives websites with high fidelity.
A Dockerized, queued, high-fidelity web archiver based on Squidwarc
57 stars
6 watching
9 forks
Language: Python
Last commit: 5 months ago
Linked from 1 awesome list
Tags: archiving, high-fidelity-preservation, preservation, webarchives, webarchiving
Related projects:
Repository | Description | Stars |
---|---|---|
webrecorder/pywb | A toolkit for archiving and replaying web content accurately and efficiently | 1,418 |
n0tan3rd/squidwarc | An archival crawler built on top of Chrome or Chromium to preserve the web in a high-fidelity, user-scriptable manner | 170 |
webrecorder/archiveweb.page | A high-fidelity web archiving system for storing and replaying interactive web pages in browsers. | 903 |
internetarchive/warcprox | An HTTP proxy designed to capture and archive web traffic, including encrypted HTTPS connections. | 389 |
turicas/crau | A command-line tool for archiving and playing back websites in WARC format | 59 |
machawk1/wail | A graphical user interface layer for preserving and replaying web pages using multiple archiving tools. | 353 |
wabarc/wayback | A tool for capturing and preserving web content and making it accessible in the future. | 1,839 |
archiveteam/grab-site | A web crawler designed to backup websites by recursively crawling and writing WARC files. | 1,406 |
wabarc/cairn | A tool for archiving web pages as single HTML files | 45 |
oduwsdl/archivenow | A tool to automate archiving of web resources into public archives. | 409 |
webrecorder/har2warc | Converts HTTP Archive format to Web Archive format | 48 |
oduwsdl/ipwb | A system for dispersing and replaying archived web content using peer-to-peer technology. | 617 |
internetarchive/warctools | Tools for working with archived web content | 153 |
richardlehane/webarchive | Provides tools for reading and parsing web archive formats used in digital preservation. | 20 |
webrecorder/warcio | A fast streaming library for working with WARC format web archival data | 391 |