webarchive
Web archive parser
Provides tools for reading and parsing web archive formats used in digital preservation.
Go readers for the ARC and WARC web archive formats
20 stars
7 watching
2 forks
Language: Go
Last commit: over 1 year ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
ukwa/webarchive-discovery | Tools for indexing and discovering archived web content | 116 |
webrecorder/archiveweb.page | A high-fidelity web archiving system for storing and replaying interactive web pages in browsers. | 862 |
derfenix/webarchive | A web-based archive service that allows users to store and manage web pages in various formats. | 112 |
helgeho/warcpartitioner | Tool for partitioning and merging Web archive files by MIME type and year | 1 |
peterk/warcworker | A web archiving tool that archives websites with high-fidelity preservation capabilities. | 55 |
go-shiori/obelisk | Archives a web page as a single HTML file with embedded resources. | 263 |
turicas/crau | A command-line tool for archiving and playing back websites in WARC format | 57 |
n0tan3rd/node-warc | A tool for parsing and generating Web Archive files in JavaScript using Node.js | 94 |
wabarc/rivet | A tool for archiving webpages to IPFS | 12 |
internetarchive/warctools | Tools for working with archived web content | 152 |
ikreymer/webarchive-indexing | Tools for bulk indexing of WARC/ARC files to create a shared URL index | 42 |
wabarc/wayback | A tool for capturing and preserving web content and making it accessible in the future. | 1,818 |
machawk1/wail | A graphical user interface layer for preserving and replaying web pages using multiple archiving tools. | 350 |
jarofghosts/memento-client | Provides a simple JavaScript interface to access historical web pages via the Wayback Machine | 14 |
webrecorder/har2warc | Converts HTTP Archive format to Web Archive format | 46 |