ArchiveBox
Preservation tool
Automated preservation of internet content in durable formats
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
22k stars
174 watching
1k forks
Language: Python
Last commit: 5 days ago
Linked from 1 awesome list
archivebox, backups, bookmark-archiver, browser-bookmarks, chromium, digipres, firefox, headless-browser, internet-archiving, pinboard, pocket, python, rss, self-hosted, singlefile, warc, wayback-machine, web-archiving, wget, youtube-dl
Related projects:
| Repository | Description | Stars |
|---|---|---|
| browserbox/browserbox | A browser that runs on a remote server and provides isolated access to web content for security, compliance, and other purposes. | 3,454 |
| bellingcat/auto-archiver | Automates archiving of online content from various sources into local storage or cloud services. | 570 |
| oduwsdl/archivenow | A tool to automate archiving of web resources into public archives. | 410 |
| go-shiori/obelisk | Archives a web page as a single HTML file with embedded resources. | 263 |
| stevepolitodesign/my_site_archive | A simple Rails application for archiving websites. | 27 |
| machawk1/wail | A graphical user interface layer for preserving and replaying web pages using multiple archiving tools. | 350 |
| wabarc/wayback | A tool for capturing and preserving web content and making it accessible in the future. | 1,811 |
| tubearchivist/tubearchivist | A tool to organize and search archived YouTube videos. | 5,246 |
| googlechrome/workbox | A suite of tools and strategies for efficiently caching and serving web assets. | 12,366 |
| peterk/warcworker | A web archiving tool that archives websites with high-fidelity preservation capabilities. | 55 |
| jjjake/internetarchive | A command-line and Python interface to access Archive.org's services. | 1,625 |
| mholt/archiver | A multi-format archive utility and Go library that provides a generic replacement for platform-specific or format-specific archive utilities. | 4,442 |
| kovah/linkace | A tool to collect and manage links to websites and other online resources for long-term archiving. | 2,643 |
| derfenix/webarchive | A web-based archive service that allows users to store and manage web pages in various formats. | 112 |
| apache/pdfbox | A Java library for working with PDF documents. | 2,675 |