node-warc
WARC parser
A tool for parsing and generating Web Archive files in JavaScript using Node.js
Parse and create Web ARChive (WARC) files with Node.js
95 stars
9 watching
21 forks
Language: JavaScript
Last commit: almost 3 years ago
Linked from 1 awesome list
chrome-remote-interface, pupeteer, warc, warc-files, web-archives, web-archiving, webarchive, webarchiving
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Tools for working with archived web content | 153 |
| | Converts HTTrack crawls to WARC files by reconstructing requests and responses from logs | 32 |
| | A fast streaming library for working with WARC format web archival data | 391 |
| | Tool for partitioning and merging Web archive files by MIME type and year | 1 |
| | Provides tools for reading and parsing web archive formats used in digital preservation | 20 |
| | A web archiving tool that archives websites with high-fidelity preservation capabilities | 57 |
| | A web crawler designed to back up websites by recursively crawling and writing WARC files | 1,406 |
| | A Node.js library for parsing CDXJ files produced by Pywb | 0 |
| | Converts HTTP Archive format to Web Archive format | 48 |
| | A command-line tool for archiving and playing back websites in WARC format | 59 |
| | Tool for handling Web Archive files | 152 |
| | An archival crawler built on top of Chrome or Chromium to preserve the web in a high-fidelity, user-scriptable manner | 170 |
| | Tools for bulk indexing of WARC/ARC files to create a shared URL index | 43 |
| | A tool for archiving web pages as single HTML files | 45 |
| | A tool for indexing and serving contents of WARC files | 15 |