jwarc
WARC library
A Java library for reading and writing WARC files with a typed API
48 stars
6 watching
10 forks
Language: Java
Last commit: about 2 months ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
iipc/warc2html | Converts WARC files to static HTML with relative link rewriting and renaming | 41 |
nla/httrack2warc | Converts HTTrack crawls to WARC files by reconstructing requests and responses from logs | 32 |
internetarchive/warctools | Tools for working with archived web content | 153 |
webrecorder/warcio | A fast streaming library for working with WARC format web archival data | 391 |
helgeho/warcpartitioner | Tool for partitioning and merging Web archive files by MIME type and year | 1 |
nlnwa/gowarcserver | A tool for indexing and serving contents of WARC files | 15 |
n0tan3rd/node-warc | A tool for parsing and generating Web Archive files in JavaScript using Node.js | 95 |
unt-libraries/py-wasapi-client | Downloads WARC files from a WASAPI access point | 15 |
ikreymer/webarchive-indexing | Tools for bulk indexing of WARC/ARC files to create a shared url index | 43 |
chfoo/warcat | Tool for handling Web Archive files | 152 |
webrecorder/har2warc | Converts HTTP Archive format to Web Archive format | 48 |
owlcs/owlapi | An API for working with OWL ontologies in Java | 832 |
steffenfritz/html2warc | Converts offline data into a standard archival format | 18 |
uakihir0/jtw | A Java library providing a simple API to interact with the Twitter v2 API | 7 |
warhub/wham | A CLI tool and library for managing wargame data files and converting them between the formats of different systems | 21 |