jwarc

WARC library

A Java library for reading and writing WARC files with a typed API

GitHub

48 stars
6 watching
10 forks
Language: Java
Last commit: about 2 months ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
iipc/warc2html | Converts WARC files to static HTML with relative link rewriting and renaming | 41
nla/httrack2warc | Converts HTTrack crawls to WARC files by reconstructing requests and responses from logs | 32
internetarchive/warctools | Tools for working with archived web content | 153
webrecorder/warcio | A fast streaming library for working with WARC format web archival data | 391
helgeho/warcpartitioner | Tool for partitioning and merging Web archive files by MIME type and year | 1
nlnwa/gowarcserver | A tool for indexing and serving contents of WARC files | 15
n0tan3rd/node-warc | A tool for parsing and generating Web Archive files in JavaScript using Node.js | 95
unt-libraries/py-wasapi-client | Downloads WARC files from a WASAPI access point | 15
ikreymer/webarchive-indexing | Tools for bulk indexing of WARC/ARC files to create a shared url index | 43
chfoo/warcat | Tool for handling Web Archive files | 152
webrecorder/har2warc | Converts HTTP Archive format to Web Archive format | 48
owlcs/owlapi | An API for working with OWL ontologies in Java | 832
steffenfritz/html2warc | Converts offline data into a standard archival format | 18
uakihir0/jtw | A Java library providing a simple API to interact with the Twitter v2 API | 7
warhub/wham | A CLI tool and library for managing wargame data files, converting formats between different systems | 21