ArchiveTools

Data extractor

A collection of tools for extracting and analyzing data from web archives.

A collection of tools for archiving and analysing the internet.

GitHub

70 stars
6 watching
15 forks
Language: Python
Last commit: over 2 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| raulfraile/distill | A tool that extracts files from compressed archives using various methods and strategies to optimize bandwidth or decompression speed. | 224 |
| karust/gogetcrawl | A tool and package for extracting web archive data from popular sources like Wayback Machine and Common Crawl using the Go programming language. | 149 |
| chatnoir-eu/chatnoir-resiliparse | A toolkit for processing and analyzing web archive data. | 85 |
| rmendels/rerddapxtracto | A package for accessing and extracting environmental data from remote ERDDAP servers. | 14 |
| anonyfox/elixir-scrape | A tool for extracting structured data from web resources using information-retrieval techniques. | 328 |
| pxyup/fitter | A utility for extracting and processing data from various sources, including APIs, websites, and static text. | 120 |
| eset-la/lord-of-the-strings | A tool to extract and classify relevant strings from binary files. | 9 |
| oduwsdl/archivenow | A tool to automate archiving of web resources into public archives. | 409 |
| eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text. | 1,636 |
| le0me55i/zsh-extract | A plugin that automates the extraction of archive files from various formats. | 19 |
| keydet89/regripper3.0 | A tool designed to extract and analyze data from Windows registry files. | 560 |
| jiiks/asar.net | A .NET implementation of the Atom Asar archive format, allowing extraction and manipulation of archived files. | 36 |
| deviantech/rack-referrals | Extracts information about referring search engines from HTTP requests. | 17 |
| thetic/extract | A plugin that allows users to extract files from various archive formats without specifying the extraction command. | 9 |
| pbiecek/archivist | A tool for managing and archiving data analysis results in R. | 74 |