bulk_extractor

Data extractor

Extracts structured information from digital data without parsing file systems
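
To illustrate what "without parsing file systems" means in practice, here is a minimal C++ sketch that treats its input as an opaque run of bytes and reports every e-mail-like string together with its byte offset. It is not taken from bulk_extractor's source: the `Feature` struct, the `scan_emails` function, and the simplified pattern are all hypothetical names and choices made for this example only.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical record type for this sketch: where a feature was found and what it was.
struct Feature {
    std::size_t offset;  // byte offset into the raw buffer
    std::string text;    // the matched bytes
};

// Scan a raw byte buffer for e-mail-like strings. The buffer is treated as an
// opaque stream of bytes: no partition table, file system, or file boundary
// is consulted.
std::vector<Feature> scan_emails(const std::string& raw) {
    // Deliberately simplified pattern (an assumption of this sketch);
    // real-world address matching is more involved.
    static const std::regex pattern(
        R"([A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+)");

    std::vector<Feature> found;
    for (std::sregex_iterator it(raw.begin(), raw.end(), pattern), end;
         it != end; ++it) {
        found.push_back({static_cast<std::size_t>(it->position()), it->str()});
    }
    return found;
}
```

Because the scan works on offsets into the raw data rather than on files, the same approach applies equally to a disk image, a memory dump, or a fragment of unallocated space.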

This is the development tree. Production downloads are at:

GitHub

1k stars
76 watching
189 forks
Language: C++
Last commit: about 2 months ago
Linked from 4 awesome lists


Related projects:

Repository | Description | Stars
eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text | 1,638
sblom/regextract | A tool that enables easy and efficient data extraction from text using regular expressions in C# | 698
cmu-sei/cyobstract | Extracts structured cyber information from incident reports | 79
bromiumlabs/packerattacker | An application designed to detect and extract hidden code from malicious Windows executables | 270
idea-fasoc/datasheet-scrubber | Automates extraction of key circuit information from PDF datasheets/documents to build a database of commercial off-the-shelf IP | 51
suse/clang-extract | A tool to extract code content from source files using the clang and LLVM infrastructure | 17
siguza/imobax | Extracts and processes iOS mobile backups | 182
gskril/farcaster-indexer | An indexer tool for extracting data from the Farcaster protocol and storing it in a Postgres database | 152
51j0/android-storage-extractor | A tool to extract local data storage of an Android application in one click | 16
nissl-lab/toxy | A .NET framework for extracting text from various document formats across multiple platforms | 362
syntax-tree/hast-util-to-text | Utility function to extract plain text from HTML-like data structures | 19
anssi-fr/bits_parser | Extracts and stores BITS job data from QMGR queues as CSV records | 74
fox-it/dissect.target | Provides a programming API and command line tools to access various data sources inside disk images or file collections | 48
egorbo/simdjsonsharp | Library for fast JSON parsing and minification using SIMD instructions | 651
recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 71