LM_Memorization

Content Extractor

A tool for extracting memorized training data from large language models such as GPT-2 by sampling the model's output and ranking the samples by how likely they are to be memorized

Training data extraction on GPT-2
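As a rough illustration of how sampled text can be ranked for likely memorization, the sketch below scores each sample by the ratio of its zlib-compressed size to the model's log-perplexity on it. This is one heuristic of this kind, not necessarily the repository's exact code. The function names, the sample strings, and the log-perplexity values are all hypothetical; in practice the log-perplexity would come from evaluating GPT-2 on each sample.

```python
import zlib

def zlib_entropy(text: str) -> int:
    """Size in bytes of the zlib-compressed text, a cheap proxy for how much information it carries."""
    return len(zlib.compress(text.encode("utf-8")))

def rank_by_zlib_ratio(samples):
    """Sort (text, log_perplexity) pairs so the most suspicious samples come first.

    A high ratio means the text is hard to compress (not merely repetitive)
    yet the language model still finds it very predictable.
    """
    return sorted(
        samples,
        key=lambda pair: zlib_entropy(pair[0]) / pair[1],
        reverse=True,
    )

# Hypothetical samples; the log-perplexity values are made up for illustration.
samples = [
    ("the the the the the the the the the the", 1.0),   # predictable but trivially repetitive
    ("Contact: J. Doe, 42 Elm St., Springfield", 1.2),  # predictable and information-dense
    ("purple vacuum seldom argues with gravel", 6.0),   # information-dense but surprising to the model
]

ranked = rank_by_zlib_ratio(samples)
for text, log_ppl in ranked:
    print(f"{zlib_entropy(text) / log_ppl:6.1f}  {text}")
```

Dividing by log-perplexity pushes down text the model finds surprising, while the compressed size pushes down repetitive filler that any model predicts easily. Samples that survive both filters are the ones worth checking against the training data by hand.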

179 stars

7 watching

33 forks

Language: Python

Last commit: over 3 years ago

Related projects:

Repository	Description	Stars
ftramer/steal-ml	A tool for extracting machine learning models from cloud-based services using prediction APIs	344
eyurtsev/kor	An open-source wrapper around LLMs to extract structured data from text	1,638
recrm/archivetools	A collection of tools for extracting and analyzing data from web archives	71
iamgroot42/mimir	A Python package for measuring memorization in Large Language Models.	126
ir193/amextractor	A tool to extract physical memory from Android devices without kernel source code or LKM support.	12
kost/memdump	A tool to extract and display the contents of a system's physical memory	12
eset-la/lord-of-the-strings	A tool to extract and classify relevant strings from binary files	9
cognesy/instructor-php	A PHP library that simplifies the integration of Large Language Models into applications by providing structured data extraction and validation.	230
halpomeranz/lmg	Tools and scripts for capturing and analyzing Linux memory	266
gamallo/galextra	A multi-language term extractor that uses morphosyntax tagging and filtering to identify multi-word terms from plain text input.	2
bfelbo/deepmoji	A deep learning model for analyzing sentiment and emotion in text based on emojis.	1,525
os6sense/defmemo	A macro that memoizes the results of functions with identical signatures	33
monarch-initiative/ontogpt	An LLM-based tool for extracting structured information from text with ontology-based grounding.	626
knowledgecaptureanddiscovery/somef	A tool that automatically extracts relevant metadata from code repositories, including software descriptions and bibliographic citations.	47
yomurb/yomu	A Ruby library for extracting text and metadata from various file formats.	498