LM_Memorization
Content Extractor
A tool to extract memorized training data from large language models like GPT-2
Training data extraction on GPT-2
179 stars
7 watching
33 forks
Language: Python
last commit: almost 2 years ago
Related projects:
Repository | Description | Stars |
---|---|---|
ftramer/steal-ml | A tool for extracting machine learning models from cloud-based services using prediction APIs | 344 |
eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text | 1,638 |
recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 71 |
iamgroot42/mimir | A Python package for measuring memorization in Large Language Models. | 126 |
ir193/amextractor | A tool to extract physical memory from Android devices without kernel source code or LKM support. | 12 |
kost/memdump | A tool to extract and display the contents of a system's physical memory | 12 |
eset-la/lord-of-the-strings | A tool to extract and classify relevant strings from binary files | 9 |
cognesy/instructor-php | A PHP library that simplifies the integration of Large Language Models into applications by providing structured data extraction and validation. | 230 |
halpomeranz/lmg | Tools and scripts for capturing and analyzing Linux memory | 266 |
gamallo/galextra | A multi-language term extractor that uses morphosyntax tagging and filtering to identify multi-word terms from plain text input. | 2 |
bfelbo/deepmoji | A deep learning model for analyzing sentiment and emotion in text based on emojis. | 1,525 |
os6sense/defmemo | A macro that memoizes the results of functions with identical signatures | 33 |
monarch-initiative/ontogpt | An LLM-based tool for extracting structured information from text with ontology-based grounding. | 626 |
knowledgecaptureanddiscovery/somef | A tool that automatically extracts relevant metadata from code repositories, including software descriptions and bibliographic citations. | 47 |
yomurb/yomu | A Ruby library for extracting text and metadata from various file formats. | 498 |