LM_Memorization
Content Extractor
A tool for extracting memorized training data from large language models such as GPT-2
Training data extraction on GPT-2
179 stars
7 watching
33 forks
Language: Python
last commit: about 2 years ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A tool for extracting machine learning models from cloud-based services using prediction APIs | 344 |
| | An open-source wrapper around LLMs to extract structured data from text | 1,638 |
| | A collection of tools for extracting and analyzing data from web archives | 71 |
| | A Python package for measuring memorization in Large Language Models | 126 |
| | A tool to extract physical memory from Android devices without kernel source code or LKM support | 12 |
| | A tool to extract and display the contents of a system's physical memory | 12 |
| | A tool to extract and classify relevant strings from binary files | 9 |
| | A PHP library that simplifies the integration of Large Language Models into applications by providing structured data extraction and validation | 230 |
| | Tools and scripts for capturing and analyzing Linux memory | 266 |
| | A multi-language term extractor that uses morphosyntax tagging and filtering to identify multi-word terms from plain text input | 2 |
| | A deep learning model for analyzing sentiment and emotion in text based on emojis | 1,525 |
| | A macro that memoizes the results of functions with identical signatures | 33 |
| | An LLM-based tool for extracting structured information from text with ontology-based grounding | 626 |
| | A tool that automatically extracts relevant metadata from code repositories, including software descriptions and bibliographic citations | 47 |
| | A Ruby library for extracting text and metadata from various file formats | 498 |