TransformerLens

Model decipherer

A library for mechanistic interpretability of GPT-style language models: reverse engineering the algorithms that large language models have learned from their weights

GitHub

2k stars
17 watching
304 forks
Language: Python
Last commit: 17 days ago

Related projects:

Repository | Description | Stars
matlab-deep-learning/transformer-models | An implementation of deep learning transformer models in MATLAB | 206
lucidrains/reformer-pytorch | An implementation of Reformer, an efficient Transformer model for natural language processing tasks | 2,120
jlevy/repren | A tool for refactoring and transforming text files according to regular expression patterns | 347
openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer-based architecture | 2,160
chrislemke/sk-transformers | A collection of reusable data transformation tools | 8
leviswind/pytorch-transformer | An implementation of a transformer-based translation model in PyTorch | 239
feature-engine/feature_engine | A Python library with multiple transformers to engineer and select features for use in machine learning models | 1,926
fastnlp/cpt | A pre-trained transformer model for natural language understanding and generation tasks in Chinese | 481
marella/ctransformers | A unified interface to various transformer models implemented in C/C++ using the GGML library | 1,814
nlgranger/seqtools | A Python library to manipulate and transform indexable data | 48
pylons/colander | A library for serializing and deserializing data structures into strings, mappings, and lists while performing validation | 451
jbloomaus/decisiontransformerinterpretability | An open-source project that provides tools and utilities to understand how transformers are used in reinforcement learning tasks | 73
microsoft/megatron-deepspeed | A research tool for training large transformer language models at scale | 1,895
bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,335
neukg/techgpt | A generative transformer model designed to process and generate text in various vertical domains, including computer science and finance | 212