transformer-debugger

Model debugger

An open-source tool that helps investigate specific behaviors of small language models by combining automated interpretability techniques with sparse autoencoders.

GitHub

4k stars
25 watching
235 forks
Language: Python
Last commit: 6 months ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| jessevig/bertviz | An interactive tool for visualizing attention in Transformer language models. | 6,946 |
| poloclub/transformer-explainer | An interactive visualization tool to help users understand how large language models like GPT work. | 3,347 |
| openai/finetune-transformer-lm | This project provides code and model for improving language understanding through generative pre-training using a transformer-based architecture. | 2,160 |
| openai/gpt-2 | A repository providing code and models for research into language modeling and multitask learning. | 22,516 |
| openai/tiktoken | A fast and efficient tokeniser for natural language models based on Byte Pair Encoding (BPE). | 12,420 |
| lucidrains/dalle2-pytorch | An implementation of DALL-E 2's text-to-image synthesis neural network in PyTorch. | 11,148 |
| kimiyoung/transformer-xl | Implementations of a neural network architecture for language modeling. | 3,611 |
| huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 135,022 |
| google-research/vision_transformer | Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax. | 10,450 |
| openai/whisper | A general-purpose speech recognition system trained on large-scale weak supervision. | 71,257 |
| openbmb/toolbench | A platform for training, serving, and evaluating large language models to enable tool use capability. | 4,843 |
| minimaxir/gpt-2-simple | A tool for retraining and fine-tuning the OpenAI GPT-2 text generation model on new datasets. | 3,397 |
| jbloomaus/decisiontransformerinterpretability | An open-source project that provides tools and utilities to understand how transformers are used in reinforcement learning tasks. | 73 |
| karpathy/mingpt | A minimal PyTorch implementation of a transformer-based language model. | 20,175 |
| openai/sparse_attention | Provides primitives for sparse attention mechanisms used in transformer models to improve computational efficiency and scalability. | 1,524 |