transformer-debugger
Model debugger
An open-source tool that helps investigate specific behaviors of small language models by combining automated interpretability techniques with sparse autoencoders.
4k stars
25 watching
235 forks
Language: Python
last commit: 6 months ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| jessevig/bertviz | An interactive tool for visualizing attention in Transformer language models | 6,946 |
| poloclub/transformer-explainer | An interactive visualization tool that helps users understand how large language models like GPT work | 3,347 |
| openai/finetune-transformer-lm | Code and a model for improving language understanding through generative pre-training with a transformer-based architecture | 2,160 |
| openai/gpt-2 | Code and models for research into language modeling and multitask learning | 22,516 |
| openai/tiktoken | A fast and efficient tokeniser for natural language models based on Byte Pair Encoding (BPE) | 12,420 |
| lucidrains/dalle2-pytorch | A PyTorch implementation of DALL-E 2's text-to-image synthesis neural network | 11,148 |
| kimiyoung/transformer-xl | Implementations of a neural network architecture for language modeling | 3,611 |
| huggingface/transformers | A collection of pre-trained models for natural language and computer vision tasks that developers can fine-tune and deploy in their own projects | 135,022 |
| google-research/vision_transformer | Pre-trained models and code for training vision transformers and mixers in JAX/Flax | 10,450 |
| openai/whisper | A general-purpose speech recognition system trained with large-scale weak supervision | 71,257 |
| openbmb/toolbench | A platform for training, serving, and evaluating large language models to enable tool-use capabilities | 4,843 |
| minimaxir/gpt-2-simple | A tool for retraining and fine-tuning OpenAI's GPT-2 text-generation model on new datasets | 3,397 |
| jbloomaus/decisiontransformerinterpretability | An open-source project providing tools and utilities for understanding how transformers are used in reinforcement learning tasks | 73 |
| karpathy/mingpt | A minimal PyTorch implementation of a transformer-based language model | 20,175 |
| openai/sparse_attention | Primitives for sparse attention mechanisms in transformer models, improving computational efficiency and scalability | 1,524 |
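The tiktoken entry above mentions Byte Pair Encoding (BPE). As an illustrative sketch only (not tiktoken's actual implementation, which works on bytes with precomputed merge ranks), one BPE training-style merge step can be written with just the standard library: count adjacent token pairs, then fuse the most frequent pair everywhere it occurs.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = list("banana")                  # ['b', 'a', 'n', 'a', 'n', 'a']
pair = most_frequent_pair(tokens)        # ('a', 'n') occurs twice
print(merge_pair(tokens, pair))          # → ['b', 'an', 'an', 'a']
```

Repeating these two steps until a target vocabulary size is reached yields the merge table that a BPE tokeniser later replays to encode text.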