transformer-debugger
Model debugger
An open-source tool that helps investigate specific behaviors of small language models by combining automated interpretability techniques with sparse autoencoders.
4k stars
25 watching
242 forks
Language: Python
Last commit: 9 months ago
Linked from 1 awesome list
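The description above mentions sparse autoencoders, the core technique this kind of interpretability tooling builds on. Below is a minimal illustrative sketch of one in PyTorch; it is not the repository's actual implementation, and the dimensions (`d_model=768`, `d_hidden=4096`) and the L1 penalty weight are assumptions chosen for the example.

```python
# Minimal sparse-autoencoder sketch for interpretability work.
# Illustrative only; sizes and the sparsity coefficient are assumptions.
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 768, d_hidden: int = 4096):
        super().__init__()
        # Encode model activations into a wider, sparse latent space,
        # then decode back to the original activation space.
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        # ReLU keeps latents non-negative; the L1 penalty below
        # pushes most of them to zero, yielding sparse features.
        latents = torch.relu(self.encoder(x))
        reconstruction = self.decoder(latents)
        return reconstruction, latents


sae = SparseAutoencoder()
activations = torch.randn(32, 768)  # stand-in for residual-stream activations
recon, latents = sae(activations)
# Reconstruction loss plus an L1 sparsity penalty on the latents.
loss = ((recon - activations) ** 2).mean() + 1e-3 * latents.abs().mean()
loss.backward()
```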
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An interactive tool for visualizing attention in Transformer language models. | 7,019 |
| | An interactive visualization tool to help users understand how large language models like GPT work. | 3,604 |
| | Provides code and a model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,167 |
| | A repository providing code and models for research into language modeling and multitask learning. | 22,644 |
| | A fast and efficient tokenizer for natural language models based on Byte Pair Encoding (BPE). | 12,703 |
| | An implementation of DALL-E 2's text-to-image synthesis neural network in PyTorch. | 11,184 |
| | Implementations of a neural network architecture for language modeling. | 3,619 |
| | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models in their own projects. | 136,357 |
| | Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax. | 10,620 |
| | A general-purpose speech recognition system trained on large-scale weak supervision. | 72,752 |
| | A platform for training, serving, and evaluating large language models with tool-use capabilities. | 4,888 |
| | A tool for retraining and fine-tuning the OpenAI GPT-2 text generation model on new datasets. | 3,398 |
| | An open-source project that provides tools and utilities for understanding how transformers are used in reinforcement learning tasks. | 75 |
| | A minimal PyTorch implementation of a transformer-based language model. | 20,474 |
| | Provides primitives for sparse attention mechanisms used in transformer models to improve computational efficiency and scalability. | 1,533 |