# pythia

Analyzing knowledge development and evolution in large language models during training: the hub for EleutherAI's work on interpretability and learning dynamics.
2k stars · 33 watching · 173 forks · Language: Jupyter Notebook · Last commit: about 1 month ago
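Because Pythia is about tracking how knowledge emerges *during* training, each model is published with its intermediate training checkpoints as Git revisions on the Hugging Face Hub. A minimal sketch of enumerating those revisions and loading one; the model name (`EleutherAI/pythia-70m`), the `stepN` revision scheme, and the checkpoint schedule (step 0, log2-spaced steps 1 to 512, then every 1,000 steps up to 143,000) follow the upstream Pythia README, but treat the exact counts as an assumption, and note that `load_checkpoint` is a hypothetical helper, not part of the repo:

```python
def pythia_revisions():
    # Checkpoint revisions published per model, per the Pythia README:
    # step 0, log2-spaced steps 1..512, then every 1,000 steps to 143,000.
    steps = [0] + [2 ** i for i in range(10)] + list(range(1000, 143001, 1000))
    return [f"step{s}" for s in steps]


def load_checkpoint(revision="step3000"):
    # Hypothetical helper: requires `pip install transformers` and
    # downloads weights from the Hugging Face Hub on first call.
    from transformers import AutoTokenizer, GPTNeoXForCausalLM

    model = GPTNeoXForCausalLM.from_pretrained(
        "EleutherAI/pythia-70m", revision=revision
    )
    tokenizer = AutoTokenizer.from_pretrained(
        "EleutherAI/pythia-70m", revision=revision
    )
    return model, tokenizer


print(len(pythia_revisions()))  # 154 checkpoints per model size
```

Running the same prompt through several of these revisions is the typical workflow for tracing when a capability or fact first appears during training.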
Related projects:

Repository | Description | Stars |
---|---|---|
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,968 |
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy them in their own applications. | 135,747 |
pytorch/examples | A collection of curated examples showcasing various PyTorch applications in computer vision, natural language processing, and reinforcement learning. | 22,490 |
pytorch/captum | Provides tools and algorithms to understand how machine learning models make predictions. | 4,957 |
karpathy/mingpt | A minimal PyTorch implementation of a transformer-based language model. | 20,314 |
timeseriesai/tsai | A comprehensive deep learning package for time series data analysis and forecasting. | 5,289 |
databrickslabs/dolly | An instruction-following large language model trained on the Databricks machine learning platform. | 10,820 |
lucidrains/reformer-pytorch | An implementation of Reformer, an efficient Transformer model for natural language processing tasks. | 2,129 |
codertimo/bert-pytorch | An implementation of Google's 2018 BERT model in PyTorch, supporting pre-training and fine-tuning for natural language processing tasks. | 6,233 |
huggingface/pytorch-openai-transformer-lm | An implementation of OpenAI's transformer language model in PyTorch, with pre-trained weights and fine-tuning capabilities. | 1,512 |
huggingface/trl | A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,208 |
google-research/electra | A method for pre-training transformer networks to learn language representations from text data without labeled supervision. | 2,339 |
sktime/pytorch-forecasting | A PyTorch-based package for state-of-the-art time series forecasting with deep learning architectures. | 4,027 |
kimiyoung/transformer-xl | Implementations of Transformer-XL, an attentive language-model architecture that uses segment-level recurrence to capture longer-range context. | 3,610 |
adapter-hub/adapters | A unified library for parameter-efficient and modular transfer learning in NLP tasks. | 2,593 |