Megatron-LM
LLM trainer
A framework for training large language models using scalable and optimized GPU techniques
Ongoing research training transformer models at scale
11k stars
165 watching
2k forks
Language: Python
Last commit: 3 months ago
Linked from 4 awesome lists
large-language-models · model-para · transformers
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,997 |
| Research tool for training large transformer language models at scale | 1,926 |
| A native PyTorch library for training large language models using distributed parallelism and optimization techniques. | 2,765 |
| Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax | 10,620 |
| A collection of tools and scripts for training large transformer language models at scale | 1,342 |
| A codebase for working with Open Pre-trained Transformers, enabling deployment and fine-tuning of transformer models on various platforms. | 6,519 |
| A research initiative focused on developing fundamental technology to improve the performance and efficiency of large language models. | 3,747 |
| A system that uses large language and vision models to generate and process visual instructions | 20,683 |
| A high-performance transformer-based NLP component optimized for GPU acceleration and integration into various frameworks. | 5,937 |
| Implementations of a neural network architecture for language modeling | 3,619 |
| A high-performance inference engine for transformer models | 3,467 |
| Develops large multimodal models for various computer vision tasks including image and video analysis | 3,099 |
| A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,308 |
| A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 136,357 |
| An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |