Megatron-LM
LLM trainer
A research framework for training large language models at scale using GPU-optimized techniques.
Ongoing research training transformer models at scale
11k stars
161 watching
2k forks
Language: Python
last commit: 6 days ago
Linked from 4 awesome lists
large-language-models, model-para, transformers
Related projects:
Repository | Description | Stars |
---|---|---|
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,941 |
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,895 |
pytorch/torchtitan | A native PyTorch library for large-scale language model training with distributed training capabilities | 2,615 |
google-research/vision_transformer | Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax | 10,450 |
bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,335 |
facebookresearch/metaseq | A codebase for working with Open Pre-trained Transformers, enabling deployment and fine-tuning of transformer models on various platforms. | 6,515 |
microsoft/lmops | A research initiative focused on developing fundamental technology to improve the performance and efficiency of large language models. | 3,695 |
haotian-liu/llava | A system that uses large language and vision models to generate and process visual instructions | 20,232 |
nvidia/fastertransformer | A high-performance transformer-based NLP component optimized for GPU acceleration and integration into various frameworks. | 5,886 |
kimiyoung/transformer-xl | Implementations of Transformer-XL, a neural network architecture for language modeling | 3,611 |
opennmt/ctranslate2 | A high-performance library for efficient inference with Transformer models on CPUs and GPUs. | 3,404 |
llava-vl/llava-next | Develops large multimodal models for various computer vision tasks including image and video analysis | 2,872 |
huggingface/trl | A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,053 |
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 135,022 |
alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models | 2,720 |