Megatron-LM
LLM trainer
A framework for training large language models using scalable and optimized GPU techniques
Ongoing research training transformer models at scale
11k stars
165 watching
2k forks
Language: Python
Last commit: 3 months ago
Linked from 4 awesome lists
large-language-models · model-para · transformers
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,997 |
| Research tool for training large transformer language models at scale | 1,926 |
| A native PyTorch library for training large language models using distributed parallelism and optimization techniques. | 2,765 |
| Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax | 10,620 |
| A collection of tools and scripts for training large transformer language models at scale | 1,342 |
| A codebase for working with Open Pre-trained Transformers, enabling deployment and fine-tuning of transformer models on various platforms. | 6,519 |
| A research initiative focused on developing fundamental technology to improve the performance and efficiency of large language models. | 3,747 |
| A system that uses large language and vision models to generate and process visual instructions | 20,683 |
| A high-performance transformer-based NLP component optimized for GPU acceleration and integration into various frameworks. | 5,937 |
| Implementations of a neural network architecture for language modeling | 3,619 |
| A high-performance inference engine for transformer models | 3,467 |
| Develops large multimodal models for various computer vision tasks including image and video analysis | 3,099 |
| A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,308 |
| A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 136,357 |
| An open-source toolkit for pretraining and fine-tuning large language models | 2,732 |