Megatron-LM

LLM trainer

A research framework for training large language models at scale using GPU-optimized techniques such as tensor, pipeline, and sequence parallelism and mixed-precision training.

Ongoing research training transformer models at scale
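
The core technique introduced by the original Megatron paper is tensor (intra-layer) model parallelism: the two GEMMs in each transformer MLP block are split column-wise and then row-wise across GPUs, so a single all-reduce per layer recovers the full activation. Below is a minimal single-process sketch of that partitioning in PyTorch; the shapes, names, and the explicit loop over "ranks" are illustrative assumptions, not Megatron-LM's actual API.

```python
# Single-process sketch of Megatron-style tensor model parallelism for a
# transformer MLP block. The first linear layer's weight is split column-wise
# across "ranks", the second row-wise, so only one all-reduce (here: a plain
# sum) is needed per forward pass. Names and shapes are illustrative only.
import torch

torch.manual_seed(0)
hidden, ffn, world_size = 8, 32, 4   # ffn must be divisible by world_size

x  = torch.randn(2, hidden)          # (batch, hidden) activations, replicated on every rank
W1 = torch.randn(hidden, ffn)        # first GEMM:  hidden -> ffn
W2 = torch.randn(ffn, hidden)        # second GEMM: ffn -> hidden

# Reference: the unpartitioned MLP forward pass.
reference = torch.relu(x @ W1) @ W2

# Partitioned version: rank i holds a column slice of W1 and the matching row
# slice of W2, computes a partial output, and the partials are summed
# (the all-reduce in a real multi-GPU run).
shard = ffn // world_size
partials = []
for rank in range(world_size):
    W1_shard = W1[:, rank * shard:(rank + 1) * shard]   # column-parallel slice
    W2_shard = W2[rank * shard:(rank + 1) * shard, :]   # row-parallel slice
    partials.append(torch.relu(x @ W1_shard) @ W2_shard)
output = torch.stack(partials).sum(dim=0)               # stands in for all-reduce

print(torch.allclose(reference, output, atol=1e-5))     # True: the split is exact
```

Running the sketch prints True, showing that the column/row split plus one summation (the stand-in for the all-reduce) reproduces the unpartitioned MLP exactly.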

GitHub

11k stars
161 watching
2k forks
Language: Python
Last commit: 6 days ago
Linked from 4 awesome lists

Topics: large-language-models, model-para, transformers

Related projects:

Repository | Description | Stars
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,941
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale. | 1,895
pytorch/torchtitan | A native PyTorch library for large-scale language model training with distributed training capabilities. | 2,615
google-research/vision_transformer | Provides pre-trained models and code for training vision transformers and mixers using JAX/Flax. | 10,450
bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale. | 1,335
facebookresearch/metaseq | A codebase for working with Open Pre-trained Transformers, enabling deployment and fine-tuning of transformer models on various platforms. | 6,515
microsoft/lmops | A research initiative focused on developing fundamental technology to improve the performance and efficiency of large language models. | 3,695
haotian-liu/llava | A system that uses large language and vision models to generate and process visual instructions. | 20,232
nvidia/fastertransformer | A high-performance transformer-based NLP component optimized for GPU acceleration and integration into various frameworks. | 5,886
kimiyoung/transformer-xl | Implementations of a neural network architecture for language modeling. | 3,611
opennmt/ctranslate2 | A high-performance library for efficient inference with Transformer models on CPUs and GPUs. | 3,404
llava-vl/llava-next | Develops large multimodal models for various computer vision tasks including image and video analysis. | 2,872
huggingface/trl | A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,053
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 135,022
alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models. | 2,720