Megatron-DeepSpeed

Transformer trainer

A collection of tools and scripts for ongoing research on training large transformer language models at scale, including BERT and GPT-2.

Stars: 1k
Watching: 24
Forks: 220
Language: Python
Last commit: 10 months ago

Related projects:

Repository | Description | Stars
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,926
german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding | 23
openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer-based architecture | 2,167
openbmb/bmtrain | A toolkit for training large models in a distributed manner while keeping the code simple and efficient | 570
fastnlp/cpt | A pre-trained transformer model for natural language understanding and generation tasks in Chinese | 482
matlab-deep-learning/transformer-models | An implementation of deep learning transformer models in MATLAB | 209
jsksxs360/how-to-use-transformers | A comprehensive guide to using the Transformers library for natural language processing tasks | 1,220
huggingface/nanotron | A pretraining framework for large language models using 3D parallelism and scalable training techniques | 1,332
pytorchbearer/torchbearer | A PyTorch model fitting library designed to simplify the process of training deep learning models | 636
ist-daslab/gptq | A post-training quantization algorithm for transformer models that reduces memory usage and improves inference speed | 1,964
tongjilibo/bert4torch | An implementation of transformer models in PyTorch for natural language processing tasks | 1,257
marella/ctransformers | A unified interface to various transformer models implemented in C/C++ using the GGML library | 1,823
chrislemke/sk-transformers | A collection of reusable data transformation tools | 10
maxpumperla/elephas | Enables distributed deep learning with Keras and Spark for scalable model training | 1,574
ibrahimsobh/transformers | An implementation of deep neural network architectures, including Transformers, in Python | 214