Megatron-DeepSpeed
Transformer trainer
Research tool for training large transformer language models at scale
Ongoing research on training transformer language models at scale, including BERT and GPT-2
2k stars
24 watching
344 forks
Language: Python
last commit: about 1 month ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,335 |
german-nlp-group/german-transformer-training | Trains German transformer models to improve language understanding | 23 |
matlab-deep-learning/transformer-models | An implementation of deep learning transformer models in MATLAB | 206 |
openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer-based architecture | 2,160 |
fastnlp/cpt | A pre-trained transformer model for natural language understanding and generation tasks in Chinese | 481 |
jsksxs360/how-to-use-transformers | A comprehensive guide to using the Transformers library for natural language processing tasks | 1,133 |
rdspring1/pytorch_gbw_lm | Trains a large-scale PyTorch language model on the 1-Billion Word dataset | 123 |
huggingface/nanotron | A library for training large language models with parallel computing and mixed precision training methods | 1,244 |
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 506 |
maxpumperla/elephas | Enables distributed deep learning with Keras and Spark for scalable model training | 1,574 |
nlpodyssey/cybertron | A Go package providing an easy interface to use pre-trained NLP models from the HuggingFace repository for tasks like text classification and machine translation | 286 |
marella/ctransformers | Provides a unified interface to various transformer models implemented in C/C++ using the GGML library | 1,814 |
tongjilibo/bert4torch | An implementation of transformer models in PyTorch for natural language processing tasks | 1,241 |
gram-ai/radio-transformer-networks | An implementation of a machine learning-based communications system using deep learning techniques | 127 |
pixart-alpha/pixart-sigma | Develops a PyTorch model for 4K text-to-image generation using a diffusion transformer | 1,675 |