nanotron

Parallelism framework

A pretraining framework for large language models that uses 3D parallelism (data, tensor, and pipeline parallelism) to scale training

Minimalistic large language model 3D-parallelism training

GitHub

1k stars

43 watching

132 forks

Language: Python

Last commit: 8 months ago

Linked from 1 awesome list

Backlinks from these awesome lists:

ethicalml/awesome-production-machine-learning

Related projects:

Repository	Description	Stars
huggingface/hmtl	A neural network model for learning semantic representations from multiple natural language processing tasks	1,191
microsoft/mpnet	Develops a method for pre-training language understanding models that combines masked and permuted language modeling, and provides code for pre-training and fine-tuning.	288
open-mmlab/mmengine	Provides a flexible and configurable framework for training deep learning models with PyTorch.	1,196
huggingface/pytorch-openai-transformer-lm	Implements OpenAI's transformer language model in PyTorch with pre-trained weights and fine-tuning capabilities	1,511
maxpumperla/elephas	Enables distributed deep learning with Keras and Spark for scalable model training	1,574
huggingface/setfit	A framework for efficient few-shot learning with Sentence Transformers	2,267
wyy-123-xyy/ra-fed	A Python implementation of a distributed machine learning framework for training neural networks on multiple GPUs	6
bigscience-workshop/megatron-deepspeed	A collection of tools and scripts for training large transformer language models at scale	1,342
microsoft/megatron-deepspeed	Research tool for training large transformer language models at scale	1,926
chendelong1999/polite-flamingo	Develops training methods to make the responses of multimodal large language models more polite and natural	63
openbmb/bmtrain	A toolkit for training large models in a distributed manner while keeping code simple and efficient.	570
openbmb/cpm-live	A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment.	511
royson/fedl2p	Enables personalized models in federated learning by collaboratively learning the best personalization strategy for each client	19
pytorchbearer/torchbearer	A PyTorch model fitting library designed to simplify the process of training deep learning models.	636
csuhan/onellm	A framework for training and fine-tuning multimodal language models on various data types	601