nanotron

Parallel model trainer

A library for training large language models with parallel computing and mixed precision training methods

Minimalistic large language model 3D-parallelism training

GitHub

1k stars
42 watching
122 forks
Language: Python
Last commit: 17 days ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| huggingface/hmtl | A neural network model for learning semantic representations from multiple natural language processing tasks | 1,191 |
| microsoft/mpnet | Develops a method for pre-training language understanding models by combining masked and permuted techniques, and provides code for implementation and fine-tuning. | 288 |
| open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,179 |
| huggingface/pytorch-openai-transformer-lm | Implementing OpenAI's transformer language model in PyTorch with pre-trained weights and fine-tuning capabilities | 1,511 |
| maxpumperla/elephas | Enables distributed deep learning with Keras and Spark for scalable model training | 1,574 |
| huggingface/setfit | A framework for efficient few-shot learning with Sentence Transformers | 2,236 |
| wyy-123-xyy/ra-fed | A Python implementation of a distributed machine learning framework for training neural networks on multiple GPUs | 6 |
| bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale | 1,335 |
| microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,895 |
| chendelong1999/polite-flamingo | Develops training methods to improve the politeness and natural flow of multi-modal Large Language Models | 63 |
| openbmb/bmtrain | A toolkit for training large models in a distributed manner while keeping code simple and efficient. | 563 |
| openbmb/cpm-live | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment. | 511 |
| royson/fedl2p | This project enables personalized learning models by collaborating on learning the best strategy for each client | 19 |
| pytorchbearer/torchbearer | A PyTorch model fitting library designed to simplify the process of training deep learning models. | 636 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |