nanotron
Parallelism framework
A pretraining framework for large language models using 3D parallelism and scalable training techniques
Minimalistic large language model 3D-parallelism training
1k stars
43 watching
132 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A neural network model for learning semantic representations from multiple natural language processing tasks | 1,191 |
| | Develops a method for pre-training language understanding models by combining masked and permuted techniques, and provides code for implementation and fine-tuning. | 288 |
| | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,196 |
| | Implementing OpenAI's transformer language model in PyTorch with pre-trained weights and fine-tuning capabilities | 1,511 |
| | Enables distributed deep learning with Keras and Spark for scalable model training | 1,574 |
| | A framework for efficient few-shot learning with Sentence Transformers | 2,267 |
| | A Python implementation of a distributed machine learning framework for training neural networks on multiple GPUs | 6 |
| | A collection of tools and scripts for training large transformer language models at scale | 1,342 |
| | Research tool for training large transformer language models at scale | 1,926 |
| | Develops training methods to improve the politeness and natural flow of multi-modal Large Language Models | 63 |
| | A toolkit for training large models in a distributed manner while keeping code simple and efficient. | 570 |
| | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment. | 511 |
| | This project enables personalized learning models by collaborating on learning the best strategy for each client | 19 |
| | A PyTorch model fitting library designed to simplify the process of training deep learning models. | 636 |
| | A framework for training and fine-tuning multimodal language models on various data types | 601 |