BMTrain
Large model trainer
A toolkit for training large models in a distributed manner while keeping code simple and efficient.
Efficient Training (including pre-training and fine-tuning) for Big Models
570 stars
11 watching
78 forks
Language: Python
Last commit: 8 months ago
Linked from 1 awesome list
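
For a sense of what the toolkit looks like in use, here is a minimal training-loop sketch based on the quick-start patterns in the BMTrain README. The toy model, tensor sizes, and the specific names used (`bmt.DistributedModule`, `bmt.DistributedParameter`, `bmt.init_parameters`, `bmt.optim.AdamOffloadOptimizer`) are assumptions about the API and may differ between BMTrain versions.

```python
# Minimal sketch of a BMTrain setup, assuming the quick-start API from the
# project README; names and signatures may vary between BMTrain versions.
import torch
import bmtrain as bmt

bmt.init_distributed(seed=0)  # one process per GPU; sets up communication

class TinyModel(bmt.DistributedModule):  # hypothetical toy model for illustration
    def __init__(self):
        super().__init__()
        # DistributedParameter shards the tensor across ranks (ZeRO-style)
        self.weight = bmt.DistributedParameter(
            torch.empty(1024, 1024), init_method=torch.nn.init.xavier_normal_
        )

    def forward(self, x):
        return x @ self.weight

model = TinyModel()
bmt.init_parameters(model)  # initialize the sharded parameters on each rank

# Offloaded Adam keeps optimizer state in CPU memory to save GPU memory
optimizer = bmt.optim.AdamOffloadOptimizer(model.parameters(), lr=1e-3)

for step in range(10):
    x = torch.randn(8, 1024, device="cuda")
    loss = model(x).pow(2).mean()  # placeholder loss, just to drive the loop
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    bmt.print_rank(f"step {step}: loss {loss.item():.4f}")  # prints on rank 0 only
```

A script written this way would be launched with a standard distributed launcher, e.g. `torchrun --nproc_per_node=<num_gpus> train.py`.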
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment | 511 |
| | A curated list of large machine learning models tracked over time | 341 |
| | A flexible and configurable framework for training deep learning models with PyTorch | 1,196 |
| | A collection of tools and scripts for training large transformer language models at scale | 1,342 |
| | A tool for training neural networks with large batch sizes and analyzing the trade-offs between longer training and better generalization | 148 |
| | A pretraining framework for large language models using 3D parallelism and scalable training techniques | 1,332 |
| | A platform for training and deploying large language and vision models that can use tools to perform tasks | 717 |
| | A family of large multimodal models supporting multimodal conversation and text-to-image generation in multiple languages | 1,098 |
| | A PyTorch model-fitting library designed to simplify the process of training deep learning models | 636 |
| | A PyTorch-based training framework that streamlines workflows by providing a unified interface for loss functions, optimizers, and validation metrics | 1,822 |
| | A tool for training and fine-tuning large language models using advanced techniques | 387 |
| | A method for pre-training language-understanding models that combines masked and permuted language modeling, with code for implementation and fine-tuning | 288 |
| | Distributed deep learning with Keras and Spark for scalable model training | 1,574 |
| | Code and a pretrained model for improving language understanding through generative pre-training with a transformer-based architecture | 2,167 |
| | A research tool for training large transformer language models at scale | 1,926 |