BMTrain
Large model trainer
A toolkit for training big models (including pre-training and fine-tuning) in a distributed manner while keeping the code simple and efficient.
563 stars
11 watching
77 forks
Language: Python
Last commit: 4 months ago
Linked from 1 awesome list
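
For a sense of how BMTrain is used in practice, here is a minimal training-step sketch assuming the API shown in the project's README (`bmt.init_distributed`, `bmt.DistributedModule`, `bmt.DistributedParameter`, `bmt.optim.AdamOffloadOptimizer`, `bmt.optim.OptimManager`). The model, data, and hyperparameters are hypothetical placeholders, not a canonical example:

```python
# Sketch: minimal BMTrain training step, assuming the README's API.
# Launch with e.g.: torchrun --nproc_per_node=<num_gpus> train.py
import torch
import bmtrain as bmt

bmt.init_distributed(seed=0)  # set up communication; one process per GPU

class TinyModel(bmt.DistributedModule):
    """Hypothetical toy model; parameters are sharded across ranks."""
    def __init__(self):
        super().__init__()
        # DistributedParameter shards the tensor across ranks; init_method
        # fills it when bmt.init_parameters() is called on the model.
        self.w = bmt.DistributedParameter(
            torch.empty(1024, 1024, dtype=torch.half),
            init_method=torch.nn.init.xavier_normal_,
        )

    def forward(self, x):
        return x @ self.w

model = TinyModel()
bmt.init_parameters(model)  # initialize the sharded parameters

# Adam with optimizer state offloaded to CPU memory to save GPU RAM.
optimizer = bmt.optim.AdamOffloadOptimizer(model.parameters(), lr=1e-3)
# OptimManager handles fp16 loss scaling; it can also drive an LR scheduler.
optim_manager = bmt.optim.OptimManager(loss_scale=1024)
optim_manager.add_optimizer(optimizer)

for step in range(10):
    x = torch.randn(8, 1024, dtype=torch.half, device="cuda")  # fake batch
    loss = model(x).float().pow(2).mean()                      # fake loss
    optim_manager.zero_grad()
    optim_manager.backward(loss)  # scaled backward pass
    optim_manager.step()          # unscale gradients and update parameters
    # Average the loss over ranks before printing on rank 0.
    bmt.print_rank(f"step {step}: loss {bmt.sum_loss(loss).item():.4f}")
```

Because parameters and optimizer state are sharded ZeRO-style across ranks, the loop reads much like ordinary single-GPU PyTorch, which is the simplicity the toolkit advertises.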
Related projects:
| Repository | Description | Stars |
|---|---|---|
| openbmb/cpm-live | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment. | 511 |
| openbmb/bmlist | A curated list of large machine learning models tracked over time. | 341 |
| open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,179 |
| bigscience-workshop/megatron-deepspeed | A collection of tools and scripts for training large transformer language models at scale. | 1,335 |
| eladhoffer/bigbatch | A tool for training neural networks with large batch sizes and analyzing the trade-off between longer training and better generalization. | 148 |
| huggingface/nanotron | A library for training large language models with parallelism and mixed-precision training. | 1,244 |
| llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks. | 704 |
| openbmb/viscpm | A family of large multimodal models supporting multimodal conversation and text-to-image generation in multiple languages. | 1,089 |
| pytorchbearer/torchbearer | A PyTorch model fitting library designed to simplify the process of training deep learning models. | 636 |
| lyhue1991/torchkeras | A PyTorch-based training framework that streamlines workflows with a unified interface for loss functions, optimizers, and validation metrics. | 1,782 |
| bobazooba/xllm | A tool for training and fine-tuning large language models using advanced techniques. | 380 |
| microsoft/mpnet | Develops a pre-training method for language understanding that combines masked and permuted language modeling, with code for implementation and fine-tuning. | 288 |
| maxpumperla/elephas | Enables distributed deep learning with Keras and Spark for scalable model training. | 1,574 |
| openai/finetune-transformer-lm | Provides code and a model for improving language understanding through generative pre-training with a transformer architecture. | 2,160 |
| microsoft/megatron-deepspeed | A research tool for training large transformer language models at scale. | 1,895 |