BMTrain

Large model trainer

A toolkit for training large models in a distributed manner while keeping code simple and efficient.

Efficient Training (including pre-training and fine-tuning) for Big Models

GitHub

570 stars
11 watching
78 forks
Language: Python
last commit: over 1 year ago
Linked from 1 awesome list


Backlinks from these awesome lists:

Related projects:

Repository Description Stars
openbmb/cpm-live A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment. 511
openbmb/bmlist A curated list of large machine learning models tracked over time 341
open-mmlab/mmengine Provides a flexible and configurable framework for training deep learning models with PyTorch. 1,196
bigscience-workshop/megatron-deepspeed A collection of tools and scripts for training large transformer language models at scale 1,342
eladhoffer/bigbatch A tool for training neural networks using large batch sizes and analyzing the trade-offs between longer training periods and better generalization performance. 148
huggingface/nanotron A pretraining framework for large language models using 3D parallelism and scalable training techniques 1,332
llava-vl/llava-plus-codebase A platform for training and deploying large language and vision models that can use tools to perform tasks 717
openbmb/viscpm A family of large multimodal models supporting multimodal conversational capabilities and text-to-image generation in multiple languages 1,098
pytorchbearer/torchbearer A PyTorch model fitting library designed to simplify the process of training deep learning models. 636
lyhue1991/torchkeras A PyTorch-based model training framework designed to simplify and streamline training workflows by providing a unified interface for various loss functions, optimizers, and validation metrics. 1,822
bobazooba/xllm A tool for training and fine-tuning large language models using advanced techniques 387
microsoft/mpnet Develops a method for pre-training language understanding models by combining masked and permuted techniques, and provides code for implementation and fine-tuning. 288
maxpumperla/elephas Enables distributed deep learning with Keras and Spark for scalable model training 1,574
openai/finetune-transformer-lm This project provides code and model for improving language understanding through generative pre-training using a transformer-based architecture. 2,167
microsoft/megatron-deepspeed Research tool for training large transformer language models at scale 1,926