BMTrain
Large model trainer
A toolkit for training large models in a distributed manner while keeping code simple and efficient.
Efficient Training (including pre-training and fine-tuning) for Big Models
570 stars
11 watching
78 forks
Language: Python
Last commit: 8 months ago
Linked from 1 awesome list
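
For a sense of what the toolkit looks like in use, here is a minimal training-loop sketch based on the quick-start patterns in the BMTrain README. The toy model, tensor sizes, and the specific names used (`bmt.DistributedModule`, `bmt.DistributedParameter`, `bmt.init_parameters`, `bmt.optim.AdamOffloadOptimizer`) are assumptions about the API and may differ between BMTrain versions.

```python
# Minimal sketch of a BMTrain setup, assuming the quick-start API from the
# project README; names and signatures may vary between BMTrain versions.
import torch
import bmtrain as bmt

bmt.init_distributed(seed=0)  # one process per GPU; sets up communication

class TinyModel(bmt.DistributedModule):  # hypothetical toy model for illustration
    def __init__(self):
        super().__init__()
        # DistributedParameter shards the tensor across ranks (ZeRO-style)
        self.weight = bmt.DistributedParameter(
            torch.empty(1024, 1024), init_method=torch.nn.init.xavier_normal_
        )

    def forward(self, x):
        return x @ self.weight

model = TinyModel()
bmt.init_parameters(model)  # initialize the sharded parameters on each rank

# Offloaded Adam keeps optimizer state in CPU memory to save GPU memory
optimizer = bmt.optim.AdamOffloadOptimizer(model.parameters(), lr=1e-3)

for step in range(10):
    x = torch.randn(8, 1024, device="cuda")
    loss = model(x).pow(2).mean()  # placeholder loss, just to drive the loop
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    bmt.print_rank(f"step {step}: loss {loss.item():.4f}")  # prints on rank 0 only
```

A script written this way would be launched with a standard distributed launcher, e.g. `torchrun --nproc_per_node=<num_gpus> train.py`.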
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment | 511 |
| | A curated list of large machine learning models tracked over time | 341 |
| | A flexible and configurable framework for training deep learning models with PyTorch | 1,196 |
| | A collection of tools and scripts for training large transformer language models at scale | 1,342 |
| | A tool for training neural networks with large batch sizes and analyzing the trade-offs between longer training and better generalization | 148 |
| | A pretraining framework for large language models using 3D parallelism and scalable training techniques | 1,332 |
| | A platform for training and deploying large language and vision models that can use tools to perform tasks | 717 |
| | A family of large multimodal models supporting multimodal conversation and text-to-image generation in multiple languages | 1,098 |
| | A PyTorch model-fitting library designed to simplify the process of training deep learning models | 636 |
| | A PyTorch-based training framework that streamlines workflows by providing a unified interface for loss functions, optimizers, and validation metrics | 1,822 |
| | A tool for training and fine-tuning large language models using advanced techniques | 387 |
| | A method for pre-training language-understanding models that combines masked and permuted language modeling, with code for implementation and fine-tuning | 288 |
| | Distributed deep learning with Keras and Spark for scalable model training | 1,574 |
| | Code and a pretrained model for improving language understanding through generative pre-training with a transformer-based architecture | 2,167 |
| | A research tool for training large transformer language models at scale | 1,926 |