higgsfield

GPU orchestration framework

A framework for efficient and fault-tolerant distributed training of large neural networks on multiple GPUs.

A fault-tolerant, highly scalable GPU orchestration and machine learning framework designed for training models with billions to trillions of parameters.

GitHub

3k stars
76 watching
554 forks
Language: Jupyter Notebook
Last commit: 6 months ago
Topics: cluster-management, deep-learning, distributed, llama, llama2, llm, machine-learning, mlops, pytorch

Related projects:

Repository | Description | Stars
higgsfield/rl-adventure | A tutorial on implementing and extending the Deep Q Network algorithm for reinforcement learning tasks | 3,034
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations | 6,941
ludwig-ai/ludwig | A low-code framework for building custom deep learning models and neural networks | 11,189
google-research/t5x | A modular framework for training and deploying sequence models at scale | 2,682
hiyouga/llama-factory | A unified platform for fine-tuning multiple large language models with various training approaches and methods | 34,436
openvinotoolkit/openvino | A toolkit for optimizing and deploying artificial intelligence models in various applications | 7,279
ahkarami/deep-learning-in-production | A collection of notes and references on deploying deep learning models in production environments | 4,306
fminference/flexllmgen | Generates large language model outputs in high-throughput mode on single GPUs | 9,192
pytorch/torchtitan | A native PyTorch library for large-scale language model training with distributed training capabilities | 2,615
huggingface/peft | An efficient method for fine-tuning large pre-trained models by adapting only a small fraction of their parameters | 16,437
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 7,964
mlfoundations/open_flamingo | A framework for training large multimodal models to generate text conditioned on images or other text | 3,742
openvinotoolkit/open_model_zoo | A collection of pre-trained deep learning models and demo applications for accelerating inference tasks | 4,098
bytedance/byteps | A high-performance distributed deep learning framework supporting multiple frameworks and networks | 3,630
microsoft/lightgbm | A high-performance gradient boosting framework for machine learning tasks | 16,694