ColossalAI

AI parallelism toolkit

A toolkit for training and deploying large AI models in parallel on distributed computing infrastructure

Making large AI models cheaper, faster and more accessible

GitHub

39k stars
385 watching
4k forks
Language: Python
Last commit: about 1 month ago
Linked from 8 awesome lists

Topics: ai, big-model, data-parallelism, deep-learning, distributed-computing, foundation-models, heterogeneous-training, hpc, inference, large-scale, model-parallelism, pipeline-parallelism

Related projects:

Repository | Description | Stars
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 8,011
microsoft/deepspeed | A deep learning optimization library that simplifies distributed training and inference on modern computing hardware | 35,863
flagai-open/flagai | An open-source toolkit for training and deploying large-scale AI models on various downstream tasks with multi-modality | 3,840
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations | 6,997
mediar-ai/screenpipe | A platform for building and deploying AI agents with full context using screen recordings, allowing for 24/7 monitoring and control | 11,060
significant-gravitas/autogpt | A platform for building and deploying autonomous AI agents to automate complex workflows | 169,186
postgresml/postgresml | An open-source Postgres extension for machine learning and AI operations directly within the database | 6,070
huawei-noah/efficient-ai-backbones | A collection of efficient AI backbone architectures developed by Huawei Noah's Ark Lab | 4,098
plasma-umass/scalene | A high-performance Python profiler that analyzes CPU, GPU, and memory usage, providing detailed information and AI-powered optimization suggestions | 12,274
exo-explore/exo | An experimental software framework to run AI models on diverse devices without requiring expensive GPUs | 17,369
portkey-ai/gateway | A fast and secure routing service for integrating with large language models | 6,557
google-research/big_vision | Supports large-scale vision model training on GPU machines or Google Cloud TPUs using scalable input pipelines | 2,439
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text | 36,118
tencent/hunyuandit | A PyTorch model definition and inference/sampling code repository for a powerful diffusion transformer with fine-grained Chinese understanding | 3,678
skypilot-org/skypilot | A framework for running AI and batch workloads on any infrastructure, offering unified execution, cost savings, and high GPU availability | 6,905