DeepSpeed
Deep learning optimizer
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
35k stars
346 watching
4k forks
Language: Python
last commit: 7 days ago
Linked from 6 awesome lists
Topics: billion-parameters, compression, data-parallelism, deep-learning, gpu, inference, machine-learning, mixture-of-experts, model-parallelism, pipeline-parallelism, pytorch, trillion-parameters, zero
Related projects:
Repository | Description | Stars |
---|---|---|
sjtu-ipads/powerinfer | An efficient large language model inference engine that leverages consumer-grade GPUs on PCs | 7,964 |
microsoft/deepspeed-mii | A Python library for accelerating model inference with high throughput and low latency | 1,898 |
deepseek-ai/deepseek-v2 | A mixture-of-experts language model with strong performance and efficient inference. | 3,590 |
neuralmagic/deepsparse | A sparsity-aware deep learning inference runtime for CPUs that optimizes neural network performance on CPU hardware. | 3,026 |
eleutherai/gpt-neox | A framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,941 |
dennybritz/deeplearning-papernotes | A collection of notes and summaries on various deep learning research papers, including their topics, techniques, and applications. | 4,410 |
jolibrain/deepdetect | A machine learning API and server written in C++ that supports multiple deep learning libraries and provides a flexible interface for building and deploying machine learning models. | 2,519 |
paddlepaddle/paddle | A high-performance deep learning framework designed for industrial-scale training and deployment of neural networks. | 22,258 |
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text | 35,453 |
paddlepaddle/fastdeploy | A toolkit for easy and high-performance deployment of deep learning models on various hardware platforms | 2,998 |
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,895 |
hpcaitech/colossalai | Tools and strategies for efficient distributed training and inference that make large AI models cheaper, faster, and more accessible. | 38,797 |
internlm/lmdeploy | A toolkit for optimizing and serving large language models | 4,653 |
deep-floyd/if | A text-to-image synthesis model with a modular design, utilizing a frozen text encoder and cascaded pixel diffusion modules to generate photorealistic images. | 7,688 |
alirezadir/production-level-deep-learning | A guide to building production-ready deep learning systems for real-world applications | 4,351 |