DeepSpeed

Deep learning optimizer

A deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

GitHub

35k stars
346 watching
4k forks
Language: Python
last commit: 7 days ago
Linked from 6 awesome lists

billion-parameterscompressiondata-parallelismdeep-learninggpuinferencemachine-learningmixture-of-expertsmodel-parallelismpipeline-parallelismpytorchtrillion-parameterszero

Backlinks from these awesome lists:

Related projects:

Repository Description Stars
sjtu-ipads/powerinfer An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs 7,964
microsoft/deepspeed-mii A Python library designed to accelerate model inference with high-throughput and low latency capabilities 1,898
deepseek-ai/deepseek-v2 A high-performance mixture-of-experts language model with strong performance and efficient inference capabilities. 3,590
neuralmagic/deepsparse A sparsity-aware deep learning inference runtime for CPUs that optimizes neural network performance on CPU hardware. 3,026
eleutherai/gpt-neox Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. 6,941
dennybritz/deeplearning-papernotes A collection of notes and summaries on various deep learning research papers, including their topics, techniques, and applications. 4,410
jolibrain/deepdetect A machine learning API and server written in C++ that supports multiple deep learning libraries and provides a flexible interface for building and deploying machine learning models. 2,519
paddlepaddle/paddle A high-performance deep learning framework designed for industrial-scale training and deployment of neural networks. 22,258
coqui-ai/tts A deep learning toolkit for generating human-like speech from text 35,453
paddlepaddle/fastdeploy A toolkit for easy and high-performance deployment of deep learning models on various hardware platforms 2,998
microsoft/megatron-deepspeed Research tool for training large transformer language models at scale 1,895
hpcaitech/colossalai Making large AI models cheaper, faster, and more accessible by providing tools and strategies for efficient distributed training and inference. 38,797
internlm/lmdeploy A toolkit for optimizing and serving large language models 4,653
deep-floyd/if A text-to-image synthesis model with a modular design, utilizing a frozen text encoder and cascaded pixel diffusion modules to generate photorealistic images. 7,688
alirezadir/production-level-deep-learning A guide to building production-ready deep learning systems for real-world applications 4,351