DeepSpeed-MII
Inference accelerator
A Python library, powered by DeepSpeed, that makes low-latency, high-throughput model inference possible.
Stars: 2k
Watchers: 41
Forks: 175
Language: Python
Last commit: 13 days ago
Linked from 2 awesome lists
Tags: deep-learning, inference, pytorch
Related projects:
| Repository | Description | Stars |
|---|---|---|
| intel/scikit-learn-intelex | An acceleration toolkit for scikit-learn machine learning algorithms on Intel hardware. | 1,227 |
| utensor/utensor | A lightweight machine learning inference framework built on TensorFlow, optimized for Arm targets. | 1,729 |
| xboot/libonnx | A lightweight ONNX inference engine for embedded devices with hardware acceleration support. | 583 |
| megvii-research/tlc | Improves image restoration performance by converting global operations to local ones during inference. | 231 |
| xilinx/finn | A fast and scalable neural network inference framework for FPGAs. | 747 |
| microsoft/archai | Automates the search for optimal neural network configurations in deep learning applications. | 467 |
| microsoft/megatron-deepspeed | A research tool for training large transformer language models at scale. | 1,895 |
| mims-harvard/ohmnet | An algorithm for learning feature representations in multi-layer networks. | 79 |
| torchpipe/torchpipe | A C++ library providing a multi-instance, pipeline-parallel framework for accelerating deep learning inference on various hardware accelerators. | 144 |
| jgreenemi/parris | Automates the setup and training of machine learning algorithms on remote servers. | 316 |
| lge-arc-advancedai/auptimizer | Automates model building and deployment by optimizing hyperparameters and compressing models for edge computing. | 200 |
| denosaurs/netsaur | A machine learning library with GPU, CPU, and WASM backends for building neural networks. | 232 |
| arthurpaulino/miraiml | An asynchronous engine for continuous and autonomous machine learning. | 26 |
| intel/neural-compressor | Tools and techniques for optimizing large language models across frameworks and hardware platforms. | 2,226 |
| mlcommons/inference | Measures the performance of deep learning models in various deployment scenarios. | 1,236 |