DeepSpeed-MII

Inference accelerator

A Python library for accelerating model inference with high throughput and low latency

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

GitHub

2k stars
41 watching
175 forks
Language: Python
Last commit: 13 days ago
Linked from 2 awesome lists

Tags: deep-learning, inference, pytorch

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| intel/scikit-learn-intelex | An acceleration toolkit for scikit-learn machine learning algorithms on Intel hardware | 1,227 |
| utensor/utensor | A lightweight machine learning inference framework built on TensorFlow, optimized for Arm targets | 1,729 |
| xboot/libonnx | A lightweight ONNX inference engine for embedded devices with hardware acceleration support | 583 |
| megvii-research/tlc | Improves image restoration performance by converting global operations to local ones during inference | 231 |
| xilinx/finn | A fast and scalable neural network inference framework for FPGAs | 747 |
| microsoft/archai | Automates the search for optimal neural network configurations in deep learning applications | 467 |
| microsoft/megatron-deepspeed | A research tool for training large transformer language models at scale | 1,895 |
| mims-harvard/ohmnet | An algorithm for learning feature representations in multi-layer networks | 79 |
| torchpipe/torchpipe | A C++ library providing a multi-instance pipeline-parallel framework for accelerating deep learning inference on various hardware accelerators | 144 |
| jgreenemi/parris | Automates the setup and training of machine learning algorithms on remote servers | 316 |
| lge-arc-advancedai/auptimizer | Automates model building and deployment by optimizing hyperparameters and compressing models for edge computing | 200 |
| denosaurs/netsaur | A machine learning library with GPU, CPU, and WASM backends for building neural networks | 232 |
| arthurpaulino/miraiml | An asynchronous engine for continuous and autonomous machine learning | 26 |
| intel/neural-compressor | Tools and techniques for optimizing large language models across various frameworks and hardware platforms | 2,226 |
| mlcommons/inference | Measures the performance of deep learning models in various deployment scenarios | 1,236 |