DeepSpeed-MII
Inference accelerator
A Python library designed to accelerate model inference with high throughput and low latency.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
2k stars
42 watching
175 forks
Language: Python
Last commit: 4 months ago
Linked from 2 awesome lists
Tags: deep-learning, inference, pytorch
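As a quick illustration of what the library does, below is a minimal text-generation sketch using MII's non-persistent pipeline API; the model name is a placeholder for any Hugging Face checkpoint supported by MII.

```python
import mii

# Non-persistent pipeline: loads the model in-process for quick experiments.
# The model name is a placeholder; substitute any MII-supported checkpoint.
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Batched generation with a cap on new tokens per prompt.
response = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=128)
print(response)
```

For long-running workloads, MII also offers a persistent deployment mode via `mii.serve`, which keeps the model resident and serves requests over gRPC.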
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A lightweight machine learning inference framework built on TensorFlow, optimized for Arm targets. | 1,742 |
| | An ONNX inference engine for embedded devices with hardware acceleration support. | 589 |
| | Improves image restoration performance by converting global operations to local ones during inference. | 231 |
| | A fast and scalable neural network inference framework for FPGAs. | 770 |
| | Automates the search for optimal neural network configurations in deep learning applications. | 468 |
| | A research tool for training large transformer language models at scale. | 1,926 |
| | An algorithm for learning feature representations in multi-layer networks. | 81 |
| | An open-source framework for deploying and serving PyTorch models across various acceleration frameworks. | 147 |
| | Automates the setup and training of machine learning algorithms on remote servers. | 316 |
| | Automates model building and deployment by optimizing hyperparameters and compressing models for edge computing. | 200 |
| | A machine learning library with GPU, CPU, and WASM backends for building neural networks. | 233 |
| | An asynchronous engine for continuous and autonomous machine learning. | 26 |
| | Tools and techniques for optimizing large language models across various frameworks and hardware platforms. | 2,257 |
| | Measures the performance of deep learning models in various deployment scenarios. | 1,256 |