accelerate
Model trainer
A tool to simplify training and deployment of PyTorch models on various devices and configurations
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
8k stars
97 watching
989 forks
Language: Python
last commit: 2 months ago
Linked from 3 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
| Tools for streamlined mixed precision and distributed training in PyTorch | 8,460 |
| A Python library providing tensors and dynamic neural networks with strong GPU acceleration | 84,978 |
| A native PyTorch library for training large language models using distributed parallelism and optimization techniques. | 2,765 |
| A toolkit for optimizing and accelerating the training and inference of machine learning models on various hardware platforms. | 2,618 |
| A tool for serving and scaling PyTorch models in production environments | 4,238 |
| A PyTorch extension library that provides high-performance and large-scale training techniques. | 3,210 |
| A PyTorch implementation of differentiable ODE solvers with GPU support and efficient backpropagation | 5,655 |
| A high-level library to help with training and evaluating neural networks in PyTorch | 4,554 |
| Implementation of a face parsing model using PyTorch and a modified BiSeNet architecture. | 2,346 |
| An implementation of a deep learning-based object detection system in PyTorch. | 5,160 |
| A PyTorch framework for accelerating deep learning research and development by focusing on reproducibility, rapid experimentation, and codebase reuse. | 3,300 |
| An implementation of Squeeze-and-Excitation Networks (SE-Nets) for deep learning image classification tasks | 2,291 |
| An efficient method for fine-tuning large pre-trained models by adapting only a small fraction of their parameters | 16,699 |
| An object detection implementation built on top of PyTorch, supporting multi-image batch training and multiple GPUs. | 7,721 |
| A compiler and execution engine for neural networks that generates optimized code for hardware accelerators | 3,247 |