nccl

GPU communication library

A library of optimized primitives for efficient inter-GPU communication and data transfer.

Optimized primitives for collective multi-GPU communication
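
As a rough illustration of what a collective primitive computes, the sketch below simulates a sum all-reduce in plain Python: every participant ends up holding the element-wise sum of all participants' buffers. It is not the library's API or implementation. The function name `ring_all_reduce`, the ring-shaped exchange schedule, and the use of Python lists in place of GPU buffers are all assumptions made for this example.

```python
def ring_all_reduce(buffers):
    """Simulate a sum all-reduce over equally sized per-rank buffers.

    Hypothetical helper for illustration only: ranks are list indices and
    "sending" is a copy between Python lists, not a GPU transfer.
    """
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all ranks must contribute buffers of equal length")

    data = [list(b) for b in buffers]  # work on copies, leave inputs intact
    # Split each buffer into n contiguous chunks (some may be empty).
    bounds = [(i * length) // n for i in range(n + 1)]

    def chunk(i):
        i %= n
        return range(bounds[i], bounds[i + 1])

    # Phase 1 (reduce-scatter): after n - 1 steps, rank r holds the
    # complete sum for chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot outgoing data first so all sends happen "simultaneously".
        sends = [((r - step) % n, [data[r][i] for i in chunk(r - step)])
                 for r in range(n)]
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] += value

    # Phase 2 (all-gather): circulate the completed chunks around the ring.
    for step in range(n - 1):
        sends = [((r + 1 - step) % n, [data[r][i] for i in chunk(r + 1 - step)])
                 for r in range(n)]
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] = value

    return data


result = ring_all_reduce([[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]])
print(result[0])
```

Each simulated rank passes one chunk to its neighbor per step, so no single rank ever has to receive every other rank's full buffer at once.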

GitHub

3k stars
154 watching
836 forks
Language: C++
Last commit: 4 months ago
Linked from 2 awesome lists


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| sergio0694/computesharp | Enables C# code to run on the GPU through DirectX and dynamically generated shaders | 2,799 |
| keylase/nvidia-patch | Removes NVIDIA's restriction on simultaneous NVENC video encoding sessions | 3,606 |
| uncomplicate/clojurecl | A Clojure library that enables parallel computations on GPU using OpenCL | 278 |
| nvlabs/instant-ngp | A software toolkit for training and rendering neural graphics primitives | 16,115 |
| nvidia-ai-iot/cupcl | A set of libraries and sample code for 3D point cloud processing using CUDA | 584 |
| vczh-libraries/gacui | A comprehensive C++ library for building GPU-accelerated user interfaces with WYSIWYG editing tools and XML support | 2,354 |
| nvidia/apex | Tools for streamlined mixed precision and distributed training in PyTorch | 8,460 |
| nvidia/matx | A C++17 GPU-accelerated numerical computing library with Python-like syntax | 1,229 |
| sony/nnabla | A deep learning framework that provides a flexible and expressive Python API for building and training neural networks on various platforms | 2,729 |
| nvidia/multi-gpu-programming-models | A collection of examples demonstrating various approaches to programming multiple GPUs in parallel | 575 |
| rapidsai/cuml | A suite of libraries implementing machine learning algorithms and mathematical primitives on NVIDIA GPUs | 4,292 |
| nrwl/nx | A build system designed to optimize monorepos and integrate well with various frameworks and tools for fast CI | 23,951 |
| zeux/meshoptimizer | A C++ library that optimizes 3D meshes for faster rendering on GPUs | 5,795 |
| nvlabs/tiny-cuda-nn | A C++/CUDA framework for training and querying neural networks using GPUs | 3,791 |
| baidu-research/warp-ctc | An implementation of a loss function used in sequence data analysis and machine learning | 4,070 |