cuml
GPU ML library
A suite of libraries implementing machine learning algorithms and mathematical primitives on NVIDIA GPUs
cuML - RAPIDS Machine Learning Library
4k stars
77 watching
535 forks
Language: C++
Last commit: 5 days ago
Linked from 3 awesome lists
cuda, gpu, machine-learning, machine-learning-algorithms, nvidia
Related projects:
Repository | Description | Stars |
---|---|---|
rapidsai/cudf | A GPU-accelerated data manipulation library built on top of C++/CUDA and Apache Arrow. | 8,492 |
xtra-computing/thundergbm | A library for fast training of gradient boosted decision trees (GBDTs) and random forests on GPUs | 695 |
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 7,986 |
nvlabs/tiny-cuda-nn | A C++/CUDA framework for training and querying neural networks using GPUs | 3,782 |
plasma-umass/scalene | A high-performance Python profiler that analyzes CPU, GPU, and memory usage, providing detailed information and AI-powered optimization suggestions. | 12,237 |
postgresml/postgresml | An open-source Postgres extension for machine learning and AI operations directly within the database. | 6,050 |
paddlepaddle/paddle | A high-performance deep learning framework designed for industrial-scale training and deployment of neural networks. | 22,297 |
sony/nnabla | A deep learning framework that provides a flexible and expressive Python API for building and training neural networks on various platforms. | 2,729 |
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms | 3,938 |
iterative/cml | Automates machine learning workflows and generates reports on every pull request. | 4,046 |
nvlabs/instant-ngp | A software toolkit for training and rendering neural graphics primitives | 16,074 |
fminference/flexllmgen | A high-throughput generation engine for running large language models on a single GPU | 9,223 |
ddbourgin/numpy-ml | A collection of machine learning algorithms implemented in NumPy for rapid experimentation and prototyping. | 15,609 |
baidu-research/warp-ctc | A fast parallel implementation of the Connectionist Temporal Classification (CTC) loss for sequence learning, running on CPU and GPU | 4,069 |
cupy/cupy | A Python library for running NumPy/SciPy code on NVIDIA CUDA or AMD ROCm platforms using GPU acceleration. | 9,542 |