tvm
Deep learning compiler
Compiler stack for deep learning systems to improve performance and efficiency on CPU, GPU, and specialized hardware.
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
12k stars
376 watching
3k forks
Language: Python
Last commit: 6 days ago
Topics: compiler, deep-learning, gpu, javascript, machine-learning, metal, opencl, performance, rocm, spirv, tensor, tvm, vulkan
Related projects:
Repository | Description | Stars |
---|---|---|
apache/tvm-vta | A comprehensive hardware design stack for accelerating deep learning models | 254 |
deeplearning4j/deeplearning4j | A suite of tools for building and training deep learning models using the JVM. | 13,682 |
uber/petastorm | Enables training and evaluation of deep learning models from Apache Parquet datasets in various machine learning frameworks | 1,799 |
llvm/llvm-project | A modular toolkit for building highly optimized compilers and run-time environments. | 29,107 |
ahkarami/deep-learning-in-production | A collection of notes and references on deploying deep learning models in production environments | 4,306 |
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,409 |
scisharp/llamasharp | A C#/.NET library to efficiently run Large Language Models (LLMs) on local devices | 2,673 |
jolibrain/deepdetect | A machine learning API and server written in C++ that supports multiple deep learning libraries and provides a flexible interface for building and deploying machine learning models. | 2,519 |
microsoft/deepspeed | A deep learning optimization library that makes distributed training and inference easy, efficient, and effective. | 35,463 |
dmmiller612/sparktorch | A PyTorch implementation on Apache Spark for distributed deep learning model training and inference. | 339 |
deepjavalibrary/djl | A high-level Java framework for building and deploying deep learning models | 4,144 |
salesforce/transmogrifai | An AutoML library that automates machine learning model development on Apache Spark with minimal hand-tuning | 2,244 |
acceleratehs/accelerate-llvm | Compiles Accelerate code to LLVM IR and executes it on CPUs or NVIDIA GPUs | 158 |
ggerganov/llama.cpp | Enables efficient inference of large language models using optimized C/C++ implementations and various backend frameworks | 67,866 |
lightning-ai/lit-llama | An implementation of the LLaMA language model based on nanoGPT | 5,993 |