tvm

Deep learning compiler

Compiler stack for deep learning systems to improve performance and efficiency on CPU, GPU, and specialized hardware.

Open deep learning compiler stack for CPU, GPU, and specialized accelerators

GitHub

12k stars
376 watching
3k forks
Language: Python
Last commit: 6 days ago
Topics: compiler, deep-learning, gpu, javascript, machine-learning, metal, opencl, performance, rocm, spirv, tensor, tvm, vulkan

Related projects:

Repository | Description | Stars
apache/tvm-vta | A comprehensive hardware design stack for accelerating deep learning models | 254
deeplearning4j/deeplearning4j | A suite of tools for building and training deep learning models using the JVM | 13,682
uber/petastorm | Enables training and evaluation of deep learning models from Apache Parquet datasets in various machine learning frameworks | 1,799
llvm/llvm-project | A modular toolkit for building highly optimized compilers and run-time environments | 29,107
ahkarami/deep-learning-in-production | A collection of notes and references on deploying deep learning models in production environments | 4,306
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,409
scisharp/llamasharp | A C#/.NET library to efficiently run Large Language Models (LLMs) on local devices | 2,673
jolibrain/deepdetect | A machine learning API and server written in C++ that supports multiple deep learning libraries and provides a flexible interface for building and deploying machine learning models | 2,519
microsoft/deepspeed | A deep learning optimization library that makes distributed training and inference easy, efficient, and effective | 35,463
dmmiller612/sparktorch | A PyTorch implementation on Apache Spark for distributed deep learning model training and inference | 339
deepjavalibrary/djl | A high-level Java framework for building and deploying deep learning models | 4,144
salesforce/transmogrifai | An AutoML library that automates machine learning model development on Apache Spark with minimal hand-tuning | 2,244
acceleratehs/accelerate-llvm | Compiles Accelerate code to LLVM IR and executes it on CPUs or NVIDIA GPUs | 158
ggerganov/llama.cpp | Enables efficient inference of large language models using optimized C/C++ implementations and various backend frameworks | 67,866
lightning-ai/lit-llama | An implementation of a large language model using the nanoGPT architecture | 5,993