tvm

Deep learning compiler

Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators, designed to improve the performance and efficiency of deep learning workloads.

GitHub

12k stars
376 watching
3k forks
Language: Python
Last commit: about 2 months ago
Topics: compiler, deep-learning, gpu, javascript, machine-learning, metal, opencl, performance, rocm, spirv, tensor, tvm, vulkan

Related projects:

| Repository | Description | Stars |
|---|---|---|
| apache/tvm-vta | A comprehensive hardware design stack for accelerating deep learning models | 258 |
| deeplearning4j/deeplearning4j | A suite of tools for building and training deep learning models using the JVM | 13,718 |
| uber/petastorm | Enables training and evaluation of deep learning models from Apache Parquet datasets in various machine learning frameworks | 1,805 |
| llvm/llvm-project | A modular toolkit for building highly optimized compilers and run-time environments | 29,633 |
| ahkarami/deep-learning-in-production | A collection of notes and references on deploying deep learning models in production environments | 4,313 |
| young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,428 |
| scisharp/llamasharp | An efficient C#/.NET library for running Large Language Models (LLMs) on local devices | 2,750 |
| jolibrain/deepdetect | A machine learning API and server written in C++ that supports multiple deep learning libraries and provides a flexible interface for building and deploying models | 2,520 |
| microsoft/deepspeed | A deep learning optimization library that simplifies distributed training and inference on modern computing hardware | 35,863 |
| dmmiller612/sparktorch | A PyTorch implementation on Apache Spark for distributed deep learning model training and inference | 339 |
| deepjavalibrary/djl | A high-level Java framework for building and deploying deep learning models | 4,204 |
| salesforce/transmogrifai | An AutoML library that automates machine learning model development on Apache Spark with minimal hand-tuning | 2,248 |
| acceleratehs/accelerate-llvm | Compiles Accelerate code to LLVM IR and executes it on CPUs or NVIDIA GPUs | 159 |
| ggerganov/llama.cpp | Enables LLM inference with minimal setup and high performance on various hardware platforms | 69,185 |
| lightning-ai/lit-llama | An implementation of a large language model using the nanoGPT architecture | 6,013 |