tvm-vta
Deep Learning Accelerator
A comprehensive hardware design stack for accelerating deep learning models
Open, Modular, Deep Learning Accelerator
254 stars
40 watching
72 forks
Language: Scala
last commit: 8 months ago
Linked from 1 awesome list
hardware, machine-learning, tensor, tvm, vta
Related projects:
Repository | Description | Stars |
---|---|---|
nvdla/hw | Hardware designs and tools for building deep learning inference accelerators | 1,744 |
vlang/vtl | A V library providing an n-dimensional tensor data structure and linear algebra routines | 148 |
doonny/pipecnn | A tool for accelerating convolutional neural networks on Field-Programmable Gate Arrays (FPGAs) using OpenCL-based hardware design | 1,253 |
homles11/igcv3 | An implementation of an efficient deep neural network architecture | 189 |
eaplatanios/tensorflow_scala | A Scala API for TensorFlow's deep learning functionality | 939 |
vict0rsch/deep_learning | A collection of tutorials and resources on implementing deep learning models using Python libraries such as Keras and Lasagne | 426 |
jnhwkim/nips-mrn-vqa | A neural network model that answers visual questions by combining question and image features in a residual learning framework | 39 |
acceleratehs/accelerate-llvm | Compiles Accelerate code to LLVM IR and executes it on CPUs or NVIDIA GPUs | 158 |
google/cfu-playground | A framework for designing and optimizing machine learning accelerators on FPGAs | 472 |
coreylowman/dfdx | A deep learning library for Rust with GPU acceleration and an ergonomic API | 1,737 |
vlgiitr/dmn-plus | A PyTorch implementation of an improved question answering architecture with dynamic memory networks and attention mechanisms | 64 |
vlfeat/autonn | An API wrapper around MatConvNet that adds automatic differentiation for easy deep learning prototyping and research | 89 |
uber/petastorm | Enables training and evaluation of deep learning models from Apache Parquet datasets in various machine learning frameworks | 1,799 |
intel/intel-extension-for-tensorflow | Enables heterogeneous high-performance computing on Intel CPUs and GPUs for deep learning workloads | 317 |
mit-han-lab/proxylessnas | Direct neural architecture search on target task and hardware for efficient model deployment | 1,425 |