tvm-vta

Deep Learning Accelerator

A comprehensive hardware design stack for accelerating deep learning models

Open, Modular, Deep Learning Accelerator


254 stars
40 watching
72 forks
Language: Scala
Last commit: 8 months ago
Linked from 1 awesome list

hardware, machine-learning, tensor, tvm, vta

Related projects:

Repository | Description | Stars
nvdla/hw | Hardware designs and tools for building deep learning inference accelerators (the NVDLA project) | 1,744
vlang/vtl | A C library providing an n-dimensional tensor data structure and linear algebra routines | 148
doonny/pipecnn | A tool for accelerating convolutional neural networks on field-programmable gate arrays (FPGAs) using OpenCL-based hardware design | 1,253
homles11/igcv3 | An implementation of an efficient deep neural network architecture | 189
eaplatanios/tensorflow_scala | A Scala API for TensorFlow's deep learning functionality | 939
vict0rsch/deep_learning | A collection of tutorials and resources on implementing deep learning models using Python libraries such as Keras and Lasagne | 426
jnhwkim/nips-mrn-vqa | A neural network model that answers visual questions by combining question and image features in a residual learning framework | 39
acceleratehs/accelerate-llvm | Compiles Accelerate code to LLVM IR and executes it on CPUs or NVIDIA GPUs | 158
google/cfu-playground | A framework for designing and optimizing machine learning accelerators on FPGAs | 472
coreylowman/dfdx | A deep learning library for Rust with GPU acceleration and an ergonomic API | 1,737
vlgiitr/dmn-plus | A PyTorch implementation of an improved question answering architecture with dynamic memory networks and attention mechanisms | 64
vlfeat/autonn | An API wrapper around MatConvNet that adds automatic differentiation for easy deep learning prototyping and research | 89
uber/petastorm | Enables training and evaluation of deep learning models from Apache Parquet datasets in various machine learning frameworks | 1,799
intel/intel-extension-for-tensorflow | Enables heterogeneous high-performance computing on Intel CPUs and GPUs for deep learning workloads | 317
mit-han-lab/proxylessnas | Direct neural architecture search on target task and hardware for efficient model deployment | 1,425