ViT-pytorch

Vision Transformer

A PyTorch implementation of the Vision Transformer model for image recognition tasks.

PyTorch reimplementation of the Vision Transformer ("An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale").

Stars: 2k

Watching: 13

Forks: 376

Language: Jupyter Notebook

Last commit: about 3 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

weiaicunzai/awesome-image-classification

Related projects:

Repository	Description	Stars
kaiyangzhou/dassl.pytorch	A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision.	1,236
lucidrains/reformer-pytorch	An implementation of Reformer, an efficient Transformer model for natural language processing tasks.	2,132
yitu-opensource/t2t-vit	A deep learning framework for training vision transformers from scratch on image data.	1,162
google-research/nested-transformer	An implementation of a transformer-based vision model that aggregates local transformers on image blocks to improve accuracy and efficiency.	195
felixgwu/img_classification_pk_pytorch	A PyTorch project for comparing image classification models and facilitating quick experiment setup.	366
whai362/pvt	An implementation of Pyramid Vision Transformers for image classification, object detection, and semantic segmentation tasks.	1,745
pixart-alpha/pixart-sigma	A PyTorch model for 4K text-to-image generation using a diffusion transformer.	1,711
leviswind/pytorch-transformer	An implementation of a transformer-based translation model in PyTorch.	240
t-vi/pytorch-tvmisc	A collection of miscellaneous PyTorch implementations covering various machine learning concepts and techniques.	468
jhjacobsen/pytorch-i-revnet	Deep invertible neural network implementation using PyTorch for image recognition and reconstruction tasks.	390
kunpengli1994/vsrn	An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching.	294
nickjiang2378/vl-interp	An official PyTorch implementation of a method for interpreting and editing vision-language representations to mitigate hallucinations in image captions.	46
potterhsu/svhnclassifier-pytorch	A PyTorch implementation of multi-digit number recognition from street view imagery using deep convolutional neural networks.	200
mattmacy/vnet.pytorch	A PyTorch implementation of V-Net for volumetric medical image segmentation.	703
mchong6/soat	A PyTorch implementation of an image manipulation technique using a pretrained StyleGAN model.	380