ViT-pytorch
Vision Transformer
A PyTorch implementation of the Vision Transformer model for image recognition tasks.
PyTorch reimplementation of the Vision Transformer from the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale".
2k stars
13 watching
371 forks
Language: Jupyter Notebook
Last commit: over 2 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,217 |
lucidrains/reformer-pytorch | An implementation of Reformer, an efficient Transformer model for natural language processing tasks. | 2,120 |
yitu-opensource/t2t-vit | A deep learning framework for training vision transformers from scratch on image data. | 1,148 |
google-research/nested-transformer | An implementation of a transformer-based vision model that aggregates local transformers on image blocks to improve accuracy and efficiency. | 193 |
felixgwu/img_classification_pk_pytorch | A PyTorch project for comparing image classification models and facilitating quick experiment setup | 365 |
whai362/pvt | An implementation of Pyramid Vision Transformers for image classification, object detection, and semantic segmentation tasks | 1,728 |
pixart-alpha/pixart-sigma | Develops a PyTorch model for 4K text-to-image generation using a diffusion transformer | 1,675 |
leviswind/pytorch-transformer | Implementation of a transformer-based translation model in PyTorch | 239 |
t-vi/pytorch-tvmisc | A collection of utilities and tools for building and improving deep learning models in PyTorch | 468 |
jhjacobsen/pytorch-i-revnet | Deep invertible neural network implementation using PyTorch for image recognition and reconstruction tasks. | 389 |
kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching | 294 |
nickjiang2378/vl-interp | This project provides an official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 31 |
potterhsu/svhnclassifier-pytorch | A PyTorch implementation of multi-digit number recognition from street view imagery using deep convolutional neural networks | 200 |
mattmacy/vnet.pytorch | A PyTorch implementation of V-Net for volumetric medical image segmentation | 694 |
mchong6/soat | This repository provides a PyTorch implementation of an image manipulation technique using a pretrained StyleGAN model. | 380 |