dvit_repo

Vision transformer improvement

An implementation of Deep Vision Transformer models, modified to improve performance by preventing attention collapse.

137 stars

5 watching

23 forks

Language: Python

Last commit: over 3 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

weiaicunzai/awesome-image-classification

Related projects:

Repository	Description	Stars
yitu-opensource/t2t-vit	A deep learning framework for training vision transformers from scratch on image data.	1,162
zhendongwang6/uformer	An implementation of a deep learning model for restoring images in various conditions.	817
jeonsworld/vit-pytorch	A PyTorch implementation of the Vision Transformer model for image recognition tasks.	1,959
whai362/pvt	An implementation of Pyramid Vision Transformers for image classification, object detection, and semantic segmentation tasks.	1,745
google-research/nested-transformer	An implementation of a transformer-based vision model that aggregates local transformers on image blocks to improve accuracy and efficiency.	195
gordonhu608/mqt-llava	A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens.	101
zsdonghao/spatial-transformer-nets	An implementation of Spatial Transformer Networks in TensorFlow for learning to apply transformations to images via classification tasks.	36
atiyo/deep_image_prior	Reconstructs, manipulates, and transforms existing images using untrained neural networks.	216
megvii-research/tlc	Improves image restoration performance by converting global operations to local ones during inference.	231
huawei-noah/pretrained-ipt	A pre-trained transformer model for image processing tasks such as denoising, super-resolution, and deraining.	451
yiren-jian/blitext	Develops and trains models for vision-language learning with decoupled language pre-training.	24
yiyangzhou/lure	Analyzing and mitigating object hallucination in large vision-language models to improve their accuracy and reliability.	136
fastnlp/cpt	A pre-trained transformer model for natural language understanding and generation tasks in Chinese.	482
dong-huo/vdip-deconvolution	A method for blind image deconvolution using variational deep image prior.	13
dirtyharrylyl/transformer-in-vision	A collection of resources and papers related to Transformer-based computer vision models and techniques.	1,324