Transformer-in-Vision
CV Transformers
A collection of resources and papers related to Transformer-based computer vision models and techniques.
Recent Transformer-based CV and related works.
1k stars
87 watching
143 forks
last commit: about 2 years ago
Linked from 1 awesome list
Topics: computer-vision, deep-learning, multi-modal, paper, self-attention, transformer, vision-transformers, visual-language
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Compiles and shares 3D computer vision papers using transformer models | 409 |
| | An implementation of a transformer-based vision model that aggregates local transformers on image blocks to improve accuracy and efficiency. | 195 |
| | An implementation of a new neural network architecture that combines the strengths of convolutional and transformer designs to improve performance on image classification tasks. | 559 |
| | A PyTorch-based framework for building and training deep learning models in computer vision. | 47 |
| | A PyTorch implementation of the Vision Transformer model for image recognition tasks. | 1,959 |
| | This project focuses on manipulating 3D views using deep learning techniques. | 6 |
| | An implementation of transformer models in PyTorch for natural language processing tasks | 1,257 |
| | An implementation of a deep learning model for grounding situation recognition in images | 45 |
| | An implementation of deep neural network architectures, including Transformers, in Python. | 214 |
| | An implementation of the Video Swin Transformer architecture for video recognition tasks | 1,463 |
| | Reconstructs images using untrained neural networks to manipulate and transform existing images | 216 |
| | Proposes an efficient neural architecture model for high-resolution image restoration tasks | 1,845 |
| | Rust wrapper around OpenCV 3.x | 204 |
| | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,236 |
| | An implementation of Pyramid Vision Transformers for image classification, object detection, and semantic segmentation tasks | 1,745 |