Focal-Transformer

Attention-based transformer

A vision transformer architecture that uses focal self-attention to capture local-global interactions in images

[NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

GitHub

545 stars
16 watching
60 forks
Language: Python
Last commit: over 2 years ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
google-research/nested-transformer | An implementation of a transformer-based vision model that aggregates local transformers on image blocks to improve accuracy and efficiency. | 193
leviswind/pytorch-transformer | Implementation of a transformer-based translation model in PyTorch | 239
openai/sparse_attention | Provides primitives for sparse attention mechanisms used in transformer models to improve computational efficiency and scalability | 1,524
microsoft/vision-longformer | An implementation of a vision transformer architecture designed for high-resolution image encoding with multiple efficient attention mechanisms | 241
gamrix/cs231n_proj | This project focuses on manipulating 3D views using deep learning techniques. | 6
ngxbac/gain | A PyTorch implementation of an attention-guided inference network to focus on specific areas of objects in images | 48
chrislemke/sk-transformers | Provides a collection of reusable data transformation tools | 8
microsoft/megatron-deepspeed | Research tool for training large transformer language models at scale | 1,895
feature-engine/feature_engine | A Python library with multiple transformers to engineer and select features for use in machine learning models. | 1,926
gabeur/mmt | Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text | 258
canjie-luo/moran_v2 | A deep learning framework for scene text recognition with rectification and attention mechanisms. | 636
pp00704831/banet-tip-2022 | A PyTorch implementation of an attention network for dynamic scene deblurring | 37
focalnet/networks-beyond-attention | A collection of modern neural network architectures for computer vision tasks that don't use self-attention mechanisms. | 77
pixart-alpha/pixart-sigma | Develops a PyTorch model for 4K text-to-image generation using a diffusion transformer | 1,675
jeonsworld/vit-pytorch | A PyTorch implementation of the Vision Transformer model for image recognition tasks. | 1,940