Focal-Transformer

Attention-based transformer

A vision transformer architecture that uses focal self-attention to capture local-global interactions in images.

[NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

GitHub

547 stars

16 watching

60 forks

Language: Python

Last commit: over 3 years ago

Linked from 1 awesome list

Backlinks from this awesome list:

weiaicunzai/awesome-image-classification

Related projects:

Repository	Description	Stars
google-research/nested-transformer	An implementation of a transformer-based vision model that aggregates local transformers on image blocks to improve accuracy and efficiency.	195
leviswind/pytorch-transformer	Implementation of a transformer-based translation model in PyTorch	240
openai/sparse_attention	Provides primitives for sparse attention mechanisms used in transformer models to improve computational efficiency and scalability	1,533
microsoft/vision-longformer	An implementation of a vision transformer architecture designed for high-resolution image encoding with multiple efficient attention mechanisms	243
gamrix/cs231n_proj	This project focuses on manipulating 3D views using deep learning techniques.	6
ngxbac/gain	A PyTorch implementation of an attention-guided inference network to focus on specific areas of objects in images	48
chrislemke/sk-transformers	Provides a collection of reusable data transformation tools	10
microsoft/megatron-deepspeed	Research tool for training large transformer language models at scale	1,926
feature-engine/feature_engine	A Python library with multiple transformers to engineer and select features for use in machine learning models.	1,956
gabeur/mmt	Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text	259
canjie-luo/moran_v2	A deep learning framework for scene text recognition with rectification and attention mechanisms.	639
pp00704831/banet-tip-2022	A PyTorch implementation of an attention network for dynamic scene deblurring	37
focalnet/networks-beyond-attention	A collection of modern neural network architectures for computer vision tasks that don't use self-attention mechanisms.	77
pixart-alpha/pixart-sigma	Develops a PyTorch model for 4K text-to-image generation using a diffusion transformer	1,711
jeonsworld/vit-pytorch	A PyTorch implementation of the Vision Transformer model for image recognition tasks.	1,959