awesome-transformer-search

A curated list of awesome resources and papers on combining Transformers with Neural Architecture Search (NAS) to improve deep learning model architectures.

Topics: neural-architecture-search, transformer

General Transformer Search

Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Training-free Transformer Architecture Search
LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models
Searching the Search Space of Vision Transformer
UniNet: Unified Architecture Search with Convolutions, Transformer and MLP
Analyzing and Mitigating Interference in Neural Architecture Search
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
Memory-Efficient Differentiable Transformer Architecture Search
Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition
AutoTrans: Automating Transformer Design via Reinforced Architecture Search
NASABN: A Neural Architecture Search Framework for Attention-Based Networks
NAT: Neural Architecture Transformer for Accurate and Compact Architectures
The Evolved Transformer

Domain Specific Transformer Search / Vision

αNAS: Neural Architecture Search using Property Guided Synthesis
NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training
AutoFormer: Searching Transformers for Visual Recognition
GLiT: Neural Architecture Search for Global and Local Image Transformer
Searching for Efficient Multi-Stage Vision Transformers
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers

Domain Specific Transformer Search / Natural Language Processing

AutoBERT-Zero: Evolving the BERT backbone from scratch
Primer: Searching for Efficient Transformers for Language Modeling
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

Domain Specific Transformer Search / Automatic Speech Recognition

SFA: Searching faster architectures for end-to-end automatic speech recognition models
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Efficient Gradient-Based Neural Architecture Search For End-to-End ASR
Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition

Transformers Knowledge: Insights, Searchable Parameters, Attention

RWKV: Reinventing RNNs for the Transformer Era
Patches Are All You Need?
Separable Self-attention for Mobile Vision Transformers
Parameter-efficient Fine-tuning for Vision Transformers
EfficientFormer: Vision Transformers at MobileNet Speed
Neighborhood Attention Transformer
Training Compute-Optimal Large Language Models
CMT: Convolutional Neural Networks meet Vision Transformers
Patch Slimming for Efficient Vision Transformers
Lite Vision Transformer with Enhanced Self-Attention
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Beyond Fixation: Dynamic Window Visual Transformer
BEiT: BERT Pre-Training of Image Transformers
How Do Vision Transformers Work?
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Tuformer: Data-Driven Design of Expressive Transformer by Tucker Tensor Representation
DictFormer: Tiny Transformer with Shared Dictionary
QuadTree Attention for Vision Transformers
Expediting Vision Transformers via Token Reorganization
UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning
Hierarchical Transformers Are More Efficient Language Models
Transformer in Transformer
Long-Short Transformer: Efficient Transformers for Language and Vision
Memory-efficient Transformers via Top-k Attention
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Rethinking Spatial Dimensions of Vision Transformers
What Makes for Hierarchical Vision Transformer?
AutoAttend: Automated Attention Representation Search
Rethinking Attention with Performers
LambdaNetworks: Modeling long-range Interactions without Attention
HyperGrid Transformers
LocalViT: Bringing Locality to Vision Transformers
Compressive Transformers for Long-Range Sequence Modelling
Improving Transformer Models by Reordering their Sublayers
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned

Transformer Surveys

Transformers in Vision: A Survey
A Survey of Vision Transformers
Efficient Transformers: A Survey
Neural Architecture Search for Transformers: A Survey

Foundation Models

Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models

Misc Resources

Awesome Visual Transformer
Vision Transformer & Attention Awesome List
