awesome-computer-vision-models

Computer Vision Models

A curated list of popular computer vision models with their performance metrics

A list of popular deep learning models related to classification, segmentation and detection problems

GitHub

515 stars
26 watching
94 forks
last commit: over 3 years ago
Linked from 4 awesome lists

awesome, awesome-list, awesome-lists, computer-vision, computer-vision-algorithms, deep-learning, densenet, diracnet, efficientnet, image-classification, machine-learning, machine-learning-algorithms, machine-learning-models, mixnet, nasnet, object-detection, proxylessnas, resnet, semantic-segmentation, sknet

Awesome Computer Vision Models / Classification models

'One weird trick for parallelizing convolutional neural networks' AlexNet ( )
'Very Deep Convolutional Networks for Large-Scale Image Recognition' VGG-16 ( )
'Deep Residual Learning for Image Recognition' ResNet-10 ( )
'Deep Residual Learning for Image Recognition' ResNet-18 ( )
'Deep Residual Learning for Image Recognition' ResNet-34 ( )
'Deep Residual Learning for Image Recognition' ResNet-50 ( )
'Rethinking the Inception Architecture for Computer Vision' InceptionV3 ( )
'Identity Mappings in Deep Residual Networks' PreResNet-18 ( )
'Identity Mappings in Deep Residual Networks' PreResNet-34 ( )
'Identity Mappings in Deep Residual Networks' PreResNet-50 ( )
'Densely Connected Convolutional Networks' DenseNet-121 ( )
'Densely Connected Convolutional Networks' DenseNet-161 ( )
'Deep Pyramidal Residual Networks' PyramidNet-101 ( )
'Aggregated Residual Transformations for Deep Neural Networks' ResNeXt-14(32x4d) ( )
'Aggregated Residual Transformations for Deep Neural Networks' ResNeXt-26(32x4d) ( )
'Wide Residual Networks' WRN-50-2 ( )
'Xception: Deep Learning with Depthwise Separable Convolutions' Xception ( )
'Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning' InceptionV4 ( )
'Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning' InceptionResNetV2 ( )
'PolyNet: A Pursuit of Structural Diversity in Very Deep Networks' PolyNet ( )
'Darknet: Open source neural networks in C' DarkNet Ref ( )
'Darknet: Open source neural networks in C' DarkNet Tiny ( )
'Darknet: Open source neural networks in C' DarkNet 53 ( )
'SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size' SqueezeResNet1.1 ( )
'SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size' SqueezeNet1.1 ( )
'Residual Attention Network for Image Classification' ResAttNet-92 ( )
'CondenseNet: An Efficient DenseNet using Learned Group Convolutions' CondenseNet (G=C=8) ( )
'Dual Path Networks' DPN-68 ( )
'ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices' ShuffleNet x1.0 (g=1) ( )
'DiracNets: Training Very Deep Neural Networks Without Skip-Connections' DiracNetV2-18 ( )
'DiracNets: Training Very Deep Neural Networks Without Skip-Connections' DiracNetV2-34 ( )
'Squeeze-and-Excitation Networks' SENet-16 ( )
'Squeeze-and-Excitation Networks' SENet-154 ( )
'MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications' MobileNet ( )
'Learning Transferable Architectures for Scalable Image Recognition' NASNet-A 4@1056 ( )
'Learning Transferable Architectures for Scalable Image Recognition' NASNet-A 6@4032 ( )
'Deep Layer Aggregation' DLA-34 ( )
'Attention Inspiring Receptive-Fields Network for Learning Invariant Representations' AirNet50-1x64d (r=2) ( )
'BAM: Bottleneck Attention Module' BAM-ResNet-50 ( )
'CBAM: Convolutional Block Attention Module' CBAM-ResNet-50 ( )
'SqueezeNext: Hardware-Aware Neural Network Design' 1.0-SqNxt-23v5 ( )
'SqueezeNext: Hardware-Aware Neural Network Design' 1.5-SqNxt-23v5 ( )
'SqueezeNext: Hardware-Aware Neural Network Design' 2.0-SqNxt-23v5 ( )
'ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design' ShuffleNetV2 ( )
'Merging and Evolution: Improving Convolutional Neural Networks for Mobile Applications' 456-MENet-24×1(g=3) ( )
'FD-MobileNet: Improved MobileNet with A Fast Downsampling Strategy' FD-MobileNet ( )
'MobileNetV2: Inverted Residuals and Linear Bottlenecks' MobileNetV2 ( )
'IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks' IGCV3 ( )
'DARTS: Differentiable Architecture Search' DARTS ( )
'Progressive Neural Architecture Search' PNASNet-5 ( )
'Regularized Evolution for Image Classifier Architecture Search' AmoebaNet-C ( )
'MnasNet: Platform-Aware Neural Architecture Search for Mobile' MnasNet ( )
'Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net' IBN-Net50-a ( )
'Large Margin Deep Networks for Classification' MarginNet ( )
'A^2-Nets: Double Attention Networks' A^2 Net ( )
'FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction' FishNeXt-150 ( )
'ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness' Shape-ResNet ( )
'Greedy Layerwise Learning Can Scale to ImageNet' SimCNN(k=3 train) ( )
'Selective Kernel Networks' SKNet-50 ( )
'SRM: A Style-based Recalibration Module for Convolutional Neural Networks' SRM-ResNet-50 ( )
'EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks' EfficientNet-B0 ( )
'EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks' EfficientNet-B7b ( )
'ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware' ProxylessNAS ( )
'MixNet: Mixed Depthwise Convolutional Kernels' MixNet-L ( )
'ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks' ECA-Net50 ( )
'ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks' ECA-Net101 ( )
'ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks' ACNet-Densenet121 ( )
'LIP: Local Importance-based Pooling' LIP-ResNet-50 ( )
'LIP: Local Importance-based Pooling' LIP-ResNet-101 ( )
'LIP: Local Importance-based Pooling' LIP-DenseNet-BC-121 ( )
'MuffNet: Multi-Layer Feature Federation for Mobile Deep Learning' MuffNet_1.0 ( )
'MuffNet: Multi-Layer Feature Federation for Mobile Deep Learning' MuffNet_1.5 ( )
'Making Convolutional Networks Shift-Invariant Again' ResNet-34-Bin-5 ( )
'Making Convolutional Networks Shift-Invariant Again' ResNet-50-Bin-5 ( )
'Making Convolutional Networks Shift-Invariant Again' MobileNetV2-Bin-5 ( )
'Fixing the train-test resolution discrepancy' FixRes ResNeXt101 WSL ( )
'Self-training with Noisy Student improves ImageNet classification' Noisy Student*(L2) ( )
'TResNet: High Performance GPU-Dedicated Architecture' TResNet-M ( )
'DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search' DA-NAS-C ( )
'ResNeSt: Split-Attention Networks' ResNeSt-50 ( )
'ResNeSt: Split-Attention Networks' ResNeSt-101 ( )
'Funnel Activation for Visual Recognition' ResNet-50-FReLU ( )
'Funnel Activation for Visual Recognition' ResNet-101-FReLU ( )
'MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks' ResNet-50-MEALv2 ( )
'MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks' ResNet-50-MEALv2 + CutMix ( )
'MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks' MobileNet V3-Large-MEALv2 ( )
'MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks' EfficientNet-B0-MEALv2 ( )
'Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet' T2T-ViT-7 ( )
'Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet' T2T-ViT-14 ( )
'Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet' T2T-ViT-19 ( )
'High-Performance Large-Scale Image Recognition Without Normalization' NFNet-F0 ( )
'High-Performance Large-Scale Image Recognition Without Normalization' NFNet-F1 ( )
'High-Performance Large-Scale Image Recognition Without Normalization' NFNet-F6+SAM ( )
'EfficientNetV2: Smaller Models and Faster Training' EfficientNetV2-S ( )
'EfficientNetV2: Smaller Models and Faster Training' EfficientNetV2-M ( )
'EfficientNetV2: Smaller Models and Faster Training' EfficientNetV2-L ( )
'EfficientNetV2: Smaller Models and Faster Training' EfficientNetV2-S (21k) ( )
'EfficientNetV2: Smaller Models and Faster Training' EfficientNetV2-M (21k) ( )
'EfficientNetV2: Smaller Models and Faster Training' EfficientNetV2-L (21k) ( )

Awesome Computer Vision Models / Segmentation models

'U-Net: Convolutional Networks for Biomedical Image Segmentation' U-Net ( )
'Learning Deconvolution Network for Semantic Segmentation' DeconvNet ( )
'ParseNet: Looking Wider to See Better' ParseNet ( )
'Efficient piecewise training of deep structured models for semantic segmentation' Piecewise ( )
'SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation' SegNet ( )
'Fully Convolutional Networks for Semantic Segmentation' FCN ( )
'ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation' ENet ( )
'Multi-Scale Context Aggregation by Dilated Convolutions' DilatedNet ( )
'PixelNet: Towards a General Pixel-Level Architecture' PixelNet ( )
'RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation' RefineNet ( )
'Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation' LRR ( )
'Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes' FRRN ( )
'MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving' MultiNet ( )
'DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs' DeepLab ( )
'LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation' LinkNet ( )
'The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation' Tiramisu ( )
'ICNet for Real-Time Semantic Segmentation on High-Resolution Images' ICNet ( )
'Efficient ConvNet for Real-time Semantic Segmentation' ERFNet ( )
'Pyramid Scene Parsing Network' PSPNet ( )
'Large Kernel Matters — Improve Semantic Segmentation by Global Convolutional Network' GCN ( )
'Segmentation-Aware Convolutional Networks Using Local Attention Masks' Segaware ( )
'Pixel Deconvolutional Networks' PixelDCN ( )
'Rethinking Atrous Convolution for Semantic Image Segmentation' DeepLabv3 ( )
'Understanding Convolution for Semantic Segmentation' DUC, HDC ( )
'ShuffleSeg: Real-time Semantic Segmentation Network' ShuffleSeg ( )
'Learning to Adapt Structured Output Space for Semantic Segmentation' AdaptSegNet ( )
'Understanding Convolution for Semantic Segmentation' TuSimple-DUC ( )
'Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation' R2U-Net ( )
'Attention U-Net: Learning Where to Look for the Pancreas' Attention U-Net ( )
'Dual Attention Network for Scene Segmentation' DANet ( )
'Context Encoding for Semantic Segmentation' ENCNet ( )
'ShelfNet for Real-time Semantic Segmentation' ShelfNet ( )
'LadderNet: Multi-path Networks Based on U-Net for Medical Image Segmentation' LadderNet ( )
'Concentrated-Comprehensive Convolutions for lightweight semantic segmentation' CCC-ERFnet ( )
'DifNet: Semantic Segmentation by Diffusion Networks' DifNet-101 ( )
'BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation' BiSeNet(Res18) ( )
'ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation' ESPNet ( )
'Semantic Image Synthesis with Spatially-Adaptive Normalization' SPADE ( )
'Seamless Scene Segmentation' SeamlessSeg ( )
'Expectation-Maximization Attention Networks for Semantic Segmentation' EMANet ( )

Awesome Computer Vision Models / Detection models

'Rich feature hierarchies for accurate object detection and semantic segmentation' R-CNN ( )
'OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks' OverFeat ( )
'Scalable Object Detection using Deep Neural Networks' MultiBox ( )
'Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition' SPP-Net ( )
'Object detection via a multi-region & semantic segmentation-aware CNN model' MR-CNN ( )
'AttentionNet: Aggregating Weak Directions for Accurate Object Detection' AttentionNet ( )
'Fast R-CNN' Fast R-CNN ( )
'Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks' Faster R-CNN ( )
'You Only Look Once: Unified, Real-Time Object Detection' YOLO v1 ( )
'G-CNN: an Iterative Grid Based Object Detector' G-CNN ( )
'Adaptive Object Detection Using Adjacency and Zoom Prediction' AZNet ( )
'Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks' ION ( )
'HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection' HyperNet ( )
'Training Region-based Object Detectors with Online Hard Example Mining' OHEM ( )
'A MultiPath Network for Object Detection' MPN ( )
'SSD: Single Shot MultiBox Detector' SSD ( )
'Crafting GBD-Net for Object Detection' GBDNet ( )
'Contextual Priming and Feedback for Faster R-CNN' CPF ( )
'A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection' MS-CNN ( )
'R-FCN: Object Detection via Region-based Fully Convolutional Networks' R-FCN ( )
'PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection' PVANET ( )
'DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection' DeepID-Net ( )
'Object Detection Networks on Convolutional Feature Maps' NoC ( )
'DSSD: Deconvolutional Single Shot Detector' DSSD ( )
'Beyond Skip Connections: Top-Down Modulation for Object Detection' TDM ( )
'Feature Pyramid Networks for Object Detection' FPN ( )
'YOLO9000: Better, Faster, Stronger' YOLO v2 ( )
'RON: Reverse Connection with Objectness Prior Networks for Object Detection' RON ( )
'Deformable Convolutional Networks' DCN ( )
'DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling' DeNet ( )
'CoupleNet: Coupling Global Structure with Local Parts for Object Detection' CoupleNet ( )
'Focal Loss for Dense Object Detection' RetinaNet ( )
'Mask R-CNN' Mask R-CNN ( )
'DSOD: Learning Deeply Supervised Object Detectors from Scratch' DSOD ( )
'Spatial Memory for Context Reasoning in Object Detection' SMN ( )
'YOLOv3: An Incremental Improvement' YOLO v3 ( )
'Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships' SIN ( )
'Scale-Transferrable Object Detection' STDN ( )
'Single-Shot Refinement Neural Network for Object Detection' RefineDet ( )
'MegDet: A Large Mini-Batch Object Detector' MegDet ( )
'Receptive Field Block Net for Accurate and Fast Object Detection' RFBNet ( )
'CornerNet: Detecting Objects as Paired Keypoints' CornerNet ( )
'Libra R-CNN: Towards Balanced Learning for Object Detection' LibraRetinaNet ( )
'YOLACT: Real-time Instance Segmentation' YOLACT-700 ( )
'DetNAS: Backbone Search for Object Detection' DetNASNet(3.8) ( )
'YOLOv4: Optimal Speed and Accuracy of Object Detection' YOLOv4 ( )
'SOLO: Segmenting Objects by Locations' SOLO ( )
'SOLO: Segmenting Objects by Locations' D-SOLO ( )
'Scale Normalized Image Pyramids with AutoFocus for Object Detection' SNIPER ( )
'Scale Normalized Image Pyramids with AutoFocus for Object Detection' AutoFocus ( )
