dual_encoding
Video-text retrieval model
A deep learning project that provides a video-text retrieval model and tools for training and evaluating it on the MSR-VTT dataset
[CVPR2019] Dual Encoding for Zero-Example Video Retrieval
155 stars
7 watching
31 forks
Language: Python
Last commit: almost 2 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
danieljf24/hybrid_space | Develops a deep learning framework for video retrieval using text and computer vision | 87 |
danieljf24/w2vv | A deep neural network architecture that predicts visual features from text to improve image and video caption retrieval | 69 |
li-xirong/w2vvpp | A deep learning-based video search system using pre-trained models and datasets | 28 |
albanie/collaborative-experts | A framework for improving video retrieval by leveraging multiple text encoders and their collaborative expertise. | 336 |
shangwei5/vidue | A deep learning model that jointly performs video frame interpolation and deblurring with unknown exposure time | 66 |
gabeur/mmt | Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text | 258 |
jayleicn/clipbert | An efficient framework for end-to-end learning on image-text and video-text tasks | 704 |
antoine77340/mixture-of-embedding-experts | An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks. | 118 |
codeslake/pvdnet | An open-source implementation of a deep learning model for video deblurring and motion estimation. | 114 |
cshizhe/hgr_v2t | An implementation of a video-text retrieval model using hierarchical graph reasoning with PyTorch. | 209 |
jaywongwang/densevideocaptioning | An implementation of a dense video captioning model with attention-based fusion and context gating | 148 |
kuleshov/deep-learning-models | Implementations of various deep learning algorithms in Python using Theano and Lasagne. | 24 |
tiger-ai-lab/uniir | Trains and evaluates a universal multimodal retrieval model to perform various information retrieval tasks. | 110 |
millionintegrals/vel | A collection of modular deep learning components that can be easily configured and reused in various applications. | 276 |
zhanghang1989/pytorch-encoding | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,041 |