hgr_v2t

Video retriever

A PyTorch implementation of a video-text retrieval model that uses hierarchical graph reasoning.

Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".


209 stars
14 watching
21 forks
Language: Python
Last commit: over 4 years ago
Linked from 1 awesome list

Related projects:

Repository | Description | Stars
gabeur/mmt | Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text | 258
danieljf24/hybrid_space | Develops a deep learning framework for video retrieval using text and computer vision | 87
chaoyuaw/pytorch-coviar | A PyTorch implementation of a compressed video action recognition system | 502
xiadingz/video-caption.pytorch | PyTorch implementation of video captioning, combining deep learning and computer vision techniques | 401
huguyuehuhu/hcn-pytorch | Replication of a PyTorch model for action recognition and detection from skeleton data | 219
antoine77340/mixture-of-embedding-experts | An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks | 118
clementpinard/sfmlearner-pytorch | PyTorch implementation of unsupervised depth and ego-motion learning from video sequences | 1,014
danieljf24/dual_encoding | A deep learning project that provides a video-text retrieval model and tools for training and evaluating it on the MSR-VTT dataset | 155
gsig/pyvideoresearch | A collection of video analysis methods and datasets for research and development | 533
penghao-wu/vstar | PyTorch implementation of guided visual search mechanism for multimodal LLMs | 527
huicongzhang/stdan | Deblurring algorithm for videos using a neural network | 52
hhk1/prynttrimmerview | A toolset for trimming and cropping videos using Swift | 861
thibaudgg/video_info | A Ruby gem that retrieves metadata from various video sharing platforms | 429
xingyizhou/pytorch-pose-hg-3d | A PyTorch implementation of a 3D human pose estimation algorithm using weak supervision | 613
li-xirong/w2vvpp | A deep learning-based video search system using pre-trained models and datasets | 28