mmt

Video retriever

Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text

Multi-Modal Transformer for Video Retrieval

GitHub

259 stars

10 watching

40 forks

Language: Python

Last commit: 10 months ago

Linked from 1 awesome list

fusion, language, multimodal, nlp, video, vision

thoth.inrialpes.fr/research/MMT/

Backlinks from these awesome lists:

danieljf24/awesome-video-text-retrieval

Related projects:

Repository	Description	Stars
cshizhe/hgr_v2t	An implementation of a video-text retrieval model using hierarchical graph reasoning with PyTorch.	210
open-mmlab/multimodal-gpt	Trains a multimodal chatbot that combines visual and language instructions to generate responses	1,478
danieljf24/hybrid_space	Develops a deep learning framework for video retrieval using text and computer vision	87
mltframework/mlt	A multimedia framework designed for video editing, providing tools and libraries for audio and video processing.	1,522
krassowski/jupyter-manim	Enables display of video output from the Manim animation engine in Jupyter notebooks	196
jvt038/metatube	A Python-based tool to download YouTube videos and add metadata from various providers.	328
mdhiggins/sickbeard_mp4_automator	Automates video file conversion and metadata tagging to create a uniform media library	1,536
danieljf24/dual_encoding	A deep learning project that provides a video-text retrieval model and tools for training and evaluating it on the MSR-VTT dataset	154
vision-cair/longvu	An artificial intelligence system designed to understand and describe long-form video content	329
rese1f/moviechat	Develops a method for long video understanding by optimizing memory usage	550
antoine77340/mixture-of-embedding-experts	An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks.	118
dyne/frei0r	A collection of reusable video processing components	452
lettier/movie-monad	A lightweight video player written in Haskell with support for various media formats and playback controls.	423
gamrix/cs231n_proj	This project focuses on manipulating 3D views using deep learning techniques.	6
nickvisionapps/parabolic	Downloads videos from the web and provides an interface for managing downloads	1,108