mmt
Video retriever
Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text
Multi-Modal Transformer for Video Retrieval
259 stars
10 watching
40 forks
Language: Python
last commit: 6 months ago
Linked from 1 awesome list
fusion, language, multimodal, nlp, video, vision
Related projects:
| Description | Stars |
|---|---|
| An implementation of a video-text retrieval model using hierarchical graph reasoning with PyTorch. | 210 |
| Trains a multimodal chatbot that combines visual and language instructions to generate responses | 1,478 |
| Develops a deep learning framework for video retrieval using text and computer vision | 87 |
| A multimedia framework designed for video editing, providing tools and libraries for audio and video processing. | 1,522 |
| Enables display of video output from 3D animation software in Jupyter notebooks | 196 |
| A Python-based tool to download YouTube videos and add metadata from various providers. | 328 |
| Automates video file conversion and metadata tagging to create a uniform media library | 1,536 |
| A deep learning project that provides a video-text retrieval model and tools for training and evaluating it on the MSR-VTT dataset | 154 |
| An artificial intelligence system designed to understand and describe long-form video content | 329 |
| Develops a method for long video understanding by optimizing memory usage | 550 |
| An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks. | 118 |
| A collection of reusable video processing components | 452 |
| A lightweight video player written in Haskell with support for various media formats and playback controls. | 423 |
| This project focuses on manipulating 3D views using deep learning techniques. | 6 |
| Downloads videos from the web and provides an interface for managing downloads | 1,108 |