mmt
Video retriever
Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text
Multi-Modal Transformer for Video Retrieval
259 stars
10 watching
40 forks
Language: Python
Last commit: about 1 year ago
Linked from 1 awesome list
fusion, language, multimodal, nlp, video, vision
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An implementation of a video-text retrieval model using hierarchical graph reasoning with PyTorch. | 210 |
| | Trains a multimodal chatbot that combines visual and language instructions to generate responses | 1,478 |
| | Develops a deep learning framework for video retrieval using text and computer vision | 87 |
| | A multimedia framework designed for video editing, providing tools and libraries for audio and video processing. | 1,522 |
| | Enables display of video output from 3D animation software in Jupyter notebooks | 196 |
| | A Python-based tool to download YouTube videos and add metadata from various providers. | 328 |
| | Automates video file conversion and metadata tagging to create a uniform media library | 1,536 |
| | A deep learning project that provides a video-text retrieval model and tools for training and evaluating it on the MSR-VTT dataset | 154 |
| | An artificial intelligence system designed to understand and describe long-form video content | 329 |
| | Develops a method for long video understanding by optimizing memory usage | 550 |
| | An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks. | 118 |
| | A collection of reusable video processing components | 452 |
| | A lightweight video player written in Haskell with support for various media formats and playback controls. | 423 |
| | This project focuses on manipulating 3D views using deep learning techniques. | 6 |
| | Downloads videos from the web and provides an interface for managing downloads | 1,108 |