lsmdc
Video QA framework
A framework implementing a joint sequence fusion model for video question answering and retrieval
31 stars
5 watching
5 forks
Language: Python
Last commit: about 6 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
jiasenlu/hiecoattenvqa | A framework for training Hierarchical Co-Attention models for Visual Question Answering using preprocessed data and a specific image model. | 349 |
liuzhao1225/youdub-webui | A web-based video processing tool that uses AI to facilitate cultural and linguistic tasks such as transcription, translation, and audio synthesis. | 1,980 |
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 121 |
jayleicn/tvqa | PyTorch implementation of a video question answering system based on the TVQA dataset | 172 |
kylejginavan/youtube_it | A Ruby wrapper for accessing YouTube's video API and managing video content | 595 |
opengvlab/internvideo | Develops general video foundation models and related datasets for multimodal understanding and generation through generative and discriminative learning. | 1,467 |
jsmidt/quantpy | A framework for building quantitative finance applications in Python. | 709 |
yasar-rehman/fedvssl | Implementation of Federated Self-Supervised Learning for video understanding | 24 |
jpzwolak/qflow-suite | A machine learning framework for training models on quantum dot data | 40 |
gsig/pyvideoresearch | A collection of video analysis methods and datasets for research and development | 533 |
llyx97/tempcompass | A tool to evaluate video language models' ability to understand and describe video content | 91 |
makarandtapaswi/movieqa_cvpr2016 | This project explores question-answering in movies using various machine learning approaches. | 80 |
dwqs/mp-jithub | A mini program for interacting with GitHub | 30 |
rese1f/moviechat | Develops a method for long video understanding by optimizing memory usage | 550 |
yomorun/yomo-wasmedge-tensorflow | A project that demonstrates how to process video streams in real-time using WebAssembly and TensorFlow Lite for food classification. | 64 |