ROLL-VideoQA

VideoQA model

A PyTorch-based model that answers questions about video stories by reasoning over their scenes and storylines.

PyTorch code for ROLL, a knowledge-based video story question answering model.

19 stars

3 watching

4 forks

Language: Python

Last commit: almost 6 years ago

knowledge-based-reasoning, video-question-answering, video-understanding, visual-question-answering

Related projects:

Repository	Description	Stars
cadene/vqa.pytorch	A PyTorch implementation of visual question answering with multimodal representation learning	718
jayleicn/tvqa	A PyTorch implementation of a video question answering system based on the TVQA dataset	172
hitvoice/drqa	A PyTorch implementation of a reading comprehension model that answers open-domain questions from Wikipedia, using the SQuAD dataset	401
markdtw/vqa-winner-cvprw-2017	Implementations and tools for training and fine-tuning a visual question answering model based on the 2017 CVPR workshop winner's approach.	164
jayleicn/clipbert	An efficient framework for end-to-end learning on image-text and video-text tasks	709
makarandtapaswi/movieqa_cvpr2016	This project explores question-answering in movies using various machine learning approaches.	80
akirafukui/vqa-mcb	A software framework for training and deploying multimodal visual question answering models using compact bilinear pooling.	222
hengyuan-hu/bottom-up-attention-vqa	An implementation of a VQA system using bottom-up attention, aiming to improve the efficiency and speed of visual question answering tasks.	755
gt-vision-lab/vqa_lstm_cnn	A Visual Question Answering model using a deeper LSTM and normalized CNN architecture.	377
milvlg/prophet	An implementation of a two-stage framework designed to prompt large language models with answer heuristics for knowledge-based visual question answering tasks.	270
mbzuai-oryx/video-chatgpt	A video conversation model that generates meaningful conversations about videos using large vision and language models	1,246
zcyang/imageqa-san	This project provides code for training image question answering models using stacked attention networks and convolutional neural networks.	108
xiaoman-zhang/pmc-vqa	A medical visual question-answering dataset and toolkit for training models to understand medical images and instructions.	180
codeslake/pvdnet	An open-source implementation of a deep learning model for video deblurring and motion estimation.	114
pasqal-io/pyqtorch	A PyTorch-based simulator for quantum machine learning	45