TAG

Text VQA Generator

A Python-based system for generating visual question-answer pairs using text-aware approaches to improve Text-VQA performance.

GitHub

21 stars
1 watching
0 forks
Language: Python
Last commit: almost 2 years ago

Related projects:

Repository | Description | Stars
hyeonwoonoh/vqa-transfer-externaldata | Tools and scripts for training and evaluating a visual question answering model using transfer learning from an external data source. | 20
hengyuan-hu/bottom-up-attention-vqa | An implementation of a VQA system using bottom-up attention, aiming to improve the efficiency and speed of visual question answering tasks. | 754
markdtw/vqa-winner-cvprw-2017 | Implementations and tools for training and fine-tuning a visual question answering model based on the 2017 CVPR workshop winner's approach. | 164
milvlg/prophet | An implementation of a two-stage framework designed to prompt large language models with answer heuristics for knowledge-based visual question answering tasks. | 267
opendatalab/vigc | Autonomously generates high-quality image-text instruction fine-tuning datasets. | 90
hslcy/vcwe | Code and corpora for creating word embeddings that take the visual characteristics of words into account. | 15
cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 716
cvlab-columbia/viper | A framework for generating and executing Python code to solve visual inference tasks using large language models. | 1,660
hvqzao/report-ng | Automates the creation of uniform reports from various input sources in web application security assessments. | 66
pyramation/latex2html5 | A tool for rendering LaTeX documents to HTML5, enabling interactive content on the web. | 61
eps696/aphantasia | A text-to-image tool using CLIP and FFT/DWT parameters to generate detailed images from user-provided text prompts. | 776
zhuiyitechnology/t5-pegasus | A Chinese generation model based on the T5 architecture, trained using the PEGASUS method. | 555
iamwangyunkai/carla_py | Generates data for CARLA's visual navigation system using raw camera images and instructions. | 8
hitvoice/drqa | A PyTorch implementation of reading comprehension over Wikipedia for answering open-domain questions, using the SQuAD dataset. | 401
jayleicn/tvqa | A PyTorch implementation of a video question answering system based on the TVQA dataset. | 172