TAG
Text VQA Generator
A Python-based system for generating visual question-answer pairs using text-aware approaches to improve Text-VQA performance.
21 stars
1 watching
0 forks
Language: Python
last commit: almost 2 years ago
Related projects:
Repository | Description | Stars |
---|---|---|
hyeonwoonoh/vqa-transfer-externaldata | Tools and scripts for training and evaluating a visual question answering model using transfer learning from an external data source. | 20 |
hengyuan-hu/bottom-up-attention-vqa | An implementation of a VQA system using bottom-up attention, aiming to improve the efficiency and speed of visual question answering tasks. | 754 |
markdtw/vqa-winner-cvprw-2017 | Implementations and tools for training and fine-tuning a visual question answering model based on the 2017 CVPR workshop winner's approach. | 164 |
milvlg/prophet | An implementation of a two-stage framework designed to prompt large language models with answer heuristics for knowledge-based visual question answering tasks. | 267 |
opendatalab/vigc | Autonomously generates high-quality image-text instruction fine-tuning datasets. | 90 |
hslcy/vcwe | This project provides code and corpora for creating word embeddings by considering the visual characteristics of words. | 15 |
cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 716 |
cvlab-columbia/viper | A framework for generating and executing Python code to solve visual inference tasks using large language models. | 1,660 |
hvqzao/report-ng | Automates the creation of uniform reports from various input sources in web application security assessments. | 66 |
pyramation/latex2html5 | A tool for rendering LaTeX documents to HTML5, enabling interactive content on the web. | 61 |
eps696/aphantasia | A text-to-image tool using CLIP and FFT/DWT parameters to generate detailed images from user-provided text prompts. | 776 |
zhuiyitechnology/t5-pegasus | A Chinese text generation model based on the T5 architecture, trained using the PEGASUS method. | 555 |
iamwangyunkai/carla_py | Generates data for CARLA's visual navigation system using raw camera images and instructions. | 8 |
hitvoice/drqa | A PyTorch implementation of a reading comprehension model that answers open-domain questions from Wikipedia, using the SQuAD dataset. | 401 |
jayleicn/tvqa | A PyTorch implementation of a video question answering system based on the TVQA dataset. | 172 |