Visual-Table

Visual representation generator

A project that generates visual representations tailored for general visual reasoning, leveraging hierarchical scene descriptions and instance-level world knowledge.

[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"

14 stars
6 watching
1 fork
Language: Python
Last commit: about 1 month ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| jy0205/lavit | A unified framework for training large language models to understand and generate visual content | 528 |
| opengvlab/visionllm | A large language model designed to process and generate visual information | 915 |
| gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens | 97 |
| labforcomputationalvision/texturesynth | Generates synthetic digital images of visual textures based on mathematical models | 34 |
| dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 733 |
| nvlabs/relvit | A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations | 64 |
| parrt/lolviz | A tool for visualizing data structures in Python, allowing developers to represent complex data in a graphical format | 829 |
| lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model | 1,300 |
| vividvilla/csvtotable | Converts CSV files to searchable and sortable HTML tables with features like pagination and export options | 1,117 |
| vega/vega | A declarative format for creating interactive visualization designs | 11,239 |
| tiagolr/vnodes | A Vue-based library for creating interactive SVG graphs and diagrams | 120 |
| luogen1996/lavin | An open-source implementation of a vision-language instructed large language model | 508 |
| trifacta/vega | A JSON-based format for describing and generating interactive visualization designs | 30 |
| gicentre/litvis | An approach to designing and building visualizations through literate programming with Elm, Markdown, and Vega | 379 |
| megvii-research/tlc | Improves image restoration performance by converting global operations to local ones during inference | 231 |