Visual-Table

Visual representation generator

A project that generates visual representations tailored for general visual reasoning, leveraging hierarchical scene descriptions and instance-level world knowledge.

[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"

14 stars
6 watching
1 fork
Language: Python
Last commit: 2 months ago

Related projects:

Repository | Description | Stars
jy0205/lavit | A unified framework for training large language models to understand and generate visual content | 544
opengvlab/visionllm | A large language model designed to process and generate visual information | 956
gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens | 101
labforcomputationalvision/texturesynth | Generates synthetic digital images of visual textures based on mathematical models | 34
dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 748
nvlabs/relvit | A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations | 64
parrt/lolviz | A tool for visualizing data structures in Python, allowing developers to represent complex data in a graphical format | 830
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model | 1,336
vividvilla/csvtotable | Converts CSV files to searchable and sortable HTML tables with features like pagination and export options | 1,121
vega/vega | A declarative format for creating interactive visualization designs | 11,276
tiagolr/vnodes | A Vue-based library for creating interactive SVG graphs and diagrams | 122
luogen1996/lavin | An open-source implementation of a vision-language instructed large language model | 513
trifacta/vega | A JSON-based format for describing and generating interactive visualization designs | 30
gicentre/litvis | An approach to designing and building visualizations through literate programming with Elm, Markdown, and Vega | 382
megvii-research/tlc | Improves image restoration performance by converting global operations to local ones during inference | 231