Visual-Table

Visual representation generator

A project that generates visual representations tailored for general visual reasoning, leveraging hierarchical scene descriptions and instance-level world knowledge.

[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"
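
As a rough illustration of the hierarchy described above, a visual table can be pictured as one scene-level description plus a list of per-object entries. The sketch below is a minimal, hypothetical data structure, not the repository's actual schema: the class names `VisualTable` and `ObjectEntry`, the field names, and the `to_text` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectEntry:
    # Hypothetical field names; the repository's real schema may differ.
    category: str    # what the object is
    attributes: str  # visible properties such as colour or shape
    knowledge: str   # instance-level world knowledge about the object


@dataclass
class VisualTable:
    # Scene-level description at the top of the hierarchy.
    scene_description: str
    # Object-level entries underneath it.
    objects: List[ObjectEntry] = field(default_factory=list)

    def to_text(self) -> str:
        """Serialize the table as plain text, e.g. for inclusion in a prompt."""
        lines = [f"Scene: {self.scene_description}"]
        for i, obj in enumerate(self.objects, start=1):
            lines.append(f"Object {i}: {obj.category}")
            lines.append(f"  Attributes: {obj.attributes}")
            lines.append(f"  Knowledge: {obj.knowledge}")
        return "\n".join(lines)


# Made-up example content, used only to exercise the structure.
table = VisualTable(
    scene_description="A kitchen counter with breakfast items.",
    objects=[
        ObjectEntry(
            category="toaster",
            attributes="silver, two slots",
            knowledge="Heats bread using electric coils.",
        ),
    ],
)
print(table.to_text())
```

Serializing to plain text is one possible design choice: it lets the structured description be passed to a language model alongside, or in place of, image embeddings.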

14 stars

6 watching

1 fork

Language: Python

Last commit: almost 2 years ago

Related projects:

Repository	Description	Stars
jy0205/lavit	A unified framework for training large language models to understand and generate visual content	544
opengvlab/visionllm	A large language model designed to process and generate visual information	956
gordonhu608/mqt-llava	A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens	101
labforcomputationalvision/texturesynth	Generates synthetic digital images of visual textures based on mathematical models	34
dvlab-research/llama-vid	An image-based language model that uses large language models to generate visual and text features from videos	748
nvlabs/relvit	A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations	64
parrt/lolviz	A tool for visualizing data structures in Python, allowing developers to represent complex data in a graphical format	830
lxtgh/omg-seg	Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model	1,336
vividvilla/csvtotable	Converts CSV files to searchable and sortable HTML tables with features like pagination and export options	1,121
vega/vega	A declarative format for creating interactive visualization designs	11,276
tiagolr/vnodes	A Vue-based library for creating interactive SVG graphs and diagrams	122
luogen1996/lavin	An open-source implementation of a vision-language instructed large language model	513
trifacta/vega	A JSON-based format for describing and generating interactive visualization designs	30
gicentre/litvis	An approach to designing and building visualizations through literate programming with Elm, Markdown, and Vega	382
megvii-research/tlc	Improves image restoration performance by converting global operations to local ones during inference	231