Visual-Table
Visual representation generator
A project that generates visual representations tailored for general visual reasoning, leveraging hierarchical scene descriptions and instance-level world knowledge.
[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"
14 stars
6 watching
1 fork
Language: Python
last commit: 2 months ago

Related projects:
Repository | Description | Stars |
---|---|---|
jy0205/lavit | A unified framework for training large language models to understand and generate visual content | 544 |
opengvlab/visionllm | A large language model designed to process and generate visual information | 956 |
gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens. | 101 |
labforcomputationalvision/texturesynth | Generates synthetic digital images of visual textures based on mathematical models | 34 |
dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 748 |
nvlabs/relvit | A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations. | 64 |
parrt/lolviz | A tool for visualizing data structures in Python, allowing developers to represent complex data in a graphical format. | 830 |
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model. | 1,336 |
vividvilla/csvtotable | Converts CSV files to searchable and sortable HTML tables with features like pagination and export options. | 1,121 |
vega/vega | A declarative format for creating interactive visualization designs | 11,276 |
tiagolr/vnodes | A Vue-based library for creating interactive SVG graphs and diagrams. | 122 |
luogen1996/lavin | An open-source implementation of a vision-language instructed large language model | 513 |
trifacta/vega | A JSON-based format for describing and generating interactive visualization designs. | 30 |
gicentre/litvis | An approach to designing and building visualizations through literate programming with Elm, Markdown, and Vega. | 382 |
megvii-research/tlc | Improves image restoration performance by converting global operations to local ones during inference | 231 |