VinVL
Visual representation improvement
A project aimed at improving visual representations in vision-language models by developing an object detection model for richer visual object and concept representations.
project page for VinVL
350 stars
9 watching
25 forks
last commit: over 1 year ago

Related projects:
Repository | Description | Stars |
---|---|---|
zuoxingdong/vin_pytorch_visdom | Implementation of Value Iteration Networks in PyTorch with visualization capabilities using Visdom. | 226 |
dotnet/vblang | Design of the Visual Basic .NET language and runtime library | 291 |
yunxinli/lingcloud | Enhances language models by incorporating human-like eyes to improve visual comprehension and interaction with the external world | 48 |
lucasvazq/lucasvazq | Personal showcase of a developer's experience and expertise in building web platforms with a focus on accessibility, performance, and robust code. | 30 |
pku-yuangroup/chat-univi | A framework for unified visual representation in image and video understanding models, enabling efficient training of large language models on multimodal data. | 895 |
ivanreese/visual-programming-codex | An online resource showcasing alternative visual programming languages and projects, reflecting on their concepts, ideas, and potential benefits. | 1,366 |
byungkwanlee/collavo | Develops a PyTorch implementation of an enhanced vision language model | 93 |
liaoning97/revo-lion | A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models | 11 |
vanshkapoor/vanshkapoor | Utility tools and project showcases | 24 |
dbuenzli/vg | A declarative 2D vector graphics library written in OCaml | 91 |
byungkwanlee/moai | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 314 |
nickjiang2378/vl-interp | An official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions | 46 |
ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 31 |
kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching | 294 |
brucesherwood/vpython-jupyter | An integration of VPython with Jupyter Notebook for interactive 3D visualization and simulation in scientific computing. | 64 |