VinVL

Visual representation improvement

A project that aims to improve visual representations in vision-language models by developing an object detection model that produces richer representations of visual objects and concepts.

GitHub

350 stars
9 watching
25 forks
last commit: over 1 year ago

Related projects:

Repository | Description | Stars
zuoxingdong/vin_pytorch_visdom | Implementation of Value Iteration Networks in PyTorch with visualization capabilities using Visdom. | 226
dotnet/vblang | Design of a Visual Basic .NET language and runtime library | 288
yunxinli/lingcloud | An approach to enhance large language models by incorporating visual information using human-like eyes | 48
lucasvazq/lucasvazq | Personal showcase of a developer's experience and expertise in building web platforms with a focus on accessibility, performance, and robust code. | 30
pku-yuangroup/chat-univi | A framework for unified visual representation in image and video understanding models, enabling efficient training of large language models on multimodal data. | 847
ivanreese/visual-programming-codex | An online resource showcasing alternative visual programming languages and projects, reflecting on their concepts, ideas, and potential benefits. | 1,356
byungkwanlee/collavo | Develops a PyTorch implementation of an enhanced vision language model | 93
liaoning97/revo-lion | A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models | 11
vanshkapoor/vanshkapoor | Utility tools and project showcases | 24
dbuenzli/vg | A declarative 2D vector graphics library written in OCaml | 91
byungkwanlee/moai | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 311
nickjiang2378/vl-interp | This project provides an official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 31
ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28
kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching | 294
brucesherwood/vpython-jupyter | An integration of VPython with Jupyter Notebook for interactive 3D visualization and simulation in scientific computing. | 64