vstar

Visual search framework

PyTorch implementation of a guided visual search mechanism for multimodal LLMs

PyTorch implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"

GitHub

527 stars
11 watching
33 forks
Language: Python
Last commit: 11 months ago

Related projects:

Repository | Description | Stars
kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching | 294
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision | 1,217
fartashf/vsepp | A PyTorch implementation of visual-semantic embedding methods for image-caption retrieval | 489
hal3/macarico | An implementation of an imperative learning to search framework in PyTorch for deep learning-based structured prediction and reinforcement learning | 111
vpgtrans/vpgtrans | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs | 269
cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning | 716
volcengine/vescale | A PyTorch-based framework for training large language models in parallel on multiple devices | 663
vlgiitr/dmn-plus | A PyTorch implementation of an improved question answering architecture with dynamic memory networks and attention mechanisms | 64
leaderj1001/mobilenetv3-pytorch | An implementation of MobileNetV3 using PyTorch with search space optimization | 292
nexusapoorvacus/deepvariationstructuredrl | An implementation of reinforcement learning for visual relationship and attribute detection using PyTorch | 63
jayleicn/clipbert | An efficient framework for end-to-end learning on image-text and video-text tasks | 704
openseg-group/openseg.pytorch | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing | 1,190
felixgwu/img_classification_pk_pytorch | A PyTorch project for comparing image classification models and facilitating quick experiment setup | 365
clementpinard/sfmlearner-pytorch | PyTorch implementation of unsupervised depth and ego-motion learning from video sequences | 1,014
vinhkhuc/pytorch-mini-tutorials | A collection of tutorials and lessons on building deep learning models using the PyTorch library | 326