vstar

Visual search framework

A PyTorch implementation of a guided visual search mechanism for multimodal LLMs

PyTorch implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"

GitHub

541 stars
11 watching
37 forks
Language: Python
Last commit: about 1 year ago

Related projects:

| Repository | Description | Stars |
|---|---|---|
| kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching | 294 |
| kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision | 1,236 |
| fartashf/vsepp | A PyTorch implementation of visual-semantic embedding methods for image-caption retrieval | 492 |
| hal3/macarico | An implementation of an imperative learning to search framework in PyTorch for deep learning-based structured prediction and reinforcement learning | 111 |
| vpgtrans/vpgtrans | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs | 270 |
| cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning | 718 |
| volcengine/vescale | A PyTorch-based framework for training large language models in parallel on multiple devices | 679 |
| vlgiitr/dmn-plus | A PyTorch implementation of an improved question answering architecture with dynamic memory networks and attention mechanisms | 64 |
| leaderj1001/mobilenetv3-pytorch | An implementation of MobileNetV3 using PyTorch with search space optimization | 292 |
| nexusapoorvacus/deepvariationstructuredrl | An implementation of reinforcement learning for visual relationship and attribute detection using PyTorch | 63 |
| jayleicn/clipbert | An efficient framework for end-to-end learning on image-text and video-text tasks | 709 |
| openseg-group/openseg.pytorch | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing | 1,191 |
| felixgwu/img_classification_pk_pytorch | A PyTorch project for comparing image classification models and facilitating quick experiment setup | 366 |
| clementpinard/sfmlearner-pytorch | PyTorch implementation of unsupervised depth and ego-motion learning from video sequences | 1,022 |
| vinhkhuc/pytorch-mini-tutorials | A collection of tutorials and lessons on building deep learning models using the PyTorch library | 326 |