LingCloud

Visual augmentation for LLMs

An approach to enhance large language models by incorporating visual information using human-like eyes

Attaching human-like eyes to a large language model. Code for the IEEE TMM paper "LMEye: An Interactive Perception Network for Large Language Model".

GitHub

48 stars
1 watching
1 fork
Language: Python
Last commit: 4 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| jiutian-vl/jiutian-lion | This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations. | 121 |
| junyanz/mcilboost | An open-source software package implementing two boosting-based Multiple Instance Learning methods for image segmentation and classification tasks. | 29 |
| pzzhang/vinvl | A project aimed at improving visual representations in vision-language models by developing an object detection model for richer visual object and concept representations. | 350 |
| yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training. | 24 |
| sy-xuan/pink | This project enables multi-modal language models to understand and generate text about visual content using referential comprehension. | 76 |
| yxuansu/tacl | Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations. | 92 |
| vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models. | 85 |
| google-research/noisystudent | A semi-supervised learning method to improve the accuracy of machine learning models by using noisy teacher models and student models. | 753 |
| shi-labs/vcoder | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities. | 261 |
| finnfiddle/potion | A collection of React components for creating animated and interactive visualizations. | 184 |
| younghjung/onlinemlrboostingwithvfdt | An implementation of online multi-label ranking boosting using VFDT as weak learners. | 4 |
| vishaal27/sus-x | This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |
| tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy. | 243 |
| yunwentechnology/unilm | This project provides pre-trained models for natural language understanding and generation tasks using the UniLM architecture. | 438 |
| yiyangzhou/lure | Analyzing and mitigating object hallucination in large vision-language models to improve their accuracy and reliability. | 134 |