LingCloud

Visual enhancer for LLMs

Enhances language models by incorporating human-like eyes to improve visual comprehension and interaction with the external world.

Attaches human-like eyes to a large language model. Code for the IEEE TMM paper "LMEye: An Interactive Perception Network for Large Language Model".

GitHub

48 stars
1 watching
1 fork
Language: Python
Last commit: 6 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| jiutian-vl/jiutian-lion | This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations. | 124 |
| junyanz/mcilboost | An open-source software package implementing two boosting-based Multiple Instance Learning methods for image segmentation and classification tasks. | 29 |
| pzzhang/vinvl | A project aimed at improving visual representations in vision-language models by developing an object detection model for richer visual object and concept representations. | 350 |
| yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training. | 24 |
| sy-xuan/pink | This project enables multi-modal language models to understand and generate text about visual content using referential comprehension. | 79 |
| yxuansu/tacl | Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations. | 92 |
| vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models. | 88 |
| google-research/noisystudent | A semi-supervised learning method to improve the accuracy of machine learning models by using noisy teacher models and student models. | 755 |
| shi-labs/vcoder | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities. | 266 |
| finnfiddle/potion | A collection of React components for creating animated and interactive visualizations. | 184 |
| younghjung/onlinemlrboostingwithvfdt | An implementation of online multi-label ranking boosting using VFDT as weak learners. | 4 |
| vishaal27/sus-x | This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |
| tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy. | 259 |
| yunwentechnology/unilm | This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese. | 439 |
| yiyangzhou/lure | Analyzing and mitigating object hallucination in large vision-language models to improve their accuracy and reliability. | 136 |