LingCloud

Visual enhancer for LLMs

Enhances language models by incorporating human-like eyes to improve visual comprehension and interaction with the external world

Attaching human-like eyes to the large language model. Code for the IEEE TMM paper "LMEye: An Interactive Perception Network for Large Language Model".

GitHub

48 stars

1 watching

1 fork

Language: Python

Last commit: about 2 years ago

Related projects:

Repository	Description	Stars
jiutian-vl/jiutian-lion	This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations.	124
junyanz/mcilboost	An open-source software package implementing two boosting-based Multiple Instance Learning methods for image segmentation and classification tasks.	29
pzzhang/vinvl	A project aimed at improving visual representations in vision-language models by developing an object detection model for richer visual object and concept representations.	350
yiren-jian/blitext	Develops and trains models for vision-language learning with decoupled language pre-training.	24
sy-xuan/pink	This project enables multi-modal language models to understand and generate text about visual content using referential comprehension.	79
yxuansu/tacl	Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations.	92
vlf-silkie/vlfeedback	An annotated preference dataset and training framework for improving large vision language models.	88
google-research/noisystudent	A semi-supervised learning method to improve the accuracy of machine learning models by using noisy teacher models and student models.	755
shi-labs/vcoder	An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities.	266
finnfiddle/potion	A collection of React components for creating animated and interactive visualizations.	184
younghjung/onlinemlrboostingwithvfdt	An implementation of online multi-label ranking boosting using VFDT as weak learners.	4
vishaal27/sus-x	This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required.	94
tianyi-lab/hallusionbench	An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy.	259
yunwentechnology/unilm	This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese.	439
yiyangzhou/lure	Analyzing and mitigating object hallucination in large vision-language models to improve their accuracy and reliability.	136