LingCloud
Visual enhancer for LLMs
Enhances language models by incorporating human-like eyes to improve visual comprehension and interaction with the external world
Attaching human-like eyes to the large language model. Code for the IEEE TMM paper "LMEye: An Interactive Perception Network for Large Language Model".
48 stars
1 watching
1 fork
Language: Python
last commit: 6 months ago

Related projects:
Repository | Description | Stars |
---|---|---|
jiutian-vl/jiutian-lion | This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations. | 124 |
junyanz/mcilboost | An open-source software package implementing two boosting-based Multiple Instance Learning methods for image segmentation and classification tasks. | 29 |
pzzhang/vinvl | A project aimed at improving visual representations in vision-language models by developing an object detection model for richer visual object and concept representations. | 350 |
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training. | 24 |
sy-xuan/pink | This project enables multi-modal language models to understand and generate text about visual content using referential comprehension. | 79 |
yxuansu/tacl | Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations. | 92 |
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models. | 88 |
google-research/noisystudent | A semi-supervised learning method to improve the accuracy of machine learning models by using noisy teacher models and student models. | 755 |
shi-labs/vcoder | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities. | 266 |
finnfiddle/potion | A collection of React components for creating animated and interactive visualizations. | 184 |
younghjung/onlinemlrboostingwithvfdt | An implementation of online multi-label ranking boosting using VFDT as weak learners. | 4 |
vishaal27/sus-x | This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy. | 259 |
yunwentechnology/unilm | This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese. | 439 |
yiyangzhou/lure | Analyzing and mitigating object hallucination in large vision-language models to improve their accuracy and reliability. | 136 |