LingCloud
Visual augmentation for LLMs
An approach to enhance large language models by incorporating visual information using human-like eyes
Attaching human-like eyes to the large language model. Code for the IEEE TMM paper "LMEye: An Interactive Perception Network for Large Language Model".
48 stars
1 watching
1 fork
Language: Python
last commit: 4 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
jiutian-vl/jiutian-lion | This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations. | 121 |
junyanz/mcilboost | An open-source software package implementing two boosting-based Multiple Instance Learning methods for image segmentation and classification tasks. | 29 |
pzzhang/vinvl | A project aimed at improving visual representations in vision-language models by developing an object detection model for richer visual object and concept representations. | 350 |
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training. | 24 |
sy-xuan/pink | This project enables multi-modal language models to understand and generate text about visual content using referential comprehension. | 76 |
yxuansu/tacl | Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations. | 92 |
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models. | 85 |
google-research/noisystudent | A semi-supervised learning method to improve the accuracy of machine learning models by using noisy teacher models and student models. | 753 |
shi-labs/vcoder | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities. | 261 |
finnfiddle/potion | A collection of React components for creating animated and interactive visualizations. | 184 |
younghjung/onlinemlrboostingwithvfdt | An implementation of online multi-label ranking boosting using VFDT as weak learners. | 4 |
vishaal27/sus-x | This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy | 243 |
yunwentechnology/unilm | This project provides pre-trained models for natural language understanding and generation tasks using the UniLM architecture. | 438 |
yiyangzhou/lure | Analyzing and mitigating object hallucination in large vision-language models to improve their accuracy and reliability. | 134 |