AGLA
Image descriptor model
Improves large vision-language models' ability to accurately describe images by combining global and local attention mechanisms.
[Arxiv 2024] AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention
18 stars
2 watching
0 forks
Language: Python
Last commit: over 1 year ago

Related projects:

| Repository | Description | Stars |
|---|---|---|
| | Develops a PyTorch implementation of an enhanced vision-language model | 93 |
| | A multimodal AI model that enables real-world vision-language understanding applications | 2,145 |
| | A system that uses large language models to generate segmentation masks for images based on complex queries and world knowledge. | 1,923 |
| | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 246 |
| | Analyzing and mitigating object hallucination in large vision-language models to improve their accuracy and reliability. | 136 |
| | Automates object removal from images using computer vision techniques | 99 |
| | Debiasing techniques to minimize hallucinations in large visual language models | 75 |
| | An approach to reduce object hallucinations in large vision-language models by contrasting output distributions derived from original and distorted visual inputs | 222 |
| | A deep learning library for image segmentation and object detection using PyTorch. | 1,054 |
| | Developing a Large Language Model capable of processing 3D representations as inputs | 979 |
| | A large language model designed to process and generate visual information | 956 |
| | Improves performance on vision-language tasks by integrating computer vision capabilities into large language models | 314 |
| | Evaluating and improving large multimodal models through in-context learning | 21 |
| | Controls vision-language models to restore degraded images across a range of environments and conditions | 673 |
| | Efficient Contextual Representation Learning Model with Continuous Outputs | 4 |