LLMGA
Image editing assistant
An implementation of a multimodal generation assistant using large language models and various image editing techniques.
This project is the official implementation of 'LLMGA: Multimodal Large Language Model based Generation Assistant' (ECCV 2024 Oral).
463 stars
13 watching
29 forks
Language: Python
Last commit: 6 months ago
Tags: aigc, image-design-assistant, image-editing, image-generation, large-language-model, llm, mllm, multi-modal
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An image-based language model that uses large language models to generate visual and text features from videos | 748 |
| | A system that uses large language models to generate segmentation masks for images based on complex queries and world knowledge. | 1,923 |
| | An all-in-one demo for interactive image processing and generation | 353 |
| | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| | A minor mode in Emacs for integrating generative AI models into text editing, with support for speech input/output and image generation. | 710 |
| | A Go library and command-line tool for manipulating images | 236 |
| | A MATLAB implementation of a variable-model data fusion algorithm for removing noise from images and generating denoised images | 81 |
| | A research project aimed at building large language and vision models for biomedical applications with capabilities comparable to GPT-4. | 1,622 |
| | This implementation enables text-to-image generation by leveraging cross-modal contrastive learning. | 98 |
| | An interactive control system for text generation in multi-modal language models | 135 |
| | Develops high-resolution multimodal LLMs by combining vision encoders and various input resolutions | 549 |
| | An interactive system for understanding and interacting with 3D environments using natural language. | 255 |
| | This project presents an optimization technique for large-scale image models to reduce computational requirements while maintaining performance. | 106 |
| | An evaluation platform for comparing multi-modality models on visual question-answering tasks | 478 |
| | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs | 270 |