LLMGA

Image editing assistant

An implementation of a multimodal generation assistant using large language models and various image editing techniques.

This project is the official implementation of 'LLMGA: Multimodal Large Language Model based Generation Assistant' (ECCV 2024, Oral).

GitHub

463 stars
13 watching
29 forks
Language: Python
last commit: 5 months ago
Tags: aigc, image-design-assistant, image-editing, image-generation, large-language-model, llm, mllm, multi-modal

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 748 |
| dvlab-research/lisa | A system that uses large language models to generate segmentation masks for images based on complex queries and world knowledge | 1,923 |
| llava-vl/llava-interactive-demo | An all-in-one demo for interactive image processing and generation | 353 |
| ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| rksm/org-ai | A minor mode in Emacs for integrating generative AI models into text editing, with support for speech input/output and image generation | 710 |
| noelyahan/mergi | A Go library and command-line tool for manipulating images | 236 |
| hjprint/university_project_vmd_majorization | A MATLAB implementation of a variable-model data fusion algorithm for removing noise from images and generating denoised images | 81 |
| microsoft/llava-med | A research project aimed at building large language and vision models for biomedical applications with capabilities comparable to GPT-4 | 1,622 |
| google-research/xmcgan_image_generation | This implementation enables text-to-image generation by leveraging cross-modal contrastive learning | 98 |
| dvlab-research/prompt-highlighter | An interactive control system for text generation in multi-modal language models | 135 |
| nvlabs/eagle | Develops high-resolution multimodal LLMs by combining vision encoders and various input resolutions | 549 |
| open3da/ll3da | An interactive system for understanding and interacting with 3D environments using natural language | 255 |
| alibaba/conv-llava | This project presents an optimization technique for large-scale image models to reduce computational requirements while maintaining performance | 106 |
| opengvlab/multi-modality-arena | An evaluation platform for comparing multi-modality models on visual question-answering tasks | 478 |
| vpgtrans/vpgtrans | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs | 270 |