LISA
Image segmentation tool
A system that uses large language models to generate segmentation masks for images based on complex queries and world knowledge.
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
2k stars
11 watching
131 forks
Language: Python
Last commit: 7 months ago
Tags: large-language-model, llm, multi-modal, segmentation
Related projects:
Repository | Description | Stars |
---|---|---|
dvlab-research/llama-vid | A visual language model that pairs a large language model with visual and text features to understand images and videos | 748 |
opengvlab/visionllm | A large language model designed to process and generate visual information | 956 |
thelegendali/deeplab-context | An implementation of a deep learning system for semantic image segmentation using a combination of convolutional neural networks and conditional random fields. | 239 |
balcilar/drlse-image-segmentation | A method for image segmentation using level sets with a distance-regularization term that removes the need for re-initialization | 89 |
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities. | 1,299 |
dvlab-research/llmga | An implementation of a multimodal generation assistant using large language models and various image editing techniques. | 463 |
dvlab-research/prompt-highlighter | An interactive control system for text generation in multi-modal language models | 135 |
zhengpeng7/birefnet | An open-source implementation of an image segmentation model that combines background removal and object detection capabilities. | 1,484 |
esa-philab/iris | A tool for manually segmenting images from satellite data with AI assistance | 141 |
abbypa/nnproject_deepmask | A deep learning implementation of an object segmentation algorithm. | 187 |
yfzhang114/llava-align | Debiasing techniques to minimize hallucinations in large visual language models | 75 |
kreshuklab/plant-seg | A tool for cell instance-aware segmentation in densely packed 3D volumetric images | 101 |
evolvinglmms-lab/longva | An open-source project that transfers long-context processing ability from language models to vision tasks. | 347 |
labforcomputationalvision/matlabpyrtools | Tools for multi-scale image processing and analysis | 180 |
freedomintelligence/longllava | A system for scaling large language models to process and understand visual information from multiple images efficiently. | 183 |