MAGIC
Visual Guidance Model
Enables language models to generate text from visual inputs and to caption images without additional training or labeled data.
Language Models Can See: Plugging Visual Controls in Text Generation
254 stars
11 watching
27 forks
Language: Python
last commit: over 2 years ago
Tags: clip, gpt-2, image-captioning, multimodal, plug-and-play-language-models, story-generation, text-generation, unsupervised-learning, zero-shot