lens
Image captioner
Enhances language models to generate text based on visual descriptions of images
This is the official repository for the LENS (Large Language Models Enhanced to See) system.
351 stars
9 watching
12 forks
Language: Jupyter Notebook
last commit: 12 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
jiasenlu/adaptiveattention | Adaptive attention mechanism for image captioning using visual sentinels | 334 |
chapternewscu/image-captioning-with-semantic-attention | A deep learning model for generating image captions with semantic attention | 51 |
luoweizhou/vlp | A project for pre-training models to support image captioning and question answering tasks. | 412 |
lukemelas/image-paragraph-captioning | Trains image paragraph captioning models to generate diverse and accurate captions | 90 |
vision-cair/chatcaptioner | Enables automatic generation of descriptive text from images and videos based on user input. | 452 |
fengyang0317/unsupervised_captioning | An unsupervised image captioning framework that allows generating captions from images without paired data. | 215 |
cshizhe/asg2cap | An image caption generation model that uses abstract scene graphs for fine-grained control over generated captions | 200 |
anonymousanoy/fohe | Automates the process of generating multiple rewritten image captions by fine-tuning large vision-language models | 7 |
rmokady/clip_prefix_caption | An approach to image captioning that leverages the CLIP model and fine-tunes a language model without requiring additional supervision or object annotation. | 1,315 |
byungkwanlee/moai | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 311 |
isekai-portal/link-context-learning | An implementation of a multimodal learning approach to improve language models' ability to recognize unseen images and understand novel concepts. | 89 |
nickjiang2378/vl-interp | This project provides an official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 31 |
luciancaetano/lens-ui | A React UI component library designed to be simple and customizable | 8 |
stevenfontanella/microlens | A lightweight alternative to the lens library with fewer dependencies and no Template Haskell support | 285 |
commaai/commacoloring | An online coloring book with interactive features | 101 |