DeepSeek-VL
Vision-Language Model
A multimodal AI model that enables real-world vision-language understanding applications
DeepSeek-VL: Towards Real-World Vision-Language Understanding
2k stars
20 watching
202 forks
Language: Python
last commit: 10 months ago
Tags: foundation-models, vision-language-model, vision-language-pretraining
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A large language model trained on a massive dataset for various applications | 1,512 |
| | A large language model with improved efficiency and performance compared to similar models | 1,024 |
| | A deep learning framework for training multi-modal models with vision and language capabilities. | 1,299 |
| | Provides a collection of deep learning models and utilities in JAX/Flax for research purposes. | 151 |
| | This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |
| | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 246 |
| | A deep learning implementation of an object segmentation algorithm. | 187 |
| | A guide to using pre-trained large language models in source code analysis and generation | 1,789 |
| | Develops and trains models for vision-language learning with decoupled language pre-training | 24 |
| | An implementation of a vision language model designed for mobile devices, utilizing a lightweight downsample projector and pre-trained language models. | 1,076 |
| | Improves large vision-language models' ability to accurately describe images by combining global and local attention mechanisms. | 18 |
| | An open-source framework for adapting representation models to various tasks and industries | 1,743 |
| | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| | An annotated preference dataset and training framework for improving large vision language models. | 88 |
| | This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations. | 124 |