DeepSeek-VL
Vision-Language Model
A multimodal AI model that enables real-world vision-language understanding applications
DeepSeek-VL: Towards Real-World Vision-Language Understanding
2k stars
19 watching
195 forks
Language: Python
Last commit: 7 months ago
Tags: foundation-models, vision-language-model, vision-language-pretraining
Related projects:
Repository | Description | Stars |
---|---|---|
deepseek-ai/deepseek-llm | A large language model trained on a massive dataset for various applications | 1,450 |
deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models | 1,006 |
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities | 1,298 |
darshandeshpande/jax-models | A collection of deep learning models and utilities in JAX/Flax for research purposes | 151 |
vishaal27/sus-x | Proposes a method to train large-scale vision-language models with minimal resources and no fine-tuning | 94 |
baaivision/eve | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 230 |
abbypa/nnproject_deepmask | A deep learning implementation of an object segmentation algorithm | 187 |
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,782 |
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24 |
meituan-automl/mobilevlm | An implementation of a vision-language model designed for mobile devices, using a lightweight downsample projector and pre-trained language models | 1,039 |
lackel/agla | Improving large vision-language models to accurately describe images without generating fictional objects | 15 |
deepset-ai/farm | An open-source framework for adapting representation models to various tasks and industries | 1,741 |
ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 576 |
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision-language models | 85 |
jiutian-vl/jiutian-lion | Integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations | 121 |