DeepSeek-VL

Vision-Language Model

A multimodal AI model that enables real-world vision-language understanding applications

DeepSeek-VL: Towards Real-World Vision-Language Understanding

GitHub

2k stars
19 watching
195 forks
Language: Python
Last commit: 7 months ago
foundation-models, vision-language-model, vision-language-pretraining

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| deepseek-ai/deepseek-llm | A large language model trained on a massive dataset for various applications | 1,450 |
| deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models | 1,006 |
| nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities | 1,298 |
| darshandeshpande/jax-models | Provides a collection of deep learning models and utilities in JAX/Flax for research purposes | 151 |
| vishaal27/sus-x | This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required | 94 |
| baaivision/eve | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 230 |
| abbypa/nnproject_deepmask | A deep learning implementation of an object segmentation algorithm | 187 |
| vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,782 |
| yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24 |
| meituan-automl/mobilevlm | An implementation of a vision language model designed for mobile devices, utilizing a lightweight downsample projector and pre-trained language models | 1,039 |
| lackel/agla | Improving large vision-language models to accurately describe images without generating fictional objects | 15 |
| deepset-ai/farm | An open-source framework for adapting representation models to various tasks and industries | 1,741 |
| ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 576 |
| vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models | 85 |
| jiutian-vl/jiutian-lion | This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations | 121 |