DeepSeek-VL

Vision-Language Model

A multimodal AI model that enables real-world vision-language understanding applications

DeepSeek-VL: Towards Real-World Vision-Language Understanding

GitHub

2k stars
20 watching
202 forks
Language: Python
last commit: 9 months ago
foundation-models, vision-language-model, vision-language-pretraining

Related projects:

Repository | Description | Stars
deepseek-ai/deepseek-llm | A large language model trained on a massive dataset for various applications | 1,512
deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models | 1,024
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities | 1,299
darshandeshpande/jax-models | Provides a collection of deep learning models and utilities in JAX/Flax for research purposes | 151
vishaal27/sus-x | An open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required | 94
baaivision/eve | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 246
abbypa/nnproject_deepmask | A deep learning implementation of an object segmentation algorithm | 187
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,789
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24
meituan-automl/mobilevlm | An implementation of a vision language model designed for mobile devices, utilizing a lightweight downsample projector and pre-trained language models | 1,076
lackel/agla | Improves large vision-language models' ability to accurately describe images by combining global and local attention mechanisms | 18
deepset-ai/farm | An open-source framework for adapting representation models to various tasks and industries | 1,743
ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 585
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models | 88
jiutian-vl/jiutian-lion | Integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations | 124