DeepSeek-VL

Vision-Language Model

A multimodal AI model that enables real-world vision-language understanding applications

DeepSeek-VL: Towards Real-World Vision-Language Understanding

GitHub

2k stars

20 watching

202 forks

Language: Python

last commit: about 2 years ago

foundation-models, vision-language-model, vision-language-pretraining

huggingface.co/spaces/deepseek-ai/DeepSeek-VL-7B

Related projects:

Repository	Description	Stars
deepseek-ai/deepseek-llm	A large language model trained on a massive dataset for various applications	1,512
deepseek-ai/deepseek-moe	A large language model with improved efficiency and performance compared to similar models	1,024
nvlabs/prismer	A deep learning framework for training multi-modal models with vision and language capabilities.	1,299
darshandeshpande/jax-models	Provides a collection of deep learning models and utilities in JAX/Flax for research purposes.	151
vishaal27/sus-x	Proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required.	94
baaivision/eve	A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities	246
abbypa/nnproject_deepmask	A deep learning implementation of an object segmentation algorithm.	187
vhellendoorn/code-lms	A guide to using pre-trained large language models in source code analysis and generation	1,789
yiren-jian/blitext	Develops and trains models for vision-language learning with decoupled language pre-training	24
meituan-automl/mobilevlm	An implementation of a vision language model designed for mobile devices, utilizing a lightweight downsample projector and pre-trained language models.	1,076
lackel/agla	Improves large vision-language models' ability to accurately describe images by combining global and local attention mechanisms.	18
deepset-ai/farm	An open-source framework for adapting representation models to various tasks and industries	1,743
ailab-cvc/seed	An implementation of a multimodal language model with capabilities for comprehension and generation	585
vlf-silkie/vlfeedback	An annotated preference dataset and training framework for improving large vision language models.	88
jiutian-vl/jiutian-lion	Integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations.	124