Dolphins
Driving Model
A multimodal language model designed to process video and text data for driving scenarios
[ECCV 2024] The official code for "Dolphins: Multimodal Language Model for Driving"
51 stars
3 watching
9 forks
Language: Python
last commit: 7 months ago
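The listing does not include usage instructions, but as a rough illustration of what "processing video and text data for driving scenarios" involves in practice, the sketch below shows how a driving clip and an instruction might be prepared for a video-language model. The frame-sampling helper, the `<image>` placeholder tokens, and the prompt template are illustrative assumptions, not the Dolphins API; consult the repository for the actual interface.

```python
# Hypothetical sketch (NOT the Dolphins API): preparing a video + text query
# for a multimodal driving model. Assumes OpenCV is available.
import cv2


def sample_frames(video_path: str, num_frames: int = 8):
    """Uniformly sample up to `num_frames` RGB frames from a driving clip."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(total // num_frames, 1)
    frames = []
    for idx in range(0, max(total, 1), step):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            # OpenCV decodes frames as BGR; convert to RGB for model input.
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if len(frames) == num_frames:
            break
    cap.release()
    return frames


def build_prompt(instruction: str, num_frames: int) -> str:
    """Interleave per-frame image placeholders with the text instruction."""
    image_tokens = "<image>" * num_frames
    return f"{image_tokens} USER: {instruction} ASSISTANT:"


if __name__ == "__main__":
    frames = sample_frames("driving_clip.mp4")  # hypothetical input clip
    prompt = build_prompt("Describe the hazards ahead and suggest an action.",
                          len(frames))
    print(prompt)
    # The frames and prompt would then be passed to the model's processor and
    # generation call; the exact interface depends on the Dolphins codebase.
```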
Related projects:

| Repository | Description | Stars |
|---|---|---|
| | A large-scale 5D semantics benchmark for autonomous driving | 171 |
| | An autonomous driving project exploring the capabilities of a visual-language model in understanding complex driving scenes and making decisions | 288 |
| | Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types | 143 |
| | A deep learning module for stateful imitation learning in autonomous driving simulations | 5 |
| | A large multi-modal model built on the Llama3 language model, designed to improve image understanding capabilities | 32 |
| | A deep learning framework for training multi-modal models with vision and language capabilities | 1,299 |
| | An implementation of a unified modal learning framework for generative vision-language models | 43 |
| | Designs and simulates autonomous vehicle control systems using Python and the Carla simulator | 117 |
| | An end-to-end image captioning system that uses large multi-modal models and provides tools for training, inference, and demo usage | 1,849 |
| | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| | A multimodal LLM designed to handle text-rich visual questions | 270 |
| | An interactive control system for text generation in multi-modal language models | 135 |
| | A family of large multimodal models supporting multimodal conversational capabilities and text-to-image generation in multiple languages | 1,098 |
| | A conditional imitation learning policy implemented in PyTorch for autonomous driving using the Carla dataset | 65 |
| | A MATLAB implementation of a car-following model | 48 |