M3DBench
3D dataset
An open-source software project providing a comprehensive 3D instruction-following dataset with multi-modal prompts for training large language models.
[ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.
58 stars
5 watching
2 forks
Language: Python
Last commit: 5 months ago
Tags: 3d, dataset, instruction-tuning, llm, mlm, multi-modal
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Provides a 3D layout dataset and annotation tools for training and testing 3D layout models. | 27 |
| | Provides a modular framework and tools for working with 3D human parametric models in computer vision and graphics | 1,253 |
| | Developing a Large Language Model capable of processing 3D representations as inputs | 979 |
| | A collection of benchmarks to evaluate the multi-modal understanding capability of large vision language models. | 168 |
| | Enables machine learning on three-dimensional molecular structure by providing tools and datasets for working with 3D molecular data | 303 |
| | A benchmark for evaluating large language models in multiple languages and formats | 93 |
| | A dataset of 2D images and 3D data generated from the Grand Theft Auto game engine for object localization research. | 135 |
| | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,196 |
| | A curated list of large machine learning models tracked over time | 341 |
| | A collection of tools and modeling code for a large multilingual Natural Language Understanding dataset | 541 |
| | Automates data generation and model training for improving MLLM capabilities | 39 |
| | An interactive system for understanding and interacting with 3D environments using natural language. | 255 |
| | Provides MATLAB code and dataset for training machine learning models in millimeter wave and massive MIMO systems | 162 |
| | A collection of data and tools for training algorithms to estimate dense depth in urban environments. | 497 |
| | A family of large multimodal models supporting multimodal conversational capabilities and text-to-image generation in multiple languages | 1,098 |