LL3DA
3D assistant
An interactive system for understanding and interacting with 3D environments using natural language.
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
255 stars
6 watching
10 forks
Language: Python
last commit: 6 months ago
Tags: 3d, 3d-models, 3d-to-text, cvpr2024, gpt, instruction-tuning, language-model, llm, multi-modal, scene-understanding
Related projects:
Repository | Description | Stars |
---|---|---|
umass-foundation-model/3d-llm | Developing a Large Language Model capable of processing 3D representations as inputs | 979 |
dvlab-research/llmga | An implementation of a multimodal generation assistant using large language models and various image editing techniques. | 463 |
openm3d/m3dbench | An open-source software project providing a comprehensive 3D instruction-following dataset with multi-modal prompts for training large language models. | 58 |
openfl/away3d | An open source platform for developing interactive 3D graphics. | 209 |
batra-mlp-lab/visdial | A system for an AI agent to engage in natural dialog about visual content using a combination of encoder and decoder architectures. | 228 |
agenta-ai/agenta | An end-to-end platform for building and deploying large language model applications | 1,624 |
airaria/visual-chinese-llama-alpaca | Develops a multimodal Chinese language model with visual capabilities | 429 |
aidc-ai/ovis | An MLLM architecture designed to align visual and textual embeddings through structural alignment | 575 |
openrobotlab/pointllm | An open-source software framework that enables large language models to process and understand point cloud data, facilitating multimodal interactions. | 670 |
gulvarol/surreal | Generates synthetic human data for training 3D models of human appearance and behavior. | 590 |
harfang3d/harfang3d | An all-in-one 3D visualization library for C++, Python, Lua, and Go. | 586 |
opengvlab/lamm | A framework and benchmark for training and evaluating multi-modal large language models, enabling the development of AI agents capable of seamless interaction between humans and machines. | 305 |
microsoft/llava-med | A research project aimed at building large language and vision models for biomedical applications with capabilities comparable to GPT-4. | 1,622 |
pfirsich/kaun | A Lua module for 3D graphics intended to provide a low-level API that abstracts away OpenGL details and enables advanced techniques without significant modifications to an existing game engine. | 7 |
groverburger/g3d | Simplifies 3D rendering in the LÖVE game engine using Lua. | 578 |