LL3DA

3D assistant

An interactive system for understanding and interacting with 3D environments using natural language.

[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.

GitHub

255 stars
6 watching
10 forks
Language: Python
Last commit: 6 months ago
Topics: 3d, 3d-models, 3d-to-text, cvpr2024, gpt, instruction-tuning, language-model, llm, multi-modal, scene-understanding

Related projects:

Repository | Description | Stars
umass-foundation-model/3d-llm | Developing a Large Language Model capable of processing 3D representations as inputs | 979
dvlab-research/llmga | An implementation of a multimodal generation assistant using large language models and various image editing techniques. | 463
openm3d/m3dbench | An open-source software project providing a comprehensive 3D instruction-following dataset with multi-modal prompts for training large language models. | 58
openfl/away3d | An open source platform for developing interactive 3D graphics. | 209
batra-mlp-lab/visdial | A system for an AI agent to engage in natural dialog about visual content using a combination of encoder and decoder architectures. | 228
agenta-ai/agenta | An end-to-end platform for building and deploying large language model applications | 1,624
airaria/visual-chinese-llama-alpaca | Develops a multimodal Chinese language model with visual capabilities | 429
aidc-ai/ovis | An MLLM architecture designed to align visual and textual embeddings through structural alignment | 575
openrobotlab/pointllm | An open-source software framework that enables large language models to process and understand point cloud data, facilitating multimodal interactions. | 670
gulvarol/surreal | This project involves generating synthetic human data to train 3D models of human appearance and behavior. | 590
harfang3d/harfang3d | An all-in-one 3D visualization library for C++, Python, Lua, and Go. | 586
opengvlab/lamm | A framework and benchmark for training and evaluating multi-modal large language models, enabling the development of AI agents capable of seamless interaction between humans and machines. | 305
microsoft/llava-med | A research project aimed at building large language and vision models for biomedical applications with capabilities comparable to GPT-4. | 1,622
pfirsich/kaun | A Lua module for 3D graphics intended to provide a low-level API for abstracting away OpenGL details and enabling advanced techniques without the need for significant modifications to an existing game engine. | 7
groverburger/g3d | Simplifies 3D rendering in the LÖVE game engine using Lua. | 578