LL3DA

3D assistant

A system for understanding, reasoning about, and planning in 3D environments through natural-language interaction.

[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.

GitHub

255 stars

6 watching

10 forks

Language: Python

Last commit: about 1 year ago

Topics: 3d, 3d-models, 3d-to-text, cvpr2024, gpt, instruction-tuning, language-model, llm, multi-modal, scene-understanding

ll3da.github.io/

Related projects:

Repository	Description	Stars
umass-foundation-model/3d-llm	Developing a Large Language Model capable of processing 3D representations as inputs	979
dvlab-research/llmga	An implementation of a multimodal generation assistant using large language models and various image editing techniques.	463
openm3d/m3dbench	An open-source software project providing a comprehensive 3D instruction-following dataset with multi-modal prompts for training large language models.	58
openfl/away3d	An open source platform for developing interactive 3D graphics.	209
batra-mlp-lab/visdial	A system for an AI agent to engage in natural dialog about visual content using a combination of encoder and decoder architectures.	228
agenta-ai/agenta	An end-to-end platform for building and deploying large language model applications	1,624
airaria/visual-chinese-llama-alpaca	Develops a multimodal Chinese language model with visual capabilities	429
aidc-ai/ovis	An MLLM architecture designed to align visual and textual embeddings through structural alignment	575
openrobotlab/pointllm	An open-source software framework that enables large language models to process and understand point cloud data, facilitating multimodal interactions.	670
gulvarol/surreal	This project involves generating synthetic human data to train 3D models of human appearance and behavior.	590
harfang3d/harfang3d	An all-in-one 3D visualization library for C++, Python, Lua, and Go.	586
opengvlab/lamm	A framework and benchmark for training and evaluating multi-modal large language models, enabling the development of AI agents capable of seamless interaction between humans and machines.	305
microsoft/llava-med	A research project aimed at building large language and vision models for biomedical applications with capabilities comparable to GPT-4.	1,622
pfirsich/kaun	A Lua module for 3D graphics intended to provide a low-level API for abstracting away OpenGL details and enabling advanced techniques without the need for significant modifications to an existing game engine.	7
groverburger/g3d	Simplifies 3D rendering in the LÖVE game engine using Lua.	578