visdial
Dialog Agent
A system for an AI agent to engage in natural dialog about visual content using a combination of encoder and decoder architectures.
[CVPR 2017] Torch code for Visual Dialog
228 stars
18 watching
69 forks
Language: Lua
last commit: almost 6 years ago
Topics: computer-vision, deep-learning, natural-language-processing, torch
Related projects:
Repository | Description | Stars |
---|---|---|
open3da/ll3da | An interactive system for understanding and interacting with 3D environments using natural language. | 248 |
agenta-ai/agenta | A developer platform for building and deploying large language models | 1,275 |
ucsc-vlaa/sight-beyond-text | This repository provides an official implementation of a research paper exploring the use of multi-modal training to enhance language models' truthfulness and ethics in various applications. | 19 |
blazored/modal | A reusable UI component for displaying a customizable dialog in Blazor applications | 785 |
dialogflow/dialogflow-ruby-client | A Ruby SDK for interacting with the Dialogflow API natural language processing service. | 141 |
geek-ai/magent | A platform for multi-agent reinforcement learning research and development | 1,690 |
macournoyer/neuralconvo | An implementation of a conversational model using sequence-to-sequence learning and LSTM layers in Torch | 777 |
dvlab-research/prompt-highlighter | An interactive control system for text generation in multi-modal language models | 132 |
allenai/visprog | A system that uses code generation and execution to solve complex visual tasks from natural language instructions. | 693 |
airaria/visual-chinese-llama-alpaca | Develops a multimodal Chinese language model with visual capabilities | 424 |
gulvarol/surreal | This project involves generating synthetic human data to train 3D models of human appearance and behavior. | 588 |
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model. | 1,300 |
vinhnx/inkchatgpt | An application that enables users to upload documents and converse with an AI-powered language model. | 9 |
mlpc-ucsd/bliva | A multimodal LLM designed to handle text-rich visual questions | 269 |
fmaclen/hollama | A web application for interacting with a conversational AI system using large language models. | 477 |