visdial

Dialog Agent

A system that lets an AI agent hold a natural-language dialog about visual content, built from a combination of encoder and decoder architectures.

[CVPR 2017] Torch code for Visual Dialog

GitHub

228 stars
18 watching
69 forks
Language: Lua
last commit: almost 6 years ago
Tags: computer-vision, deep-learning, natural-language-processing, torch

Related projects:

Repository | Description | Stars
open3da/ll3da | An interactive system for understanding and interacting with 3D environments using natural language | 248
agenta-ai/agenta | A developer platform for building and deploying large language models | 1,275
ucsc-vlaa/sight-beyond-text | An official implementation of a research paper exploring the use of multi-modal training to enhance language models' truthfulness and ethics in various applications | 19
blazored/modal | A reusable UI component for displaying a customizable dialog in Blazor applications | 785
dialogflow/dialogflow-ruby-client | A Ruby SDK for interacting with the Dialogflow API natural language processing service | 141
geek-ai/magent | A platform for multi-agent reinforcement learning research and development | 1,690
macournoyer/neuralconvo | An implementation of a conversational model using sequence-to-sequence learning and LSTM layers in Torch | 777
dvlab-research/prompt-highlighter | An interactive control system for text generation in multi-modal language models | 132
allenai/visprog | A system that uses code generation and execution to solve complex visual tasks from natural language instructions | 693
airaria/visual-chinese-llama-alpaca | A multimodal Chinese language model with visual capabilities | 424
gulvarol/surreal | A project that generates synthetic human data to train 3D models of human appearance and behavior | 588
lxtgh/omg-seg | An end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model | 1,300
vinhnx/inkchatgpt | An application that enables users to upload documents and converse with an AI-powered language model | 9
mlpc-ucsd/bliva | A multimodal LLM designed to handle text-rich visual questions | 269
fmaclen/hollama | A web application for interacting with a conversational AI system using large language models | 477