IMAD
Dialogue analyzer
A toolkit for analyzing and generating multi-modal dialogue with images
[AINL 2023] IMAD: IMage Augmented multi-modal Dialogue
4 stars
1 watching
0 forks
Language: Python
Last commit: over 2 years ago
Topics: dataset, deep-learning, dialogue-systems, image2text, multimodal, multimodal-deep-learning
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Develops a multimodal task and dataset to assess vision-language models' ability to handle interleaved image-text inputs. | 33 |
| | An implementation of a general-purpose robot learning model using multimodal prompts. | 781 |
| | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously. | 15 |
| | A deep learning model for analyzing sentiment and emotion in text based on emojis. | 1,525 |
| | Provides tools and features to support development in the Ada programming language within Vim/NeoVim text editors. | 7 |
| | An Ember addon for building modal dialogs using a consistent pattern and layout approach. | 390 |
| | An all-in-one demo for interactive image processing and generation. | 353 |
| | A collection of resources and tools for video analysis using deep learning and multi-modal learning techniques. | 767 |
| | Extends pretrained models to handle multiple modalities by aligning language and video representations. | 751 |
| | An application that enables users to upload documents and converse with an AI-powered language model. | 9 |
| | A deep learning-based framework for image aesthetics assessment using a convolutional neural network structure. | 112 |
| | Develops multimodal instruction-following models for open-ended dialogues across multiple images | 43 |
| | A large-scale 5D semantics benchmark for autonomous driving. | 171 |
| | A collection of modular deep learning components that can be easily configured and reused in various applications. | 276 |
| | An intelligent system that enables automatic control and utilization of visual foundation models to interact with images in conversational settings. | 762 |