ChatBridge

Multimodal Model

A unified multimodal language model capable of interpreting and reasoning about various modalities without paired data.

ChatBridge is an approach to learning a unified multimodal model that interprets, correlates, and reasons about various modalities without relying on all combinations of paired data.

GitHub

47 stars
2 watching
1 fork
Language: Python
last commit: about 1 year ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| thunlp/muffin | A framework for building multimodal foundation models that serve as bridges between different modalities and language models | 57 |
| 42wim/matterbridge | A bridge that connects multiple chat protocols to a unified interface | 6,656 |
| andrewnguonly/chatabstractions | A framework for creating custom chat models with dynamic failover and load balancing | 79 |
| multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 14 |
| langboat/mengzi3 | 8B and 13B language models based on the Llama architecture with multilingual capabilities | 2,032 |
| mainframecomputer/fullmoon-ios | An iOS app that provides a chat interface to local large language models, optimized for Apple silicon | 410 |
| openbmb/viscpm | A family of large multimodal models supporting multimodal conversation and text-to-image generation in multiple languages | 1,089 |
| yuliang-liu/monkey | A toolkit for building conversational AI models that process image and text inputs | 1,825 |
| lyuchenyang/macaw-llm | A multi-modal language model that integrates image, video, audio, and text data to improve language understanding and generation | 1,550 |
| deltachat-bot/matterdelta | A tool that enables communication between Delta Chat and other supported chat services using Matterbridge | 13 |
| xverse-ai/xverse-v-13b | A large multimodal model for visual question answering, trained on 2.1B image-text pairs and 8.2M instruction sequences | 77 |
| kohjingyu/fromage | A framework for grounding language models to images and handling multimodal inputs and outputs | 478 |
| tele-ai/telechat-52b | An open-source chat model built on a 52B large language model, with improvements in position encoding, activation function, and layer normalization | 40 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities, fine-tuned for various tasks | 72 |
| mshukor/unival | A unified model for image, video, audio, and language tasks that can be fine-tuned for various downstream applications | 224 |