TextBind
Conversational AI framework
Enables larger language models to generate multi-turn multimodal instruction-response conversations from image-caption pairs with minimal annotations.
[2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild
48 stars
3 watching
3 forks
Language: Python
last commit: about 1 year ago
Related projects:
Repository | Description | Stars |
---|---|---|
microsoft/mpnet | Develops a method for pre-training language understanding models by combining masked and permuted techniques, and provides code for implementation and fine-tuning. | 288 |
pku-yuangroup/languagebind | Extending pretraining models to handle multiple modalities by aligning language and video representations | 723 |
csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
tiger-ai-lab/uniir | Trains and evaluates a universal multimodal retrieval model to perform various information retrieval tasks. | 110 |
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24 |
qinbinli/moon | A framework for collaborative machine learning model training that leverages similarity between model representations to correct local training. | 263 |
fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 84 |
openai/finetune-transformer-lm | Provides code and a model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,160 |
mbzuai-nlp/bactrian-x | A collection of multilingual language models trained on a dataset of instructions and responses in various languages. | 94 |
mbzuai-llm/web2code | A dataset and framework for training large multimodal language models on webpage-to-code generation tasks | 62 |
vishaal27/sus-x | This is an open-source project that proposes a novel method to train large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |
chendelong1999/polite-flamingo | Develops training methods to improve the politeness and natural flow of multi-modal Large Language Models | 63 |
openbmb/cpm-live | A live training platform for large-scale deep learning models, allowing community participation and collaboration in model development and deployment. | 511 |
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models. | 85 |
open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,179 |