Sight-Beyond-Text

Multi-modal LLM training

This repository provides the official implementation of the paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics", which explores how multi-modal training improves the truthfulness and ethics of large language models across a range of applications.
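Given the repository's llava and vicuna tags, the training setup presumably follows the LLaVA-style recipe of projecting frozen vision-encoder features into the LLM's token-embedding space. The sketch below illustrates that general idea only; the dimensions and the `VisionProjector` module are hypothetical stand-ins, not code from this repository.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: a CLIP ViT-L/14 encoder emits 1024-d patch
# features, and a Vicuna-7B LLM uses 4096-d token embeddings.
VISION_DIM, LLM_DIM, NUM_PATCHES = 1024, 4096, 256

class VisionProjector(nn.Module):
    """Maps vision-encoder patch features into the LLM's token-embedding
    space so image "tokens" can be prepended to the text prompt."""

    def __init__(self, vision_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (batch, num_patches, vision_dim)
        return self.proj(patch_feats)  # (batch, num_patches, llm_dim)

# Toy usage with random tensors standing in for real encoder/LLM outputs.
projector = VisionProjector(VISION_DIM, LLM_DIM)
image_feats = torch.randn(1, NUM_PATCHES, VISION_DIM)  # encoder output stand-in
text_embeds = torch.randn(1, 32, LLM_DIM)              # embedded prompt stand-in
llm_inputs = torch.cat([projector(image_feats), text_embeds], dim=1)
print(llm_inputs.shape)  # torch.Size([1, 288, 4096])
```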

GitHub stats: 19 stars · 2 watching · 1 fork
Language: Python
Last commit: about 1 year ago
Topics: ai-alignment, alignment, llama2, llava, llm, mllm, vicuna, vision-language, vlm

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| mlpc-ucsd/bliva | A multimodal LLM designed to handle text-rich visual questions | 269 |
| ucsc-vlaa/vllm-safety-benchmark | A benchmark for evaluating the safety and robustness of vision-language models against adversarial attacks | 67 |
| vpgtrans/vpgtrans | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs | 269 |
| ailab-cvc/seed | An implementation of a multimodal language model with capabilities for both comprehension and generation | 576 |
| lyuchenyang/macaw-llm | A multi-modal language model that integrates image, video, audio, and text data to improve language understanding and generation | 1,550 |
| aidc-ai/ovis | An architecture designed to align visual and textual embeddings in multimodal learning | 517 |
| mbzuai-oryx/groundinglmm | An end-to-end trained model that generates natural language responses integrated with object segmentation masks | 781 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that combines natural language and visual capabilities, with fine-tuning for various tasks | 72 |
| llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks | 704 |
| vishaal27/sus-x | A method for adapting large-scale vision-language models with minimal resources and no fine-tuning | 94 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
| salt-nlp/llavar | Enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 with multimodal datasets | 258 |
| alpha-vllm/wemix-llm | An LLaMA-based multimodal language model with various instruction-following and multimodal variants | 17 |
| bobazooba/xllm | A tool for training and fine-tuning large language models using advanced techniques | 380 |
| neulab/pangea | An open-source multilingual large language model designed to understand and generate content across diverse languages and cultural contexts | 91 |