vllm-safety-benchmark
Vision model safety test
A benchmark for evaluating the safety and robustness of vision-language models against adversarial attacks.
[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"
67 stars
4 watching
3 forks
Language: Python
last commit: 12 months ago
Topics: adversarial-attacks, benchmark, datasets, llm, multimodal-llm, robustness, safety, vision-language-model
Related projects:
Repository | Description | Stars |
---|---|---|
aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 83 |
ucsc-vlaa/sight-beyond-text | Official implementation of a research paper exploring the use of multi-modal training to enhance language models' truthfulness and ethics | 19 |
ys-zong/vlguard | Improves safety and helpfulness of large language models by fine-tuning them using safety-critical tasks | 45 |
safellama/plexiglass | A toolkit to detect and protect against vulnerabilities in Large Language Models. | 121 |
leondz/lm_risk_cards | A set of tools and guidelines for assessing the security vulnerabilities of language models in AI applications | 25 |
pku-alignment/safety-gymnasium | A unified benchmark for safe reinforcement learning algorithms and environments. | 394 |
howiehwong/trustllm | A toolkit for assessing trustworthiness in large language models | 466 |
hendrycks/robustness | Evaluates and benchmarks the robustness of deep learning models to various corruptions and perturbations in computer vision tasks. | 1,022 |
opengvlab/visionllm | A large language model designed to process and generate visual information | 915 |
byungkwanlee/collavo | Develops a PyTorch implementation of an enhanced vision language model | 93 |
mlpc-ucsd/bliva | A multimodal LLM designed to handle text-rich visual questions | 269 |
baaivision/eve | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 230 |
dvlab-research/lisa | A system that uses large language models to generate segmentation masks for images based on complex queries and world knowledge. | 1,861 |
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,217 |
ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 315 |