vllm-safety-benchmark
Vision model safety test
A benchmark for evaluating the safety and robustness of vision language models against adversarial attacks.
[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"
72 stars
4 watching
3 forks
Language: Python
Last commit: almost 2 years ago
Topics: adversarial-attacks, benchmark, datasets, llm, multimodal-llm, robustness, safety, vision-language-model
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 84 |
| | An implementation of a multimodal LLM training paradigm to enhance truthfulness and ethics in language models | 19 |
| | Improves safety and helpfulness of large language models by fine-tuning them using safety-critical tasks | 47 |
| | A toolkit to detect and protect against vulnerabilities in large language models | 122 |
| | A set of tools and guidelines for assessing the security vulnerabilities of language models in AI applications | 28 |
| | A unified benchmark for safe reinforcement learning algorithms and environments | 410 |
| | A toolkit for assessing trustworthiness in large language models | 491 |
| | Evaluates and benchmarks the robustness of deep learning models to various corruptions and perturbations in computer vision tasks | 1,030 |
| | A large language model designed to process and generate visual information | 956 |
| | A PyTorch implementation of an enhanced vision-language model | 93 |
| | A multimodal LLM designed to handle text-rich visual questions | 270 |
| | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 246 |
| | A system that uses large language models to generate segmentation masks for images based on complex queries and world knowledge | 1,923 |
| | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision | 1,236 |
| | A benchmark for evaluating large language models' ability to process multimodal input | 322 |