FoolyourVLLMs
An attack framework for manipulating the output of large language models (LLMs) and vision-language models (VLMs)
[ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
14 stars · 1 watching · 2 forks
Language: Python
Last commit: about 1 year ago
Topics: adversarial-attacks, llms, mcq, vision-and-language
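The paper's core idea, per its title and the `mcq` topic tag, is that merely permuting the order of multiple-choice answer options can flip a (vision-and-)language model's prediction. A minimal sketch of such a permutation attack, where `model` is a hypothetical stand-in for a real (V)LLM call (not this repository's actual API):

```python
from itertools import permutations

def permute_options(options, answer_idx):
    """Yield every reordering of the MCQ options together with the
    index where the correct answer lands after the shuffle."""
    for perm in permutations(range(len(options))):
        shuffled = [options[i] for i in perm]
        yield shuffled, perm.index(answer_idx)

def attack(question, options, answer_idx, model):
    """Return the first option ordering on which `model` answers
    incorrectly, or None if it is robust to every permutation.
    `model(question, options) -> predicted index` is a placeholder
    for querying an actual model."""
    for shuffled, gold in permute_options(options, answer_idx):
        if model(question, shuffled) != gold:
            return shuffled
    return None

# Toy "model" with a pure positional bias: it always picks option A.
always_a = lambda question, options: 0

adv = attack("2 + 2 = ?", ["4", "5", "6", "7"], answer_idx=0, model=always_a)
# Any ordering that moves "4" out of slot A fools this toy model.
```

Exhaustive enumeration is feasible only for the usual 4-5 options (24-120 orderings); the point of the paper is that even this embarrassingly simple search suffices to break real models.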
Related projects:
| Repository | Description | Stars |
|---|---|---|
| yunqing-me/attackvlm | An adversarial attack framework on large vision-language models | 165 |
| ys-zong/vlguard | Improves safety and helpfulness of large language models by fine-tuning them on safety-critical tasks | 47 |
| ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 31 |
| hfzhang31/a3fl | A framework for attacking federated learning systems with adaptive backdoor attacks | 23 |
| ethz-spylab/rlhf_trojan_competition | Detecting backdoors in language models to prevent malicious AI usage | 109 |
| yuxie11/r2d2 | A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese | 157 |
| jeremy313/fl-wbc | A defense mechanism against model poisoning attacks in federated learning | 37 |
| junyizhu-ai/r-gap | A tool to demonstrate and analyze gradient-based attacks on private data in machine learning models | 34 |
| zjunlp/knowlm | A framework for training and using large language models with knowledge augmentation capabilities | 1,251 |
| weisong-ucr/mab-malware | An open-source reinforcement learning framework for generating adversarial examples against malware classification models | 41 |
| lhfowl/robbing_the_fed | Lets an attacker obtain user data directly from federated learning gradient updates by modifying the shared model architecture | 23 |
| yiyangzhou/lure | Analyzes and mitigates object hallucination in large vision-language models to improve their accuracy and reliability | 136 |
| kaiyuanzh/flip | A framework for defending against backdoor attacks in federated learning systems | 48 |
| yuliang-liu/monkey | An end-to-end image captioning system built on large multi-modal models, with tools for training, inference, and demos | 1,849 |
| jind11/textfooler | A tool for generating adversarial examples to attack text classification and inference models | 496 |