FoolyourVLLMs
Attack framework
An attack framework to manipulate the output of large language models and vision-language models
[ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
14 stars
1 watching
2 forks
Language: Python
last commit: about 1 year ago
Topics: adversarial-attacks, llms, mcq, vision-and-language
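The paper's core idea is that the multiple-choice answers of (vision-)language models can often be flipped simply by reordering the answer options. Below is a minimal, illustrative sketch of such a permutation attack; it does not use this repository's actual API, and `query_model` is a hypothetical stand-in for any callable that sends a prompt to a model and returns its chosen option letter.

```python
from itertools import permutations

# Illustrative sketch only (not the repository's API): search over
# answer-option orderings for one that makes the model pick a wrong option.

LETTERS = "ABCDEFGH"

def format_prompt(question, options):
    """Render a multiple-choice prompt with lettered options."""
    lines = [question]
    lines += [f"{LETTERS[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

def permutation_attack(question, options, correct_idx, query_model):
    """Try every ordering of the options; return the first permutation
    under which the model's answer maps back to a wrong option.
    `query_model(prompt)` is a hypothetical callable returning a letter."""
    for perm in permutations(range(len(options))):
        shuffled = [options[i] for i in perm]          # reorder the choices
        letter = query_model(format_prompt(question, shuffled)).strip().upper()[0]
        predicted_original_idx = perm[LETTERS.index(letter)]
        if predicted_original_idx != correct_idx:
            return perm, letter                        # adversarial ordering found
    return None                                        # robust to all orderings
```

The exhaustive search is factorial in the number of options, which is cheap for typical 4- or 5-way MCQ benchmarks; a sampled subset of permutations would be the natural fallback for longer option lists.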
Related projects:
| Repository | Description | Stars |
|---|---|---|
| yunqing-me/attackvlm | An adversarial attack framework targeting large vision-language models | 161 |
| ys-zong/vlguard | Improves the safety and helpfulness of large language models by fine-tuning them on safety-critical tasks | 45 |
| ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning | 28 |
| hfzhang31/a3fl | A framework for attacking federated learning systems with adaptive backdoor attacks | 22 |
| ethz-spylab/rlhf_trojan_competition | Detecting backdoors in language models to prevent malicious AI usage | 107 |
| yuxie11/r2d2 | A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese | 157 |
| jeremy313/fl-wbc | A defense mechanism against model poisoning attacks in federated learning | 37 |
| junyizhu-ai/r-gap | A tool to demonstrate and analyze gradient-based attacks on private data in machine learning models | 34 |
| zjunlp/knowlm | A framework for training and using large language models with knowledge-augmentation capabilities | 1,239 |
| weisong-ucr/mab-malware | An open-source reinforcement learning framework for generating adversarial examples against malware classification models | 40 |
| lhfowl/robbing_the_fed | Lets an attacker obtain user data directly from federated learning gradient updates by modifying the shared model architecture | 23 |
| yiyangzhou/lure | Analyzes and mitigates object hallucination in large vision-language models to improve accuracy and reliability | 134 |
| kaiyuanzh/flip | A framework for defending against backdoor attacks in federated learning systems | 44 |
| yuliang-liu/monkey | A toolkit for building conversational AI models that process image and text inputs | 1,825 |
| jind11/textfooler | A tool for generating adversarial examples to attack text classification and inference models | 494 |