FoolyourVLLMs

Attack framework

An attack framework that manipulates the outputs of large language models and vision-language models by permuting multiple-choice answer options

[ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations

GitHub: 14 stars, 1 watching, 2 forks
Language: Python
Last commit: about 1 year ago
Tags: adversarial-attacks, llms, mcq, vision-and-language
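
The attack described in the paper rests on a simple observation: reordering the answer options of a multiple-choice question can flip a model's prediction. The sketch below is not the repository's code; it is a minimal illustration of that permutation attack under assumed names, where `query_model` is a hypothetical placeholder for whatever LLM or VLM API is actually queried and the prompt format is illustrative.

```python
# Minimal sketch (not the repository's implementation): enumerate answer-option
# permutations for a multiple-choice question and check whether a model's
# prediction stays correct under every ordering.
from itertools import permutations

LETTERS = "ABCD"

def build_prompt(question: str, options: list[str]) -> str:
    # Render the question with lettered options (illustrative format).
    lines = [question] + [f"{LETTERS[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)

def query_model(prompt: str) -> str:
    # Hypothetical placeholder: replace with a real model call.
    # Here it always answers "A", mimicking the kind of position bias
    # that adversarial permutations exploit.
    return "A"

def is_permutation_robust(question: str, options: list[str], correct: str) -> bool:
    """Return True only if the model picks `correct` under every option ordering."""
    for perm in permutations(options):
        letter = query_model(build_prompt(question, list(perm)))
        predicted = perm[LETTERS.index(letter)]
        if predicted != correct:
            return False  # found an adversarial permutation
    return True

if __name__ == "__main__":
    q = "Which planet is known as the Red Planet?"
    opts = ["Venus", "Mars", "Jupiter", "Saturn"]
    print(is_permutation_robust(q, opts, correct="Mars"))
```

With the biased placeholder model, the check fails as soon as a permutation moves the correct answer out of position A, which is the behaviour the attack is designed to surface.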

Related projects:

Repository | Description | Stars
yunqing-me/attackvlm | An adversarial attack framework on large vision-language models | 161
ys-zong/vlguard | Improves the safety and helpfulness of vision large language models by fine-tuning them on safety-critical tasks | 45
ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28
hfzhang31/a3fl | A framework for attacking federated learning systems with adaptive backdoor attacks | 22
ethz-spylab/rlhf_trojan_competition | Detecting backdoors in language models to prevent malicious AI usage | 107
yuxie11/r2d2 | A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese | 157
jeremy313/fl-wbc | A defense mechanism against model poisoning attacks in federated learning | 37
junyizhu-ai/r-gap | A tool for demonstrating and analyzing gradient-based attacks on private data in machine learning models | 34
zjunlp/knowlm | A framework for training and utilizing large language models with knowledge augmentation capabilities | 1,239
weisong-ucr/mab-malware | An open-source reinforcement learning framework for generating adversarial examples against malware classification models | 40
lhfowl/robbing_the_fed | Allows an attacker to obtain user data directly from federated learning gradient updates by modifying the shared model architecture | 23
yiyangzhou/lure | Analyzes and mitigates object hallucination in large vision-language models to improve their accuracy and reliability | 134
kaiyuanzh/flip | A framework for defending against backdoor attacks in federated learning systems | 44
yuliang-liu/monkey | A toolkit for building conversational AI models that can process image and text inputs | 1,825
jind11/textfooler | A tool for generating adversarial examples to attack text classification and inference models | 494