HarmBench

Attack simulator

A standardized framework for evaluating and improving the robustness of large language models against adversarial attacks

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

GitHub

335 stars
6 watching
56 forks
Language: Jupyter Notebook
Last commit: 3 months ago
Linked from 1 awesome list



Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| robustbench/robustbench | A standardized benchmark for measuring the robustness of machine learning models against adversarial attacks | 667 |
| n0dec/malwless | A tool designed to simulate system compromise or attack behaviors without running processes or PoCs. | 271 |
| sail-research/iba | This repository provides a setup and framework for investigating irreversible backdoor attacks in Federated Learning systems. | 29 |
| markcyber/badusb | A collection of educational scripts and payloads for simulating vulnerabilities and malware attacks on Windows systems using custom hardware. | 44 |
| azure/simuland | A collaboration to create realistic test environments for simulating real-world attacks and improving detection strategies. | 703 |
| uber-common/metta | An adversarial simulation tool to test information security preparedness by simulating network-based attacks on various systems. | 1,102 |
| trycatchhcf/dumpsterfire | A toolset for creating and automating customized security events to simulate realistic scenarios for testing and training | 997 |
| borealisai/advertorch | A toolbox for researching and evaluating robustness against attacks on machine learning models | 1,308 |
| nshalabi/attack-tools | Utilities for simulating adversary behavior in the context of threat intelligence and security analysis | 1,012 |
| hfzhang31/a3fl | A framework for attacking federated learning systems with adaptive backdoor attacks | 22 |
| amv42/sshd-honeypot | An intrusion detection system designed to capture and analyze SSH interactions between an attacker and a modified OpenSSH daemon | 26 |
| splunk/attack_range | A tool to simulate attacks against virtual environments and collect data into Splunk for detection development | 2,162 |
| elastic/swat | A tool designed to simulate malicious behavior against Google Workspace environments for threat research and detection rule effectiveness testing | 161 |
| ai-secure/dba | A tool for demonstrating and analyzing attacks on federated learning systems by introducing backdoors into distributed machine learning models. | 177 |
| openbas-platform/openbas | A comprehensive cyber adversary simulation platform for planning and conducting simulated attacks and exercises | 690 |