HarmBench

Attack simulator

A standardized framework for evaluating and improving the robustness of large language models against adversarial attacks

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

GitHub

366 stars

6 watching

59 forks

Language: Jupyter Notebook

Last commit: 12 months ago

Linked from 1 awesome list

harmbench.org

Backlinks from these awesome lists:

ethicalml/awesome-production-machine-learning

Related projects:

Repository	Description	Stars
robustbench/robustbench	A standardized benchmark for measuring the robustness of machine learning models against adversarial attacks	682
n0dec/malwless	A tool designed to simulate system compromise or attack behaviors without running processes or PoCs.	271
sail-research/iba	This repository provides a setup and framework for investigating irreversible backdoor attacks in Federated Learning systems.	31
markcyber/badusb	A collection of educational scripts and payloads for simulating vulnerabilities and malware attacks on Windows systems using custom hardware.	60
azure/simuland	A collaboration to create realistic test environments for simulating real-world attacks and improving detection strategies.	704
uber-common/metta	An adversarial simulation tool to test information security preparedness by simulating network-based attacks on various systems.	1,103
trycatchhcf/dumpsterfire	A toolset for creating and automating customized security events to simulate realistic scenarios for testing and training	998
borealisai/advertorch	A toolbox for researching and evaluating robustness against attacks on machine learning models	1,311
nshalabi/attack-tools	Utilities for simulating adversary behavior in the context of threat intelligence and security analysis	1,011
hfzhang31/a3fl	A framework for attacking federated learning systems with adaptive backdoor attacks	23
amv42/sshd-honeypot	An intrusion detection system designed to capture and analyze SSH interactions between an attacker and a modified OpenSSH daemon	26
splunk/attack_range	A tool to simulate attacks against virtual environments and collect data into Splunk for detection development	2,181
elastic/swat	A tool designed to simulate malicious behavior against Google Workspace environments for threat research and detection rule effectiveness testing	163
ai-secure/dba	A tool for demonstrating and analyzing attacks on federated learning systems by introducing backdoors into distributed machine learning models.	179
openbas-platform/openbas	A comprehensive cyber adversary simulation platform for planning and conducting simulated attacks and exercises	765