robustness_metrics

Model Robustness Tool

A toolset for evaluating the robustness of machine learning models

466 stars
11 watching
33 forks
Language: Jupyter Notebook
Last commit: 4 months ago
Linked from 1 awesome list
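To illustrate the kind of evaluation such a toolset performs, here is a minimal, self-contained sketch of one common robustness check: measuring how a classifier's accuracy degrades as its inputs are corrupted with increasing Gaussian noise. This is a generic illustration, not the robustness_metrics API; the model, data, and function names are all hypothetical.

```python
import random

def accuracy(model, xs, ys):
    """Fraction of inputs the model classifies correctly."""
    correct = sum(model(x) == y for x, y in zip(xs, ys))
    return correct / len(xs)

def accuracy_under_noise(model, xs, ys, sigmas=(0.0, 0.1, 0.5), seed=0):
    """Re-evaluate accuracy on inputs corrupted with Gaussian noise
    of increasing standard deviation (a simple robustness curve)."""
    rng = random.Random(seed)
    results = {}
    for sigma in sigmas:
        corrupted = [x + rng.gauss(0.0, sigma) for x in xs]
        results[sigma] = accuracy(model, corrupted, ys)
    return results

# Toy 1-D "model": classify by the sign of the input.
model = lambda x: int(x > 0)
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
ys = [int(x > 0) for x in xs]

# Accuracy is perfect on clean data and drops as noise flips
# points near the decision boundary.
print(accuracy_under_noise(model, xs, ys))
```

A real robustness suite replaces the toy noise loop with standardized corruption benchmarks (e.g. the ImageNet-C style corruptions used by hendrycks/robustness below) and reports aggregate scores rather than a single accuracy curve.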

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| hendrycks/robustness | Evaluates and benchmarks the robustness of deep learning models to various corruptions and perturbations in computer vision tasks | 1,022 |
| borealisai/advertorch | A toolbox for researching and evaluating robustness against attacks on machine learning models | 1,308 |
| robustbench/robustbench | A standardized benchmark for measuring the robustness of machine learning models against adversarial attacks | 667 |
| google-research/deep_ope | A set of pre-trained reinforcement learning policies and benchmarking data for offline model selection in reinforcement learning | 85 |
| guanghelee/neurips19-certificates-of-robustness | Tight certificates of adversarial robustness for randomly smoothed classifiers | 17 |
| jmgirard/mreliability | Tools for calculating consistency of observer measurements in various contexts | 40 |
| edisonleeeee/greatx | A toolbox for graph reliability and robustness against noise, distribution shifts, and attacks | 83 |
| modeloriented/fairmodels | A tool for detecting bias in machine learning models and mitigating it using various techniques | 86 |
| google-research/rlds | A toolkit for storing and manipulating episodic data in reinforcement learning and related tasks | 293 |
| google/ml-fairness-gym | An open source framework for studying long-term fairness effects in machine learning decision systems | 312 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 55 |
| benhamner/metrics | Provides implementations of various supervised machine learning evaluation metrics in multiple programming languages | 1,627 |
| i-gallegos/fair-llm-benchmark | Compiles bias evaluation datasets and provides access to original data sources for large language models | 110 |
| sail-sg/mmcbench | A benchmarking framework designed to evaluate the robustness of large multimodal models against common corruption scenarios | 27 |
| cmawer/reproducible-model | A project demonstrating how to create a reproducible machine learning model using Python and version control | 86 |