robustness_metrics
Model Robustness Tool
A toolset to evaluate the robustness of machine learning models
Stars: 466
Watching: 11
Forks: 33
Language: Jupyter Notebook
Last commit: 4 months ago
Linked from 1 awesome list
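As a rough illustration of the kind of evaluation such a toolset performs, the sketch below compares a model's accuracy on clean inputs against its accuracy on perturbed copies of the same inputs. This is a generic, hypothetical example; it does not use the robustness_metrics API, and the names `evaluate_robustness` and `noise_level` are made up for illustration.

```python
# Generic sketch of a robustness check: accuracy on clean vs. perturbed inputs.
# NOTE: this does not use the robustness_metrics library; the function name and
# parameters here are hypothetical, for illustration only.
import numpy as np

def evaluate_robustness(predict_fn, x, y, noise_level=0.1, seed=0):
    """Return clean accuracy, perturbed accuracy, and the gap between them."""
    rng = np.random.default_rng(seed)
    clean_acc = np.mean(predict_fn(x) == y)
    # Simple perturbation: additive Gaussian noise on the inputs.
    x_perturbed = x + rng.normal(scale=noise_level, size=x.shape)
    perturbed_acc = np.mean(predict_fn(x_perturbed) == y)
    return {
        "clean_accuracy": float(clean_acc),
        "perturbed_accuracy": float(perturbed_acc),
        "robustness_gap": float(clean_acc - perturbed_acc),
    }
```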
Related projects:
| Repository | Description | Stars |
|---|---|---|
| hendrycks/robustness | Evaluates and benchmarks the robustness of deep learning models to various corruptions and perturbations in computer vision tasks | 1,022 |
| borealisai/advertorch | A toolbox for researching and evaluating robustness against adversarial attacks on machine learning models | 1,308 |
| robustbench/robustbench | A standardized benchmark for measuring the robustness of machine learning models against adversarial attacks | 667 |
| google-research/deep_ope | A set of pre-trained reinforcement learning policies and benchmarking data for offline model selection in reinforcement learning | 85 |
| guanghelee/neurips19-certificates-of-robustness | Tight certificates of adversarial robustness for randomly smoothed classifiers | 17 |
| jmgirard/mreliability | Tools for calculating the consistency of observer measurements in various contexts | 40 |
| edisonleeeee/greatx | A toolbox for graph reliability and robustness against noise, distribution shifts, and attacks | 83 |
| modeloriented/fairmodels | A tool for detecting bias in machine learning models and mitigating it using various techniques | 86 |
| google-research/rlds | A toolkit for storing and manipulating episodic data in reinforcement learning and related tasks | 293 |
| google/ml-fairness-gym | An open-source framework for studying long-term fairness effects in machine learning decision systems | 312 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 55 |
| benhamner/metrics | Provides implementations of various supervised machine learning evaluation metrics in multiple programming languages | 1,627 |
| i-gallegos/fair-llm-benchmark | Compiles bias evaluation datasets and provides access to original data sources for large language models | 110 |
| sail-sg/mmcbench | A benchmarking framework for evaluating the robustness of large multimodal models against common corruption scenarios | 27 |
| cmawer/reproducible-model | A project demonstrating how to create a reproducible machine learning model using Python and version control | 86 |