robustness_metrics
Model Robustness Tool
A toolset to evaluate the robustness of machine learning models
466 stars
11 watching
33 forks
Language: Jupyter Notebook
Last commit: 6 months ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| hendrycks/robustness | Evaluates and benchmarks the robustness of deep learning models to various corruptions and perturbations in computer vision tasks. | 1,030 |
| borealisai/advertorch | A toolbox for researching and evaluating the adversarial robustness of machine learning models. | 1,311 |
| robustbench/robustbench | A standardized benchmark for measuring the robustness of machine learning models against adversarial attacks. | 682 |
| google-research/deep_ope | Provides benchmark policies and datasets for offline reinforcement learning. | 85 |
| guanghelee/neurips19-certificates-of-robustness | Provides a framework for computing tight certificates of adversarial robustness for randomly smoothed classifiers. | 17 |
| jmgirard/mreliability | Tools for calculating the consistency of observer measurements in various contexts. | 41 |
| edisonleeeee/greatx | A toolbox for graph reliability and robustness against noise, distribution shifts, and adversarial attacks. | 85 |
| modeloriented/fairmodels | A tool for detecting bias in machine learning models and mitigating it using various techniques. | 86 |
| google-research/rlds | A toolkit for storing and manipulating episodic data in reinforcement learning and related tasks. | 302 |
| google/ml-fairness-gym | An open-source tool for simulating the long-term impacts of machine learning-based decision systems on social environments. | 314 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks. | 56 |
| benhamner/metrics | Provides implementations of various supervised machine learning evaluation metrics in multiple programming languages. | 1,632 |
| i-gallegos/fair-llm-benchmark | Compiles bias evaluation datasets and provides access to original data sources for large language models. | 115 |
| sail-sg/mmcbench | A benchmarking framework designed to evaluate the robustness of large multimodal models against common corruption scenarios. | 27 |
| cmawer/reproducible-model | A project demonstrating how to create a reproducible machine learning model using Python and version control. | 86 |