awesome-machine-learning-interpretability

A curated list of awesome machine learning interpretability and responsible AI resources.

Topics: ai-safety, awesome, awesome-list, data-science, explainable-ml, fairness, interpretability, interpretable-ai, interpretable-machine-learning, interpretable-ml, machine-learning, machine-learning-interpretability, privacy-enhancing-technologies, privacy-preserving-machine-learning, python, r, reliable-ai, secure-ml, transparency, xai

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

8 Principles of Responsible ML
A Brief Overview of AI Governance for Responsible Machine Learning Systems
Acceptable Use Policies for Foundation Models
Access Now, Regulatory Mapping on Artificial Intelligence in Latin America: Regional AI Public Policy Report
Ada Lovelace Institute, Code and Conduct: How to Create Third-Party Auditing Regimes for AI Systems
Adversarial ML Threat Matrix
AI Governance Needs Sociotechnical Expertise: Why the Humanities and Social Sciences Are Critical to Government Efforts
AI Model Registries: A Foundational Tool for AI Governance, September 2024
AI Verify:

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / AI Verify

AI Verify Foundation
AI Verify Foundation, Cataloguing LLM Evaluations
AI Verify Foundation, Generative AI: Implications for Trust and Governance
AI Verify Foundation, Model Governance Framework for Generative AI

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

AI Snake Oil
The Alan Turing Institute, AI Ethics and Governance in Practice
The Alan Turing Institute, Responsible Data Stewardship in Practice
The Alan Turing Institute, AI Standards Hub
AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
Andreessen Horowitz (a16z) AI Canon
Anthropic's Responsible Scaling Policy
AuditBoard: 5 AI Auditing Frameworks to Encourage Accountability
Auditing machine learning algorithms: A white paper for public auditors
AWS Data Privacy FAQ
AWS Privacy Notice
AWS, What is Data Governance?
Berryville Institute of Machine Learning, Architectural Risk Analysis of Large Language Models (requires free account login)
BIML Interactive Machine Learning Risk Framework
Boston University AI Task Force Report on Generative AI in Education and Research
Brendan Bycroft's LLM Visualization
Brown University, How Can We Tackle AI-Fueled Misinformation and Disinformation in Public Health?
Casey Flores, AIGP Study Guide

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Center for Security and Emerging Technology (CSET):

CSET's Harm Taxonomy for the AI Incident Database
CSET Publications
Adding Structure to AI Harm: An Introduction to CSET's AI Harm Framework
AI Accidents: An Emerging Threat: What Could Happen and What to Do, CSET Policy Brief, July 2021
AI Incident Collection: An Observational Study of the Great AI Experiment
Repurposing the Wheel: Lessons for AI Standards
Translating AI Risk Management Into Practice
Understanding AI Harms: An Overview

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

Censius, AI Audit
Censius, An In-Depth Guide To Help You Start Auditing Your AI Models
Center for AI and Digital Policy Reports
Center for Democracy and Technology (CDT), AI Policy & Governance
Center for Democracy and Technology (CDT), Applying Sociotechnical Approaches to AI Governance in Practice
Center for Democracy and Technology (CDT), In Deep Trouble: Surfacing Tech-Powered Sexual Harassment in K-12 Schools
CivAI, GenAI Toolkit for the NIST AI Risk Management Framework: Thinking Through the Risks of a GenAI Chatbot
Coalition for Content Provenance and Authenticity (C2PA)
Council of Europe, European Audiovisual Observatory, IRIS, AI and the audiovisual sector: navigating the current legal landscape
Crowe LLP: Internal auditor's AI safety checklist
Data Provenance Explorer
Data & Society, AI Red-Teaming Is Not a One-Stop Solution to AI Harms: Recommendations for Using Red-Teaming for AI Accountability
Dealing with Bias and Fairness in AI/ML/Data Science Systems
Debugging Machine Learning Models (ICLR workshop proceedings)
Decision Points in AI Governance
Demos, AI – Trustworthy By Design: How to build trust in AI systems, the institutions that create them and the communities that use them
Digital Policy Alert, The Anatomy of AI Rules: A systematic comparison of AI rules across the globe
Distill
Dominique Shelton Leipzig, Countries With Draft AI Legislation or Frameworks
Ethical and social risks of harm from Language Models
Ethics for people who work in tech
EU Digital Partners, U.S. A.I. Laws: A State-by-State Study
Evaluating LLMs is a minefield
Fairly's Global AI Regulations Map
Fairness and Bias in Algorithmic Hiring: A Multidisciplinary Survey
FATML Principles and Best Practices
Federation of American Scientists, A NIST Foundation To Support The Agency’s AI Mandate
Financial Industry Regulatory Authority (FINRA), Artificial Intelligence (AI) in the Securities Industry
ForHumanity Body of Knowledge (BOK)
The Foundation Model Transparency Index

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / The Foundation Model Transparency Index

Trustible, Model Transparency Ratings

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

From Principles to Practice: An interdisciplinary framework to operationalise AI ethics
FS-ISAC, February 2024, Generative AI Vendor Risk Assessment Guide
The Future Society
Gage Repeatability and Reproducibility

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Google:

Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing
The Data Cards Playbook
Data governance in the cloud - part 1 - People and processes
Data Governance in the Cloud - part 2 - Tools
Evaluating social and ethical risks from generative AI
Generative AI Prohibited Use Policy
Perspectives on Issues in AI Governance
Principles and best practices for data governance in the cloud
Responsible AI Framework
Responsible AI practices
Testing and Debugging in Machine Learning

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

GSMA, September 2024, Best Practice Tools: Examples supporting responsible AI maturity
H2O.ai Algorithms
HackerOne Blog
Haptic Networks: How to Perform an AI Audit for UK Organisations
Hogan Lovells, The AI Act is coming: EU reaches political agreement on comprehensive regulation of artificial intelligence
Hugging Face, The Landscape of ML Documentation Tools
IAPP, Global AI Governance Law and Policy: Canada, EU, Singapore, UK and US
ICT Institute: A checklist for auditing AI systems

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / IEEE:

A Flexible Maturity Model for AI Governance Based on the NIST AI Risk Management Framework
P3119 Standard for the Procurement of Artificial Intelligence and Automated Decision Systems
Std 1012-1998 Standard for Software Verification and Validation

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

Independent Audit of AI Systems
Identifying and Overcoming Common Data Mining Mistakes
Infocomm Media Development Authority (Singapore) and AI Verify Foundation, Cataloguing LLM Evaluations, Draft for Discussion (October 2023)
Infocomm Media Development Authority (Singapore), First of its kind Generative AI Evaluation Sandbox for Trusted AI by AI Verify Foundation and IMDA
Information Technology Industry (ITI) Council, October 2024, ITI's AI Security Policy Principles
International Bar Association and the Center for AI and Digital Policy, The Future Is Now: Artificial Intelligence and the Legal Profession
Institute for AI Policy and Strategy (IAPS), AI-Relevant Regulatory Precedents: A Systematic Search Across All Federal Agencies
Institute for AI Policy and Strategy (IAPS), Key questions for the International Network of AI Safety Institutes
Institute for AI Policy and Strategy (IAPS), Mapping Technical Safety Research at AI Companies: A literature review and incentives analysis
Institute for AI Policy and Strategy (IAPS), Understanding the First Wave of AI Safety Institutes: Characteristics, Functions, and Challenges
Institute for Public Policy Research (IPPR), Transformed by AI: How Generative Artificial Intelligence Could Affect Work in the UK—And How to Manage It
Institute for Security and Technology (IST), The Implications of Artificial Intelligence in Cybersecurity: Shifting the Offense-Defense Balance
Institute of Internal Auditors
Institute of Internal Auditors: Artificial Intelligence Auditing Framework, Practical Applications, Part A, Special Edition

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / ISACA:

ISACA: Auditing Artificial Intelligence
ISACA: Auditing Guidelines for Artificial Intelligence
ISACA: Capability Maturity Model Integration Resources

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

Integrity Institute Report, February 2024, On Risk Assessment and Mitigation for Algorithmic Systems
ISO/IEC 42001:2023, Information technology — Artificial intelligence — Management system
Know Your Data
Language Model Risk Cards: Starter Set
Large language models, explained with a minimum of math and jargon
Larry G. Wlosinski, April 30, 2021, Information System Contingency Planning Guidance
Library of Congress, LC Labs AI Planning Framework
Llama 2 Responsible Use Guide
LLM Visualization
Machine Learning Quick Reference: Algorithms
Machine Learning Quick Reference: Best Practices
Manifest MLBOM Wiki

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Manifest MLBOM Wiki

Towards Traceability in Data Ecosystems using a Bill of Materials Model

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Meta:

System cards

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Microsoft:

Advancing AI responsibly
Azure AI Content Safety

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Microsoft: / Azure AI Content Safety

Harm categories in Azure AI Content Safety
Microsoft Responsible AI Standard, v2

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Microsoft:

GDPR and Generative AI: A Guide for Public Sector Organizations

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

MLA, How do I cite generative AI in MLA style?
model-cards-and-datasheets
NewsGuard AI Tracking Center
OpenAI, Building an early warning system for LLM-aided biological threat creation
OpenAI Cookbook, How to implement LLM guardrails
OpenAI, Evals
Open Data Institute, Understanding data governance in AI: Mapping governance
Open Sourcing Highly Capable Foundation Models
Organization and Training of a Cyber Security Team
Our Data Our Selves, Data Use Policy
OWASP, Guide for Preparing and Responding to Deepfake Events: From the OWASP Top 10 for LLM Applications Team, Version 1, September 2024
Oxford Commission on AI & Good Governance, AI in the Public Service: From Principles to Practice
PAIR Explorables: Datasets Have Worldviews
Partnership on AI, ABOUT ML Reference Document
Partnership on AI, PAI’s Guidance for Safe Foundation Model Deployment: A Framework for Collective Action
Partnership on AI, Responsible Practices for Synthetic Media: A Framework for Collective Action
PwC's Responsible AI
RAND Corporation, U.S. Tort Liability for Large-Scale Artificial Intelligence Damages A Primer for Developers and Policymakers
RAND Corporation, Analyzing Harms from AI-Generated Images and Safeguarding Online Authenticity
Ravit Dotan's Projects
Real-World Strategies for Model Debugging
RecoSense: Phases of an AI Data Audit – Assessing Opportunity in the Enterprise
Robust ML
Safe and Reliable Machine Learning
Sample AI Incident Response Checklist
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
SHRM Generative Artificial Intelligence (AI) Chatbot Usage Policy
Special Competitive Studies Project and Johns Hopkins University Applied Physics Laboratory, Framework for Identifying Highly Consequential AI Use Cases
Stanford University, Open Problems in Technical AI Governance: A repository of open problems in technical AI governance
Stanford University, Responsible AI at Stanford: Enabling innovation through AI best practices
Synack, The Complete Guide to Crowdsourced Security Testing, Government Edition
The Rise of Generative AI and the Coming Era of Social Media Manipulation 3.0: Next-Generation Chinese Astroturfing and Coping with Ubiquitous AI
Taskade: AI Audit PBC Request Checklist Template
Taylor & Francis, AI Policy
Tech Policy Press - Artificial Intelligence
TechTarget: 9 questions to ask when auditing your AI systems
Troubleshooting Deep Neural Networks
Trustible, Enhancing the Effectiveness of AI Governance Committees
Twitter Algorithmic Bias Bounty
Unite.AI: How to perform an AI Audit in 2023
University of California, Berkeley, Center for Long-Term Cybersecurity, A Taxonomy of Trustworthiness for Artificial Intelligence
University of California, Berkeley, Information Security Office, How to Write an Effective Website Privacy Statement
University of Washington Tech Policy Lab, Data Statements
Warning Signs: The Future of Privacy and Security in an Age of Machine Learning
When Not to Trust Your Explanations
Why We Need to Know More: Exploring the State of AI Incident Documentation Practices
WilmerHale, What Are High-Risk AI Systems Within the Meaning of the EU’s AI Act, and What Requirements Apply to Them?
World Economic Forum, AI Value Alignment: Guiding Artificial Intelligence Towards Shared Human Goals
World Economic Forum, Responsible AI Playbook for Investors
World Privacy Forum, AI Governance on the Ground: Canada’s Algorithmic Impact Assessment Process and Algorithm has evolved
World Privacy Forum, Risky Analysis: Assessing and Improving AI Governance Tools
You Created A Machine Learning Application Now Make Sure It's Secure
A-LIGN, ISO 42001 Requirement, NIST SP 800-218A Task, Recommendations and Considerations
AppliedAI Institute, Navigating the EU AI Act: A Process Map for making AI Systems available
BCG Robotaxonomy
Center for Security and Emerging Technology (CSET), High Level Comparison of Legislative Perspectives on Artificial Intelligence US vs. EU
European Data Protection Board (EDPB), Checklist for AI Auditing
Foundation Model Development Cheatsheet
Foundation Model Transparency Index Scores by Major Dimensions of Transparency, May 2024
Future of Privacy Forum, EU AI Act: A Comprehensive Implementation & Compliance Timeline
Future of Privacy Forum, The Spectrum of Artificial Intelligence
IAPP EU AI Act Cheat Sheet
IAPP, EU AI Act Compliance Matrix
IAPP, EU AI Act Compliance Matrix - At a Glance
Instruction finetuning an LLM from scratch
Is it a "deep fake" under the EU AI ACT?
Machine Learning Attack_Cheat_Sheet
Oliver Patel's Cheat Sheets

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Oliver Patel's Cheat Sheets

10 Key Pillars for Enterprise AI Governance
10 Key Questions for AI Risk Assessments
20 Key Policy Principles for Generative AI Use: Protect your organization with actionable and accessible generative AI policies
AI Governance in 2023
Canada AI Law & Policy Cheat Sheet
China AI Law Cheat Sheet
Definitions, Scope & Applicability EU AI Act Cheat Sheet Series, Part 1
EU AI Act Cheat Sheet Series 1, Definitions, Scope & Applicability
EU AI Act Cheat Sheet Series 2, Prohibited AI Systems
EU AI Act Cheat Sheet Series 3, High-Risk AI Systems
EU AI Act Cheat Sheet Series 4, Requirements for Providers
EU AI Act Cheat Sheet Series 5, Requirements for Deployers
EU AI Act Cheat Sheet Series 6, General-Purpose AI Models
EU AI Act Cheat Sheet Series 7, Compliance & Conformity Assessment
EU AI Act Cheat Sheet Series 8, Governance & Enforcement
India AI Policy Cheat Sheet
Governance Audit, Model Audit, and Application Audit
Gulf Countries AI Policy Cheat Sheet
Singapore AI Policy Cheat Sheet
UK AI Policy Cheat Sheet

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

Open Source Audit Tooling (OAT) Landscape
Phil Lee, AI Act: Difference between AI systems and AI models
Phil Lee, AI Act: Meet the regulators! (Arts 30, 55b, 56 and 59)
Phil Lee, How the AI Act applies to integrated generative AI
Phil Lee, Overview of AI Act requirements for deployers of high risk AI systems
Phil Lee, Overview of AI Act requirements for providers of high risk AI systems
Purpose and Means AI Explainer Series - issue #4 - Navigating the EU AI Act
Trustible, Is It AI? How different laws & frameworks define AI
What Access Protections Do AI Companies Provide for Independent Safety Research?
Exploiting Novel GPT-4 APIs
Identifying and Eliminating CSAM in Generative ML Training Data and Models
Jailbreaking Black Box Large Language Models in Twenty Queries
LLM Agents can Autonomously Exploit One-day Vulnerabilities

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / LLM Agents can Autonomously Exploit One-day Vulnerabilities

No, LLM Agents can not Autonomously Exploit One-day Vulnerabilities

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

Ofcom, Red Teaming for GenAI Harms: Revealing the Risks and Rewards for Online Safety, July 23, 2024
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Red Teaming of Advanced Information Assurance Concepts
@dotey on X/Twitter exploring GPT prompt security and prevention measures
0xeb / GPT-analyst
0xk1h0 / ChatGPT "DAN" (and other "Jailbreaks")
ACL 2024 Tutorial: Vulnerabilities of Large Language Models to Adversarial Attacks
Azure's PyRIT
Berkeley Center for Long-Term Cybersecurity (CLTC), Benchmark Early and Red Team Often: A Framework for Assessing and Managing Dual-Use Hazards of AI Foundation Models
CDAO frameworks, guidance, and best practices for AI test & evaluation
ChatGPT_system_prompt
coolaj86 / Chat GPT "DAN" (and other "Jailbreaks")
CSET, What Does AI-Red Teaming Actually Mean?
DAIR Prompt Engineering Guide

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / DAIR Prompt Engineering Guide

DAIR Prompt Engineering Guide GitHub

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

Extracting Training Data from ChatGPT
Frontier Model Forum: What is Red Teaming?
HackerOne, An Emerging Playbook for AI Red Teaming with HackerOne
Humane Intelligence, SeedAI, and DEFCON AI Village, Generative AI Red Teaming Challenge: Transparency Report 2024
In-The-Wild Jailbreak Prompts on LLMs
leeky: Leakage/contamination testing for black box language models
LLM Security & Privacy
Membership Inference Attacks and Defenses on Machine Learning Models Literature
Learn Prompting, Prompt Hacking

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / Learn Prompting, Prompt Hacking

MiesnerJacob / learn-prompting, Prompt Hacking

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

Lakera AI's Gandalf
leondz / garak
Microsoft AI Red Team building future of safer AI
OpenAI Red Teaming Network
r/ChatGPTJailbreak

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance / r/ChatGPTJailbreak

developer mode fixed

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Community Frameworks and Guidance

A Safe Harbor for AI Evaluation and Red Teaming
Y Combinator, ChatGPT Grandma Exploit
Backpack Language Models
Jay Alammar, Finding the Words to Say: Hidden State Visualizations for Language Models
Jay Alammar, Interfaces for Explaining Transformer Language Models
Patrick Hall and Daniel Atherton, Generative AI Risk Management Resources
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
The Remarkable Robustness of LLMs: Stages of Inference?
Columbia Business School, Generative AI Policy
Columbia University, Considerations for AI Tools in the Classroom
Columbia University, Generative AI Policy
Georgetown University, Artificial Intelligence and Homework Support Policies
Georgetown University, Artificial Intelligence (Generative) Resources
Georgetown University, Teaching with AI
George Washington University, Faculty Resources: Generative AI
George Washington University, Guidelines for Using Generative Artificial Intelligence at the George Washington University April 2023
George Washington University, Guidelines for Using Generative Artificial Intelligence in Connection with Academic Work
Harvard Business School, 2.1.2 Using ChatGPT & Artificial Intelligence (AI) Tools
Harvard Graduate School of Education, HGSE AI Policy
Harvard University, AI Guidance & FAQs
Harvard University, Guidelines for Using ChatGPT and other Generative AI tools at Harvard
Massachusetts Institute of Technology, Initial guidance for use of Generative AI tools
Massachusetts Institute of Technology, Generative AI & Your Course
Stanford Graduate School of Business, Course Policies on Generative AI Use
Stanford University, Artificial Intelligence Teaching Guide
Stanford University, Creating your course policy on AI
Stanford University, Generative AI Policy Guidance
Stanford University, Responsible AI at Stanford
University of California, AI Governance and Transparency
University of California, Applicable Law and UC Policy
University of California, Legal Alert: Artificial Intelligence Tools
University of California, Berkeley, AI at UC Berkeley
University of California, Berkeley, Appropriate Use of Generative AI Tools
University of California, Irvine, Generative AI for Teaching and Learning
University of California, Irvine, Statement on Generative AI Detection
University of California, Los Angeles, Artificial Intelligence (A.I.) Tools and Academic Use
University of California, Los Angeles, ChatGPT and AI Resources
University of California, Los Angeles, Generative AI
University of California, Los Angeles, Guiding Principles for Responsible Use
University of California, Los Angeles, Teaching Guidance for ChatGPT and Related AI Developments
University of Notre Dame, AI Recommendations for Instructors
University of Notre Dame, AI@ND Policies and Guidelines
University of Notre Dame, Generative AI Policy for Students
University of Southern California, Using Generative AI in Research
University of Washington, AI+Teaching
University of Washington, AI+Teaching, Sample syllabus statements regarding student use of artificial intelligence
Yale University, AI at Yale
Yale University, AI Guidance for Teachers
Yale University, Yale University AI guidelines for staff
Yale University, Guidelines for the Use of Generative AI Tools

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops

AAAI Conference on Artificial Intelligence
ACM FAccT (Fairness, Accountability, and Transparency)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops / ACM FAccT (Fairness, Accountability, and Transparency)

FAT/ML (Fairness, Accountability, and Transparency in Machine Learning)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops

ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO)
AIES (AAAI/ACM Conference on AI, Ethics, and Society)
Black in AI
Computer Vision and Pattern Recognition (CVPR)
Evaluating Generative AI Systems: the Good, the Bad, and the Hype (April 15, 2024)
IAPP, AI Governance Global 2024, June 4-7, 2024
International Conference on Machine Learning (ICML)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops / International Conference on Machine Learning (ICML)

2nd ICML Workshop on Human in the Loop Learning (HILL)
5th ICML Workshop on Human Interpretability in Machine Learning (WHI)
Challenges in Deploying and Monitoring Machine Learning Systems
Economics of privacy and data labor
Federated Learning for User Privacy and Data Confidentiality
Healthcare Systems, Population Health, and the Role of Health-tech
Law & Machine Learning
ML Interpretability for Scientific Discovery
MLRetrospectives: A Venue for Self-Reflection in ML Research
Participatory Approaches to Machine Learning
XXAI: Extending Explainable AI Beyond Deep Models and Classifiers
Human-AI Collaboration in Sequential Decision-Making
Machine Learning for Data: Automated Creation, Privacy, Bias
ICML Workshop on Algorithmic Recourse
ICML Workshop on Human in the Loop Learning (HILL)
ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI
Information-Theoretic Methods for Rigorous, Responsible, and Reliable Machine Learning (ITR3)
International Workshop on Federated Learning for User Privacy and Data Confidentiality in Conjunction with ICML 2021 (FL-ICML'21)
Interpretable Machine Learning in Healthcare
Self-Supervised Learning for Reasoning and Perception
The Neglected Assumptions In Causal Inference
Theory and Practice of Differential Privacy
Uncertainty and Robustness in Deep Learning
Workshop on Computational Approaches to Mental Health @ ICML 2021
Workshop on Distribution-Free Uncertainty Quantification
Workshop on Socially Responsible Machine Learning
1st ICML 2022 Workshop on Safe Learning for Autonomous Driving (SL4AD)
2nd Workshop on Interpretable Machine Learning in Healthcare (IMLH)
DataPerf: Benchmarking Data for Data-Centric AI
Disinformation Countermeasures and Machine Learning (DisCoML)
Responsible Decision Making in Dynamic Environments
Spurious correlations, Invariance, and Stability (SCIS)
The 1st Workshop on Healthcare AI and COVID-19
Theory and Practice of Differential Privacy
Workshop on Human-Machine Collaboration and Teaming
2nd ICML Workshop on New Frontiers in Adversarial Machine Learning
2nd Workshop on Formal Verification of Machine Learning
3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH)
Challenges in Deployable Generative AI
“Could it have been different?” Counterfactuals in Minds and Machines
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities
Generative AI and Law (GenLaw)
Interactive Learning with Implicit Human Feedback
Neural Conversational AI Workshop - What’s left to TEACH (Trustworthy, Enhanced, Adaptable, Capable and Human-centric) chatbots?
The Second Workshop on Spurious Correlations, Invariance and Stability

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops

Knowledge Discovery and Data Mining (KDD)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops / Knowledge Discovery and Data Mining (KDD)

2nd ACM SIGKDD Workshop on Ethical Artificial Intelligence: Methods and Applications
KDD Data Science for Social Good 2023

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops

Mission Control AI, Booz Allen Hamilton, and The Intellectual Forum at Jesus College, Cambridge, The 2024 Leaders in Responsible AI Summit, March 22, 2024
NAACL 24 Tutorial: Explanations in the Era of Large Language Models
Neural Information Processing Systems (NeurIPS)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops / Neural Information Processing Systems (NeurIPS)

5th Robot Learning Workshop: Trustworthy Robotics
Algorithmic Fairness through the Lens of Causality and Privacy
Causal Machine Learning for Real-World Impact
Challenges in Deploying and Monitoring Machine Learning Systems
Cultures of AI and AI for Culture
Empowering Communities: A Participatory Approach to AI for Mental Health
Federated Learning: Recent Advances and New Challenges
Gaze meets ML
HCAI@NeurIPS 2022, Human Centered AI
Human Evaluation of Generative Models
Human in the Loop Learning (HiLL) Workshop at NeurIPS 2022
I Can’t Believe It’s Not Better: Understanding Deep Learning Through Empirical Falsification
Learning Meaningful Representations of Life
Machine Learning for Autonomous Driving
Progress and Challenges in Building Trustworthy Embodied AI
Tackling Climate Change with Machine Learning
Trustworthy and Socially Responsible Machine Learning
Workshop on Machine Learning Safety
AI meets Moral Philosophy and Moral Psychology: An Interdisciplinary Dialogue about Computational Ethics
Algorithmic Fairness through the Lens of Time
Attributing Model Behavior at Scale (ATTRIB)
Backdoors in Deep Learning: The Good, the Bad, and the Ugly
Computational Sustainability: Promises and Pitfalls from Theory to Deployment
I Can’t Believe It’s Not Better (ICBINB): Failure Modes in the Age of Foundation Models
Socially Responsible Language Modelling Research (SoLaR)
Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations
Workshop on Distribution Shifts: New Frontiers with Foundation Models
XAI in Action: Past, Present, and Future Applications

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Conferences and Workshops

OECD.AI, Building the foundations for collaboration: The OECD-African Union AI Dialogue
Oxford Generative AI Summit Slides

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

Department of Industry, Science and Resources, AI Governance: Leadership insights and the Voluntary AI Safety Standard in practice
Department of Industry, Science and Resources, The AI Impact Navigator, October 21, 2024
Department of Industry, Science and Resources, Australia’s AI Ethics Principles
Department of Industry, Science and Resources, Introducing mandatory guardrails for AI in high-risk settings: proposals paper
Department of Industry, Science and Resources, Voluntary AI Safety Standard, August 2024
Digital Transformation Agency, Evaluation of the whole-of-government trial of Microsoft 365 Copilot: Summary of evaluation findings, October 23, 2024
Digital Transformation Agency, Policy for the responsible use of AI in government, September 2024, Version 1.1
Office of the Australian Information Commissioner, Guidance on privacy and developing and training generative AI models
Office of the Australian Information Commissioner, Guidance on privacy and the use of commercially available AI products
National framework for the assurance of artificial intelligence in government
Testing the Reliability, Validity, and Equity of Terrorism Risk Assessment Instruments
Algorithmic Impact Assessment tool
A Regulatory Framework for AI: Recommendations for PIPEDA Reform
Artificial Intelligence and Data Act
The Artificial Intelligence and Data Act (AIDA) – Companion document
Developing Financial Sector Resilience in a Digital World: Selected Themes in Technology and Related Risks
Directive on Automated Decision Making (Canada)
(Draft Guideline) E-23 – Model Risk Management
Health Canada, Transparency for machine learning-enabled medical devices: Guiding principles
Gouvernance des algorithmes d’intelligence artificielle dans le secteur financier (Governance of artificial intelligence algorithms in the financial sector) (France)
Bundesamt für Sicherheit in der Informationstechnik, Generative AI Models - Opportunities and Risks for Industry and Authorities
Bundesamt für Sicherheit in der Informationstechnik, German-French recommendations for the use of AI programming assistants
Japan AI Safety Institute, Guide to Red Teaming Methodology on AI Safety (Version 1.00) (September 25, 2024)
The National Guidelines on AI Governance & Ethics
Autoriteit Persoonsgegevens, Call for input on prohibition on AI systems for emotion recognition in the areas of workplace or education institutions (October 31, 2024)
Autoriteit Persoonsgegevens (Dutch Data Protection Authority), Scraping bijna altijd illegaal ("Scraping is almost always illegal")
General principles for the use of Artificial Intelligence in the financial sector
AI Safety Institute (AISI), Advanced AI evaluations at AISI: May update
Algorithm Charter for Aotearoa New Zealand
Callaghan Innovation, EU AI Fact Sheet 4, High-risk AI systems
Personal Data Protection Commission (PDPC), Companion to the Model AI Governance Framework – Implementation and Self-Assessment Guide for Organizations
Personal Data Protection Commission (PDPC), Compendium of Use Cases: Practical Illustrations of the Model AI Governance Framework
Personal Data Protection Commission (PDPC), Model Artificial Intelligence Governance Framework (Second Edition)
Personal Data Protection Commission (PDPC), Privacy Enhancing Technology (PET): Proposed Guide on Synthetic Data Generation
AI Safety Institute (AISI), Safety cases at AISI
Department for Science, Innovation and Technology and AI Safety Institute, International Scientific Report on the Safety of Advanced AI
Department for Science, Innovation and Technology, The Bletchley Declaration by Countries Attending the AI Safety Summit, 1-2 November 2023
Department for Science, Innovation and Technology, Frontier AI: capabilities and risks - discussion paper (United Kingdom)
Department for Science, Innovation and Technology, Guidance, Introduction to AI Assurance
Information Commissioner's Office (ICO), AI tools in recruitment (November 6, 2024)
National Physical Laboratory (NPL), Beginner's guide to measurement GPG118
Ofcom, Red Teaming for GenAI Harms: Revealing the Risks and Rewards for Online Safety, July 23, 2024
Online Harms White Paper: Full government response to the consultation (United Kingdom)
The Public Sector Bodies (Websites and Mobile Applications) Accessibility Regulations 2018
12 CFR Part 1002 - Equal Credit Opportunity Act (Regulation B)
Chatbots in consumer finance
Innovation spotlight: Providing adverse action notices when using AI/ML models
A Primer on Artificial Intelligence in Securities Markets
Responsible Artificial Intelligence in Financial Markets
H.R. 9720, AI Incident Reporting and Security Enhancement Act
Artificial Intelligence: Background, Selected Issues, and Policy Considerations, May 19, 2021
Artificial Intelligence: Overview, Recent Advances, and Considerations for the 118th Congress, August 4, 2023
Highlights of the 2023 Executive Order on Artificial Intelligence for Congress, November 17, 2023
Artificial Intelligence and Machine Learning in Financial Services, April 3, 2024
Copyright and Artificial Intelligence, Part 1: Digital Replicas, July 2024
Privacy Policy and Data Policy
Explainable Artificial Intelligence (XAI) (Archived)
Computer Security Technology Planning Study, October 1, 1972
Artificial intelligence
Intellectual property
National Institute of Standards and Technology (NIST)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / National Institute of Standards and Technology (NIST)

AI 100-1 Artificial Intelligence Risk Management Framework (NIST AI RMF 1.0)
De-identification Tools
NIST AI 100-2 E2023: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations
Assessing Risks and Impacts of AI (ARIA)
Four Principles of Explainable Artificial Intelligence, Draft NISTIR 8312, 2020-08-17
Four Principles of Explainable Artificial Intelligence, NISTIR 8312, 2021-09-29
Engineering Statistics Handbook
Measurement Uncertainty

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / National Institute of Standards and Technology (NIST) / Measurement Uncertainty

International Bureau of Weights and Measures (BIPM), Evaluation of measurement data—Guide to the expression of uncertainty in measurement

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / National Institute of Standards and Technology (NIST)

NIST Special Publication 800-30 Revision 1, Guide for Conducting Risk Assessments
Psychological Foundations of Explainability and Interpretability in Artificial Intelligence

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

National Telecommunications and Information Administration (NTIA)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / National Telecommunications and Information Administration (NTIA)

AI Accountability Policy Report
Internet Policy Task Force, Commercial Data Privacy and Innovation in the Internet Economy: A Dynamic Policy Framework

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

AI Principles: Recommendations on the Ethical Use of Artificial Intelligence
Audit of Governance and Protection of Department of Defense Artificial Intelligence Data and Technology
Chief Data and Artificial Intelligence Officer (CDAO) Assessment and Assurance

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Chief Data and Artificial Intelligence Officer (CDAO) Assessment and Assurance

Generative Artificial Intelligence Lexicon
RAI Toolkit

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Department of the Army

Proceedings of the Thirteenth Annual U.S. Army Operations Research Symposium, Volume 1, October 29 to November 1, 1974

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

Guidelines for secure AI system development
Office of Educational Technology

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Office of Educational Technology

Designing for Education with Artificial Intelligence: An Essential Guide for Developers
Empowering Education Leaders: A Toolkit for Safe, Ethical, and Equitable AI Integration, October 2024

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

Artificial Intelligence and Technology Office

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Artificial Intelligence and Technology Office

AI Risk Management Playbook (AIRMP)
AI Use Case Inventory (DOE Use Cases Releasable to Public in Accordance with E.O. 13960)
Digital Climate Solutions Inventory
Generative Artificial Intelligence Reference Guide

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

Artificial Intelligence and Autonomous Systems
Artificial Intelligence Safety and Security Board
Department of Homeland Security Artificial Intelligence Roadmap 2024
Safety and Security Guidelines for Critical Infrastructure Owners and Operators
Use of Commercial Generative Artificial Intelligence (AI) Tools
Artificial Intelligence Strategy for the U.S. Department of Justice, December 2020
Civil Rights Division, Artificial Intelligence and Civil Rights
Privacy Act of 1974
Overview of The Privacy Act of 1974 (2020 Edition)
Managing Artificial Intelligence-Specific Cybersecurity Risks in the Financial Services Sector, March 2024
EEOC Letter (from U.S. senators re: hiring software)
Questions and Answers to Clarify and Provide a Common Interpretation of the Uniform Guidelines on Employee Selection Procedures
Obama White House Archives, Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy, February 2012
Office of Management and Budget (OMB)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Office of Management and Budget (OMB)

M-21-06 (November 17, 2020), Memorandum for the Heads of Executive Departments and Agencies, Guidance for Regulation of Artificial Intelligence Applications
M-24-18 (September 24, 2024), Memorandum for the Heads of Executive Departments and Agencies, Advancing the Responsible Acquisition of Artificial Intelligence in Government

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

Office of Science and Technology Policy (OSTP)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Office of Science and Technology Policy (OSTP)

Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People
Framework to Advance AI Governance and Risk Management in National Security
National Science and Technology Council (NSTC)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Office of Science and Technology Policy (OSTP) / National Science and Technology Council (NSTC)

Select Committee on Artificial Intelligence, National Artificial Intelligence Research and Development Strategic Plan 2023 Update

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

FACT SHEET: Biden-Harris Administration Outlines Coordinated Approach to Harness Power of AI for U.S. National Security, October 24, 2024
Supervisory Guidance on Model Risk Management
Advisory Bulletin AB 2013-07 Model Risk Management Guidance
Supervisory Guidance on Model Risk Management
Artificial Intelligence/Machine Learning (AI/ML)-Based: Software as a Medical Device (SaMD) Action Plan, updated January 2021
Business Blog:

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Business Blog

2020-04-08 Using Artificial Intelligence and Algorithms
2021-01-11 Facing the facts about facial recognition
2021-04-19 Aiming for truth, fairness, and equity in your company’s use of AI
2022-07-11 Location, health, and other sensitive information: FTC committed to fully enforcing the law against illegal use and sharing of highly sensitive data
2023-07-25 Protecting the privacy of health information: A baker’s dozen takeaways from FTC cases
2023-08-16 Can’t lose what you never had: Claims about digital ownership and creation in the age of generative AI
2023-08-22 For business opportunity sellers, FTC says “AI” stands for “allegedly inaccurate”
2023-09-15 Updated FTC-HHS publication outlines privacy and security laws and rules that impact consumer health data
2023-09-18 Companies warned about consequences of loose use of consumers’ confidential data
2023-09-27 Could PrivacyCon 2024 be the place to present your research on AI, privacy, or surveillance?
2022-05-20 Security Beyond Prevention: The Importance of Effective Breach Disclosures
2023-02-01 Security Principles: Addressing underlying causes of risk in complex systems
2023-06-29 Generative AI Raises Competition Concerns
2023-12-19 Coming face to face with Rite Aid’s allegedly unfair use of facial recognition technology

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

Children's Online Privacy Protection Rule ("COPPA")
Privacy Policy
Software as a Medical Device (SAMD) guidance (December 8, 2017)
Artificial Intelligence: An Accountability Framework for Federal Agencies and Other Entities, GAO-21-519SP
Veteran Suicide: VA Efforts to Identify Veterans at Risk through Analysis of Health Record Information
Central Security Service, Artificial Intelligence Security Center
Final Report
2021 Model Risk Management Handbook
The AIM Initiative: A Strategy for Augmenting Intelligence Using Machines
Principles of Artificial Intelligence Ethics for the Intelligence Community
SEC Charges Two Investment Advisers with Making False and Misleading Statements About Their Use of Artificial Intelligence
Public Views on Artificial Intelligence and Intellectual Property Policy
Design principles
California Consumer Privacy Act (CCPA)
California Department of Justice, How to Read a Privacy Policy
California Department of Technology, GenAI Executive Order
California Privacy Rights Act (CPRA)
Department of Technology, Office of Information Security, Generative Artificial Intelligence Risk Assessment, SIMM 5305-F, March 2024
Legislative Research Commission, Research Report No. 491, Executive Branch Use of Artificial Intelligence Technology
Mississippi Department of Education, Artificial Intelligence Guidance for K-12 Classrooms
New York City Automated Decision Systems Task Force Report (November 2019)
RE: Use of External Consumer Data and Information Sources in Underwriting for Life Insurance
Federal Reserve Bank of Dallas, Regulation B, Equal Credit Opportunity, Credit Scoring Interpretations: Withdrawal of Proposed Business Credit Amendments, June 3, 1982
Questions from the Commission on Protecting Privacy and Preventing Discrimination
Assessment List for Trustworthy Artificial Intelligence (ALTAI) for self-assessment - Shaping Europe’s digital future - European Commission
Proposal for a Regulation laying down harmonised rules on artificial intelligence (Artificial Intelligence Act)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance / Proposal for a Regulation laying down harmonised rules on artificial intelligence (Artificial Intelligence Act)

Amendments adopted by the European Parliament on 14 June 2023 on the proposal for a regulation of the European Parliament and of the Council on laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Official Policy, Frameworks, and Guidance

The Digital Services Act package (EU Digital Services Act and Digital Markets Act)
Civil liability regime for artificial intelligence
European Parliament, Addressing AI risks in the workplace: Workers and algorithms
European Parliament, The impact of the General Data Protection Regulation (GDPR) on artificial intelligence
European Commission, Analysis of the preliminary AI standardisation work plan in support of the AI Act
European Commission, Communication from the Commission (4/25/2018), Artificial Intelligence for Europe
European Commission, European approach to artificial intelligence
European Commission, Hiroshima Process International Guiding Principles for Advanced AI system
European Commission, Data Protection Certification Mechanisms: Study on Articles 42 and 43 of the Regulation (EU) 2016/679
Proposal for a directive on adapting non-contractual civil liability rules to artificial intelligence: Complementary impact assessment
Artificial intelligence act: Council and Parliament strike a deal on the first rules for AI in the world
Data Protection Authority of Belgium General Secretariat, Artificial Intelligence Systems and the GDPR: A Data Protection Perspective
European Data Protection Board (EDPB), AI Auditing documents
European Data Protection Supervisor, First EDPS Orientations for EUIs using Generative AI
European Labour Authority (ELA), Artificial Intelligence and Algorithms in Risk Assessment: Addressing Bias, Discrimination and other Legal and Ethical Issues
AI, data governance and privacy: Synergies and areas of international co-operation, June 26, 2024
The Bias Assessment Metrics and Measures Repository
Measuring the environmental impacts of artificial intelligence compute and applications
Open, Useful and Re-usable data (OURdata) Index: 2019 - Policy Paper
AI in Precision Persuasion. Unveiling Tactics and Risks on Social Media
Narrative Detection and Topic Modelling in the Baltics
UNESCO, Artificial Intelligence: examples of ethical dilemmas
UNESCO, Consultation paper on AI regulation: emerging approaches across the world
Office of the United Nations High Commissioner for Human Rights

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Law Texts and Drafts

Algorithmic Accountability Act of 2023
Arizona, House Bill 2685
Australia, Data Availability and Transparency Act 2022
Australia, Privacy Act 1988
California, Civil Rights Council - First Modifications to Proposed Employment Regulations on Automated-Decision Systems, Title 2, California Code of Regulations
California, Consumer Privacy Act of 2018, Civil Code - DIVISION 3. OBLIGATIONS [1427 - 3273.69]
Colorado, SB24-205 Consumer Protections for Artificial Intelligence, Concerning consumer protections in interactions with artificial intelligence systems
European Union, General Data Protection Regulation (GDPR)

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Law Texts and Drafts / European Union, General Data Protection Regulation (GDPR)

Article 22 EU GDPR "Automated individual decision-making, including profiling"

Awesome Machine Learning Interpretability / Community and Official Guidance Resources / Law Texts and Drafts

Executive Order 13960 (2020-12-03), Promoting the Use of Trustworthy Artificial Intelligence in the Federal Government
Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence
Facial Recognition and Biometric Technology Moratorium Act of 2020
Federal Consumer Online Privacy Rights Act (COPRA)
Germany, Bundesrat Drucksache 222/24 - Entwurf eines Gesetzes zum strafrechtlichen Schutz von Persönlichkeitsrechten vor Deepfakes (Draft Law on the Criminal Protection of Personality Rights from Deepfakes)
Illinois, Biometric Information Privacy Act
Justice in Policing Act
National Conference of State Legislatures (NCSL) 2020 Consumer Data Privacy Legislation
Nebraska, LB1203 - Regulate artificial intelligence in media and political advertisements under the Nebraska Political Accountability and Disclosure Act
Rhode Island, Executive Order 24-06: Artificial Intelligence and Data Centers of Excellence
Virginia, Consumer Data Protection Act
Washington State, SB 6513 - 2019-20
United States Congress, 118th Congress (2023-2024), H.R.5586 - DEEPFAKES Accountability Act
United States Congress, 118th Congress (2023-2024), H.R. 9720, AI Incident Reporting and Security Enhancement Act
United States Congress, 118th Congress (2023-2024), S.4769 - VET Artificial Intelligence Act

Awesome Machine Learning Interpretability / Education Resources / Comprehensive Software Examples and Tutorials

COMPAS Analysis Using Aequitas
Explaining Quantitative Measures of Fairness (with SHAP) (see the SHAP sketch at the end of this section)
Getting a Window into your Black Box Model
H2O.ai, From GLM to GBM Part 1
H2O.ai, From GLM to GBM Part 2
IML
Interpretable Machine Learning with Python
Interpreting Machine Learning Models with the iml Package
Interpretable Machine Learning using Counterfactuals
Machine Learning Explainability by Kaggle Learn
Model Interpretability with DALEX

Awesome Machine Learning Interpretability / Education Resources / Comprehensive Software Examples and Tutorials / Model Interpretability with DALEX

The Importance of Human Interpretable Machine Learning
Model Interpretation Strategies
Hands-on Machine Learning Model Interpretation
Interpreting Deep Learning Models for Computer Vision

Awesome Machine Learning Interpretability / Education Resources / Comprehensive Software Examples and Tutorials

Partial Dependence Plots in R
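
The entry above covers partial dependence in R; as a rough Python counterpart, here is an illustrative sketch only, assuming scikit-learn and matplotlib are installed and using a synthetic dataset in place of the tutorial's data:

import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import PartialDependenceDisplay

# Fit any model, then plot one-way partial dependence for selected features
X, y = make_regression(n_samples=500, n_features=4, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)
PartialDependenceDisplay.from_estimator(model, X, features=[0, 1])
plt.show()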

Awesome Machine Learning Interpretability / Education Resources / Comprehensive Software Examples and Tutorials / PiML:

PiML Medium Tutorials
PiML-Toolbox Examples

Awesome Machine Learning Interpretability / Education Resources / Comprehensive Software Examples and Tutorials

Reliable-and-Trustworthy-AI-Notebooks
Saliency Maps for Deep Learning
Visualizing ML Models with LIME
Visualizing and debugging deep convolutional networks
What does a CNN see?
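
Several of the tutorials in this section (the SHAP fairness notebook, the LIME visualizations) follow the same post hoc pattern: fit a model, wrap it in an explainer, and inspect per-feature attributions. A minimal sketch of that pattern, assuming the shap and scikit-learn packages and an illustrative synthetic dataset rather than the data used in the linked tutorials:

import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Fit a simple tree ensemble on synthetic data
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Tree-specific SHAP explainer: per-row, per-feature attributions
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global summary of feature importance and direction of effect
shap.summary_plot(shap_values, X)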

Awesome Machine Learning Interpretability / Education Resources / Free-ish Books

César A. Hidalgo, Diana Orghian, Jordi Albo-Canals, Filipa de Almeida, and Natalia Martin, 2021, How Humans Judge Machines
Charles Perrow, 1984, Normal Accidents: Living with High-Risk Technologies
Charles Perrow, 1999, Normal Accidents: Living with High-Risk Technologies with a New Afterword and a Postscript on the Y2K Problem
Christoph Molnar, 2021, Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

Awesome Machine Learning Interpretability / Education Resources / Free-ish Books / Christoph Molnar, 2021, Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

christophM/interpretable-ml-book

Awesome Machine Learning Interpretability / Education Resources / Free-ish Books

Deborah G. Johnson and Keith W. Miller, 2009, Computer Ethics: Analyzing Information Technology, Fourth Edition
Ed Dreby and Keith Helmuth (contributors) and Judy Lumb (editor), 2009, Fueling Our Future: A Dialogue about Technology, Ethics, Public Policy, and Remedial Action
Ethics for people who work in tech
Florence G'sell, Regulating under Uncertainty: Governance Options for Generative AI
George Reynolds, 2002, Ethics in Information Technology
George Reynolds, 2002, Ethics in Information Technology, Instructor's Edition
Joseph Weizenbaum, 1976, Computer Power and Human Reason: From Judgment to Calculation
Kenneth Vaux (editor), 1970, Who Shall Live? Medicine, Technology, Ethics
Kush R. Varshney, 2022, Trustworthy Machine Learning: Concepts for Developing Accurate, Fair, Robust, Explainable, Transparent, Inclusive, Empowering, and Beneficial Machine Learning Systems
Marsha Cook Woodbury, 2003, Computer and Information Ethics
M. David Ermann, Mary B. Williams, and Claudio Gutierrez, 1990, Computers, Ethics, and Society
Morton E. Winston and Ralph D. Edelbach, 2000, Society, Ethics, and Technology, First Edition
Morton E. Winston and Ralph D. Edelbach, 2003, Society, Ethics, and Technology, Second Edition
Morton E. Winston and Ralph D. Edelbach, 2006, Society, Ethics, and Technology, Third Edition
Patrick Hall and Navdeep Gill, 2019, An Introduction to Machine Learning Interpretability: An Applied Perspective on Fairness, Accountability, Transparency, and Explainable AI, Second Edition
Patrick Hall, Navdeep Gill, and Benjamin Cox, 2021, Responsible Machine Learning: Actionable Strategies for Mitigating Risks & Driving Adoption
Paula Boddington, 2017, Towards a Code of Ethics for Artificial Intelligence
Przemyslaw Biecek and Tomasz Burzykowski, 2020, Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models. With examples in R and Python
Przemyslaw Biecek, 2023, Adversarial Model Analysis
Raymond E. Spier (editor), 2003, Science and Technology Ethics
Richard A. Spinello, 1995, Ethical Aspects of Information Technology
Richard A. Spinello, 1997, Case Studies in Information and Computer Ethics
Richard A. Spinello, 2003, Case Studies in Information Technology Ethics, Second Edition
Solon Barocas, Moritz Hardt, and Arvind Narayanan, 2022, Fairness and Machine Learning: Limitations and Opportunities
Soraj Hongladarom and Charles Ess, 2007, Information Technology Ethics: Cultural Perspectives
Stephen H. Unger, 1982, Controlling Technology: Ethics and the Responsible Engineer, First Edition
Stephen H. Unger, 1994, Controlling Technology: Ethics and the Responsible Engineer, Second Edition

Awesome Machine Learning Interpretability / Education Resources / Glossaries and Dictionaries

A.I. For Anyone: The A-Z of AI
The Alan Turing Institute: Data science and AI glossary
Appen Artificial Intelligence Glossary
Artificial intelligence and illusions of understanding in scientific research (glossary on second page)
Brookings: The Brookings glossary of AI and emerging technologies
Built In, Responsible AI Explained
Center for Security and Emerging Technology: Glossary
Chief Digital and Artificial Intelligence Office (CDAO), Generative Artificial Intelligence Lexicon
CompTIA: Artificial Intelligence (AI) Terminology: A Glossary for Beginners
Council of Europe Artificial Intelligence Glossary
Coursera: Artificial Intelligence (AI) Terms: A to Z Glossary
Dataconomy: AI dictionary: Be a native speaker of Artificial Intelligence
Dennis Mercadal, 1990, Dictionary of Artificial Intelligence
European Commission, EU-U.S. Terminology and Taxonomy for Artificial Intelligence - Second Edition
European Commission, Glossary of human-centric artificial intelligence
G2: 70+ A to Z Artificial Intelligence Terms in Technology
General Services Administration: AI Guide for Government: Key AI terminology
Google Developers Machine Learning Glossary
H2O.ai Glossary
IAPP Glossary of Privacy Terms
IAPP International Definitions of Artificial Intelligence
IAPP Key Terms for AI Governance
IBM AI glossary
IEEE, A Glossary for Discussion of Ethics of Autonomous and Intelligent Systems, Version 1
ISO/IEC DIS 22989(en) Information technology — Artificial intelligence — Artificial intelligence concepts and terminology
Jerry M. Rosenberg, 1986, Dictionary of Artificial Intelligence & Robotics
MakeUseOf: A Glossary of AI Jargon: 29 AI Terms You Should Know
Moveworks: AI Terms Glossary
National Institute of Standards and Technology (NIST), NIST AI 100-2 E2023: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations
National Institute of Standards and Technology (NIST), The Language of Trustworthy AI: An In-Depth Glossary of Terms
Oliver Houdé, 2004, Dictionary of Cognitive Science: Neuroscience, Psychology, Artificial Intelligence, Linguistics, and Philosophy
Open Access Vocabulary
Otto Vollnhals, 1992, A Multilingual Dictionary of Artificial Intelligence (English, German, French, Spanish, Italian)
Raoul Smith, 1989, The Facts on File Dictionary of Artificial Intelligence
Raoul Smith, 1990, Collins Dictionary of Artificial Intelligence
Salesforce: AI From A to Z: The Generative AI Glossary for Business Leaders
Siemens, Artificial Intelligence Glossary
Stanford University HAI Artificial Intelligence Definitions
TechTarget: Artificial intelligence glossary: 60+ terms to know
TELUS International: 50 AI terms every beginner should know
Towards AI, Generative AI Terminology — An Evolving Taxonomy To Get You Started
UK Parliament, Artificial intelligence (AI) glossary
University of New South Wales, Bill Wilson, The Machine Learning Dictionary
VAIR (Vocabulary of AI Risks)
Wikipedia: Glossary of artificial intelligence
William J. Raynor, Jr, 1999, The International Dictionary of Artificial Intelligence, First Edition
William J. Raynor, Jr, 2009, International Dictionary of Artificial Intelligence, Second Edition

Awesome Machine Learning Interpretability / Education Resources / Open-ish Classes

An Introduction to Data Ethics
Awesome LLM Courses 104 about 1 month ago
AWS Skill Builder
Build a Large Language Model (From Scratch) 32,908 5 days ago
Carnegie Mellon University, Computational Ethics for NLP
Certified Ethical Emerging Technologist
Coursera, DeepLearning.AI, Generative AI for Everyone
Coursera, DeepLearning.AI, Generative AI with Large Language Models
Coursera, Google Cloud, Introduction to Generative AI
Coursera, Vanderbilt University, Prompt Engineering for ChatGPT
CS103F: Ethical Foundations of Computer Science
DeepLearning.AI
ETH Zürich ReliableAI 2022 Course Project repository 1 almost 2 years ago
Fairness in Machine Learning
Fast.ai Data Ethics course
Google Cloud Skills Boost

Awesome Machine Learning Interpretability / Education Resources / Open-ish Classes / Google Cloud Skills Boost

Attention Mechanism
Create Image Captioning Models
Encoder-Decoder Architecture
Introduction to Generative AI
Introduction to Image Generation
Introduction to Large Language Models
Introduction to Responsible AI
Introduction to Vertex AI Studio
Transformer Models and BERT Model

Awesome Machine Learning Interpretability / Education Resources / Open-ish Classes

Grow with Google, Generative AI for Educators
Human-Centered Machine Learning
IBM SkillsBuild
Introduction to AI Ethics
INFO 4270: Ethics and Policy in Data Science
Introduction to Responsible Machine Learning
Jay Alammar, Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
Machine Learning Fairness by Google
OECD.AI, Disability-Centered AI And Ethics MOOC
Piotr Sapieżyński's CS 4910 - Special Topics in Computer Science: Algorithm Audits
Tech & Ethics Curricula
Trustworthy Deep Learning

Awesome Machine Learning Interpretability / Education Resources / Podcasts and Channels

Internet of Bugs
Tech Won't Save Us
This Is Technology Ethics: An Introduction

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / AI Incident Information Sharing Resources

AI Incident Database (Responsible AI Collaborative)
AI Vulnerability Database (AVID)
AIAAIC
AI Badness: An open catalog of generative AI badness
AI Risk Database
Atlas of AI Risks
Brennan Center for Justice, Artificial Intelligence Legislation Tracker
EthicalTech@GW, Deepfakes & Democracy Initiative
George Washington University Law School's AI Litigation Database
Merging AI Incidents Research with Political Misinformation Research: Introducing the Political Deepfakes Incidents Database
Mitre's AI Risk Database 50 7 months ago
OECD AI Incidents Monitor
Verica Open Incident Database (VOID)
AI Ethics Issues in Real World: Evidence from AI Incident Database
The Atlas of AI Incidents in Mobile Computing: Visualizing the Risks and Benefits of AI Gone Mobile
Artificial Intelligence Incidents & Ethics: A Narrative Review
Artificial Intelligence Safety and Cybersecurity: A Timeline of AI Failures
Deployment Corrections: An Incident Response Framework for Frontier AI Models
Exploring Trust With the AI Incident Database
Indexing AI Risks with Incidents, Issues, and Variants
Good Systems, Bad Data?: Interpretations of AI Hype and Failures
How Does AI Fail Us? A Typological Theorization of AI Failures
Omission and Commission Errors Underlying AI Failures
Ontologies for Reasoning about Failures in AI Systems
Planning for Natural Language Failures with the AI Playbook
Preventing Repeated Real World AI Failures by Cataloging Incidents: The AI Incident Database
SoK: How Artificial-Intelligence Incidents Can Jeopardize Safety and Security
Understanding and Avoiding AI Failures: A Practical Guide
When Your AI Becomes a Target: AI Security Incidents and Best Practices
Why We Need to Know More: Exploring the State of AI Incident Documentation Practices

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / AI Law, Policy, and Guidance Trackers

Access Now, Regulatory Mapping on Artificial Intelligence in Latin America: Regional AI Public Policy Report
The Ethical AI Database
George Washington University Law School's AI Litigation Database
International Association of Privacy Professionals (IAPP), Global AI Legislation Tracker
International Association of Privacy Professionals (IAPP), UK data protection reform: An overview
International Association of Privacy Professionals (IAPP), US State Privacy Legislation Tracker
Institute for the Future of Work, Tracking international legislation relevant to AI at work
Legal Nodes, Global AI Regulations Tracker: Europe, Americas & Asia-Pacific Overview
MIT AI Risk Repository
National Conference of State Legislatures, Deceptive Audio or Visual Media (‘Deepfakes’) 2024 Legislation
OECD.AI, National AI policies & strategies
Raymond Sun, Global AI Regulation Tracker
Runway Strategies, Global AI Regulation Tracker
University of North Texas, Artificial Intelligence (AI) Policy Collection
VidhiSharma.AI, Global AI Governance Tracker
White & Case, AI Watch: Global regulatory tracker - United States

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / Challenges and Competitions

FICO Explainable Machine Learning Challenge
OSD Bias Bounty
National Fair Housing Alliance Hackathon
Twitter Algorithmic Bias

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / Critiques of AI

Against predictive optimization
AI can only do 5% of jobs, says MIT economist who fears tech stock crash
AI chatbots use racist stereotypes even after anti-racism training
AI coding assistants do not boost productivity or prevent burnout, study finds
AI hype as a cyber security risk: the moral responsibility of implementing generative AI in business
AI hype, promotional culture, and affective capitalism
AI Is a Lot of Work
AI is effectively ‘useless’—and it’s created a ‘fake it till you make it’ bubble that could end in disaster, veteran market watcher warns
AI Safety Is a Narrative Problem
AI Snake Oil
AI Tools Still Permitting Political Disinfo Creation, NGO Warns
Anthropomorphism in AI: hype and fallacy
Are Emergent Abilities of Large Language Models a Mirage?
Are Language Models Actually Useful for Time Series Forecasting?
Artificial Hallucinations in ChatGPT: Implications in Scientific Writing
Artificial intelligence and illusions of understanding in scientific research
Artificial Intelligence: Hope for Future or Hype by Intellectuals?
Artificial intelligence-powered chatbots in search engines: a cross-sectional study on the quality and risks of drug information for patients
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Aylin Caliskan's publications
Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks
Beyond Preferences in AI Alignment
Chatbots in consumer finance
ChatGPT is bullshit
Companies like Google and OpenAI are pillaging the internet and pretending it’s progress
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
The Cult of AI
Data and its (dis)contents: A survey of dataset development and use in machine learning research
The Data Scientific Method vs. The Scientific Method
Ed Zitron's Where's Your Ed At
Emergent and Predictable Memorization in Large Language Models
Evaluating Language-Model Agents on Realistic Autonomous Tasks
FABLES: Evaluating faithfulness and content selection in book-length summarization
The Fallacy of AI Functionality
Futurism, Disillusioned Businesses Discovering That AI Kind of Sucks
Gen AI: Too Much Spend, Too Little Benefit?
Generative AI: UNESCO study reveals alarming evidence of regressive gender stereotypes
Get Ready for the Great AI Disappointment
Ghost in the Cloud: Transhumanism’s simulation theology
Handling the hype: Implications of AI hype for public interest tech projects
The harms of terminology: why we should reject so-called “frontier AI”
HealthManagement.org, The Journal, Volume 19, Issue 2, 2019, Artificial Hype
How AI hype impacts the LGBTQ + community
How AI lies, cheats, and grovels to succeed - and what we need to do about it
Identifying and Eliminating CSAM in Generative ML Training Data and Models
Insanely Complicated, Hopelessly Inadequate
Internet of Bugs, Debunking Devin: "First AI Software Engineer" Upwork lie exposed! (video)
It’s Time to Stop Taking Sam Altman at His Word
I Will Fucking Piledrive You If You Mention AI Again
Julia Angwin, Press Pause on the Silicon Valley Hype Machine
Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Lazy use of AI leads to Amazon products called “I cannot fulfill that request”
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks
Long-context LLMs Struggle with Long In-context Learning
Low-Resource Languages Jailbreak GPT-4
Machine Learning: The High Interest Credit Card of Technical Debt
Measuring the predictability of life outcomes with a scientific mass collaboration
Meta AI Chief: Large Language Models Won't Achieve AGI
Meta’s AI chief: LLMs will never reach human-level intelligence
MIT Technology Review, Introducing: The AI Hype Index
Most CEOs aren’t buying the hype on generative AI benefits
Nepotistically Trained Generative-AI Models Collapse
Non-discrimination Criteria for Generative Language Models
OECD, Measuring the environmental impacts of artificial intelligence compute and applications
OpenAI—written evidence (LLM0113), House of Lords Communications and Digital Select Committee inquiry: Large language models

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / Critiques of AI / OpenAI—written evidence (LLM0113), House of Lords Communications and Digital Select Committee inquiry: Large language models

Former OpenAI Researcher Says the Company Broke Copyright Law

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / Critiques of AI

Open Problems in Technical AI Governance
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
The perpetual motion machine of AI-generated data and the distraction of ChatGPT as a ‘scientist’
Pivot to AI
Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
The Price of Emotion: Privacy, Manipulation, and Bias in Emotional AI
Promising the future, encoding the past: AI hype and public media imagery
Quantifying Memorization Across Neural Language Models
Re-evaluating GPT-4’s bar exam performance
Researchers surprised by gender stereotypes in ChatGPT
Ryan Allen, Explainable AI: The What’s and Why’s, Part 1: The What
Sam Altman’s imperial reach
Scalable Extraction of Training Data from (Production) Language Models
Speed of AI development stretches risk assessments to breaking point
Talking existential risk into being: a Habermasian critical discourse perspective to AI hype
Task Contamination: Language Models May Not Be Few-Shot Anymore
Theory Is All You Need: AI, Human Cognition, and Decision Making
There Is No A.I.
This AI Pioneer Thinks AI Is Dumber Than a Cat
Three different types of AI hype in healthcare
Toward Sociotechnical AI: Mapping Vulnerabilities for Machine Learning in Context
We still don't know what generative AI is good for
What’s in a Name? Experimental Evidence of Gender Bias in Recommendation Letters Generated by ChatGPT
Which Humans?
Why the AI Hype is Another Tech Bubble
Why We Must Resist AI’s Soft Mind Control
Winner's Curse? On Pace, Progress, and Empirical Rigor
A bottle of water per email: the hidden environmental costs of using AI chatbots
AI already uses as much energy as a small country. It’s only the beginning.
The AI Carbon Footprint and Responsibilities of AI Scientists
AI, Climate, and Regulation: From Data Centers to the AI Act
Beyond CO2 Emissions: The Overlooked Impact of Water Consumption of Information Retrieval Models
The carbon impact of artificial intelligence
Data centre water consumption
Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI
Environment and sustainability development: A ChatGPT perspective
The Environmental Impact of AI: A Case Study of Water Consumption by Chat GPT
The Environmental Price of Intelligence: Evaluating the Social Cost of Carbon in Machine Learning
Generative AI’s environmental costs are soaring — and mostly secret
The Hidden Environmental Impact of AI
Making AI Less "Thirsty": Uncovering and Addressing the Secret Water Footprint of AI Models
Microsoft’s Hypocrisy on AI
The mechanisms of AI hype and its planetary and social costs
Power Hungry Processing: Watts Driving the Cost of AI Deployment?
Promoting Sustainability: Mitigating the Water Footprint in AI-Embedded Data Centres
Sustainable AI: AI for sustainability and the sustainability of AI
Sustainable AI: Environmental Implications, Challenges and Opportunities
Toward Responsible AI Use: Considerations for Sustainability Impact Assessment
Towards A Comprehensive Assessment of AI's Environmental Impact
Towards Environmentally Equitable AI via Geographical Load Balancing
Unraveling the Hidden Environmental Impacts of AI Solutions for Environment Life Cycle Assessment of AI Solutions

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / Groups and Organizations

AI Forum New Zealand, AI Governance Working Group
AI Village
Center for Advancing Safety of Machine Intelligence
Center for AI and Digital Policy
Center for Democracy and Technology
Center for Security and Emerging Technology
Future of Life Institute
Institute for Advanced Study (IAS), AI Policy and Governance Working Group
Partnership on AI

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / Curated Bibliographies

Proposed Guidelines for Responsible Use of Explainable Machine Learning (presentation, bibliography) 20 over 4 years ago
Proposed Guidelines for Responsible Use of Explainable Machine Learning (paper, bibliography) 17 almost 2 years ago
A Responsible Machine Learning Workflow (paper, bibliography) 13 over 4 years ago
Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) Scholarship

Awesome Machine Learning Interpretability / AI Incidents, Critiques, and Research Resources / List of Lists

A Living and Curated Collection of Explainable AI Methods
AI Ethics Guidelines Global Inventory
AI Ethics Resources
AI Tools and Platforms
Awesome AI Guidelines 1,265 4 days ago
Awesome interpretable machine learning 906 over 1 year ago
Awesome-explainable-AI 1,422 23 days ago
Awesome-ML-Model-Governance 100 7 months ago
Awesome MLOps 12,623 5 months ago
Awesome Production Machine Learning 17,606 4 days ago
Awful AI 6,982 7 months ago
Casey Fiesler's AI Ethics & Policy News spreadsheet
criticalML 367 almost 6 years ago
Ethics for people who work in tech
Evaluation Repository for 'Sociotechnical Safety Evaluation of Generative AI Systems'
IMDA-BTG, LLM-Evals-Catalogue 14 about 1 year ago
Machine Learning Ethics References 59 over 1 year ago
Machine Learning Interpretability Resources 484 almost 4 years ago
NIST, Comments Received for RFI on Artificial Intelligence Risk Management Framework
OECD-NIST Catalogue of AI Tools and Metrics
OpenAI Cookbook 59,807 8 days ago
private-ai-resources 471 over 4 years ago
Ravit Dotan's Resources
ResponsibleAI
Tech & Ethics Curricula
Worldwide AI ethics: A review of 200 guidelines and recommendations for AI governance
XAI Resources 822 over 2 years ago
xaience 107 about 1 year ago

Awesome Machine Learning Interpretability / Technical Resources / Benchmarks

benchm-ml 1,869 about 2 years ago
Bias Benchmark for QA dataset (BBQ) 87 11 months ago
Cataloguing LLM Evaluations 14 about 1 year ago
DecodingTrust 259 2 months ago
EleutherAI, Language Model Evaluation Harness 6,970 6 days ago
GEM
HELM
Hugging Face, evaluate 2,034 2 months ago (usage sketch after this list)
i-gallegos, Fair-LLM-Benchmark 110 about 1 year ago
MLCommons, MLCommons AI Safety v0.5 Proof of Concept
MLCommons, Introducing v0.5 of the AI Safety Benchmark from MLCommons
Nvidia MLPerf
OpenML Benchmarking Suites
Real Toxicity Prompts (Allen Institute for AI)
SafetyPrompts.com
Sociotechnical Safety Evaluation Repository
TrustLLM-Benchmark
Trust-LLM-Benchmark Leaderboard
TruthfulQA 618 about 1 year ago
WAVES: Benchmarking the Robustness of Image Watermarks
Wild-Time: A Benchmark of in-the-Wild Distribution Shifts over Time 61 over 1 year ago
Winogender Schemas 68 over 5 years ago
yandex-research / tabred 56 7 days ago
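Several of the benchmark tools above, such as the Hugging Face evaluate entry flagged earlier in this list, expose metrics through a simple load/compute pattern; a minimal sketch, assuming the evaluate package is installed:

```python
# Hugging Face `evaluate` sketch: load a metric by name and score predictions.
import evaluate

accuracy = evaluate.load("accuracy")
result = accuracy.compute(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print(result)  # {'accuracy': 0.75}
```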

Awesome Machine Learning Interpretability / Technical Resources / Common or Useful Datasets

Adult income dataset
Balanced Faces in the Wild 46 about 2 years ago
Bruegel, A dataset on EU legislation for the digital world
COMPAS Recidivism Risk Score Data and Analysis

Awesome Machine Learning Interpretability / Technical Resources / Common or Useful Datasets / COMPAS Recidivism Risk Score Data and Analysis / :

All Lending Club loan data
Amazon Open Data
Data.gov
Home Mortgage Disclosure Act (HMDA) Data
MIMIC-III Clinical Database
UCI ML Data Repository

Awesome Machine Learning Interpretability / Technical Resources / Common or Useful Datasets

FANNIE MAE Single Family Loan Performance
Have I Been Trained?
nikhgarg / EmbeddingDynamicStereotypes 159 almost 2 years ago
Presidential Deepfakes Dataset
NYPD Stop, Question and Frisk Data
socialfoundations / folktables 241 6 months ago (usage sketch after this list)
Statlog (German Credit Data)
Wikipedia Talk Labels: Personal Attacks
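The folktables entry above packages newer ACS census tasks as replacements for the aging Adult dataset; a minimal sketch of its documented usage (the state and survey year are arbitrary choices, and the first call downloads data):

```python
# folktables sketch: build the ACSIncome prediction task for one state.
from folktables import ACSDataSource, ACSIncome

data_source = ACSDataSource(survey_year="2018", horizon="1-Year", survey="person")
acs_data = data_source.get_data(states=["CA"], download=True)

# features, binary income labels, and a group attribute useful for fairness audits
features, labels, groups = ACSIncome.df_to_numpy(acs_data)
print(features.shape, labels.mean())
```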

Awesome Machine Learning Interpretability / Technical Resources / Machine Learning Environment Management Tools

dvc
gigantum
mlflow (tracking sketch after this list)
mlmd 626 28 days ago
modeldb 1,702 4 months ago
neptune
Opik 2,121 6 days ago
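Most of the experiment trackers above follow a run/log pattern; as one hedged example (flagged at the mlflow entry above), MLflow's tracking API:

```python
# MLflow tracking sketch: record parameters and metrics for one run.
import mlflow

with mlflow.start_run(run_name="interpretability-demo"):
    mlflow.log_param("model", "gbm")     # illustrative parameter names/values
    mlflow.log_param("max_depth", 5)
    mlflow.log_metric("auc", 0.87)
# Results land in the local ./mlruns directory by default; browse them with `mlflow ui`.
```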

Awesome Machine Learning Interpretability / Technical Resources / Personal Data Protection Tools

LLM Dataset Inference: Did you train on my dataset? 23 4 months ago

Awesome Machine Learning Interpretability / Technical Resources / Open Source/Access Responsible AI Software Packages

DiscriLens 7 over 1 year ago
Hugging Face, BiasAware: Dataset Bias Detection
manifold 1,651 over 1 year ago
PAIR-code / datacardsplaybook 169 6 months ago
PAIR-code / facets 7,357 over 1 year ago
PAIR-code / knowyourdata 286 about 2 years ago
TensorBoard Projector
What-if Tool
Born-again Tree Ensembles 64 over 1 year ago
Certifiably Optimal RulE ListS 172 about 3 years ago
Secure-ML 38 about 5 years ago
LDNOOBW
acd 125 about 3 years ago
aequitas 694 2 months ago
AI Fairness 360 2,457 5 months ago
AI Explainability 360 1,633 4 months ago
ALEPython 158 over 1 year ago
Aletheia 70 10 months ago
allennlp 11,757 almost 2 years ago
algofairness
See Algorithmic Fairness: http://fairness.haverford.edu/
Alibi 2,414 4 months ago
anchor 798 over 2 years ago
Bayesian Case Model
Bayesian Ors-Of-Ands 34 over 2 years ago
Bayesian Rule List (BRL)
BlackBoxAuditing 130 over 1 year ago
CalculatedContent, WeightWatcher 1,470 2 months ago
casme 73 over 4 years ago
Causal Discovery Toolbox 1,128 8 months ago
captum 4,931 6 days ago
causalml 5,095 13 days ago
cdt15, Causal Discovery Lab., Shiga University
checklist 2,010 11 months ago
cleverhans 6,202 8 months ago
contextual-AI 87 over 1 year ago
ContrastiveExplanation (Foil Trees) 45 almost 2 years ago
counterfit 806 about 1 year ago
dalex 1,375 about 2 months ago
debiaswe 243 over 1 year ago
DeepExplain 734 about 4 years ago
DeepLIFT 826 over 2 years ago
deepvis 4,019 almost 5 years ago
DIANNA 48 8 days ago
DiCE 1,364 7 months ago
DoWhy 7,119 15 days ago
dtreeviz 2,965 3 months ago
ecco 1,985 3 months ago
eli5 2,757 over 2 years ago
explabox 15 about 1 month ago
Explainable Boosting Machine (EBM)/GA2M 6,296 3 days ago
ExplainaBoard 361 over 1 year ago
explainerdashboard 2,311 4 months ago
explainX 417 3 months ago
fair-classification 189 almost 3 years ago
fairml 360 over 3 years ago
fairlearn 1,948 5 days ago
fairness-comparison 159 almost 2 years ago
fairness_measures_code 38 8 months ago
Falling Rule List (FRL)
foolbox 2,771 8 months ago
Giskard 4,071 6 days ago
Grad-CAM (GitHub topic)
gplearn 1,615 12 months ago
H2O-3 6,922 8 days ago
h2o-LLM-eval 50 28 days ago
hate-functional-tests 56 almost 3 years ago
imodels 1,399 15 days ago
iNNvestigate neural nets 1,265 11 months ago
Integrated-Gradients 598 over 2 years ago
interpret 6,296 3 days ago
interpret_with_rules 21 4 months ago
InterpretME 25 8 months ago
Keras-vis 2,982 almost 3 years ago
keract 1,045 3 months ago
L2X 124 over 3 years ago
Learning to Explain: An Information-Theoretic Perspective on Model Interpretation: "Code for replicating the experiments in the ICML 2018 paper by Jianbo Chen, Mitchell Stern, Martin J. Wainwright, and Michael I. Jordan."
LangFair 57 9 days ago
langtest 501 9 days ago
learning-fair-representations 26 over 4 years ago
"Python numba implementation of Zemel et al. 2013": http://www.cs.toronto.edu/~toni/Papers/icml-final.pdf
leeky: Leakage/contamination testing for black box language models 6 9 months ago
leondz / garak, LLM vulnerability scanner 1,471 6 days ago
lilac 969 8 months ago
lime 11,615 4 months ago
LiFT 168 over 1 year ago
lit 3,492 15 days ago
LLM Dataset Inference: Did you train on my dataset? 23 4 months ago
lofo-importance 817 10 months ago
lrp_toolbox 330 over 2 years ago
MindsDB 26,793 6 days ago
mlxtend
ml-fairness-gym 312 over 1 year ago
ml_privacy_meter 604 6 days ago
mllp 22 9 months ago
Transparent Classification with Multilayer Logical Perceptrons and Random Binarization: "This is a PyTorch implementation of the Multilayer Logical Perceptron (MLLP) and Random Binarization (RB) method to learn Concept Rule Sets (CRS) for transparent classification tasks, as described in our paper."
Monotonic Constraints
XGBoost (monotone-constraints sketch after this list)
OptBinning 457 24 days ago
Optimal Sparse Decision Trees 100 over 1 year ago
"Optimal Sparse Decision Trees" "This accompanies the paper, by Xiyang Hu, Cynthia Rudin, and Margo Seltzer.”
parity-fairness
PDPbox 845 3 months ago
PiML-Toolbox 1,204 24 days ago
pjsaelin / Cubist 43 8 days ago
Privacy-Preserving-ML 1 over 1 year ago
ProtoPNet
pyBreakDown 41 over 1 year ago
See dalex.
PyCEbox 165 over 4 years ago
pyGAM 875 5 months ago
pymc3 8,722 3 days ago
pySS3 336 over 1 year ago
pytorch-grad-cam 10,575 about 1 month ago
pytorch-innvestigate 9 over 5 years ago
"PyTorch implementation of the already existing Keras project": https://github.com/albermax/innvestigate/ 1,265 11 months ago
Quantus 556 12 days ago
rationale 355 over 6 years ago
responsibly 94 about 1 year ago
REVISE: REvealing VIsual biaSEs 111 over 2 years ago
robustness 918 11 months ago
MadryLab: "a package we (students in the MadryLab) created to make training, evaluating, and exploring neural networks flexible and easy."
RISE 155 over 4 years ago
Vitali Petsiuk: "contains source code necessary to reproduce some of the main results in the RISE paper (BMVC, 2018)."
Risk-SLIM 132 over 1 year ago
SAGE 253 10 days ago
SALib 885 about 1 month ago
Scikit-Explain
Scikit-learn
scikit-fairness 29 almost 4 years ago
Historical link; merged with fairlearn.
scikit-multiflow
shap 22,876 12 days ago (usage sketch after this list)
shapley 218 over 1 year ago
sklearn-expertsys 489 over 7 years ago
skope-rules 625 10 months ago
solas-ai-disparity 33 7 months ago
Super-sparse Linear Integer models (SLIMs) 41 about 1 year ago
tensorflow/lattice 518 4 months ago
tensorflow/lucid 4,673 almost 2 years ago
tensorflow/fairness-indicators 343 7 days ago
tensorflow/model-analysis 1,258 16 days ago
tensorflow/model-card-toolkit 425 over 1 year ago
tensorflow/model-remediation 43 over 1 year ago
tensorflow/privacy 1,943 17 days ago
tensorflow/tcav 632 4 months ago
tensorfuzz 208 about 6 years ago
TensorWatch 3,419 about 1 year ago
TextFooler 494 almost 2 years ago
text_explainability
text_sensitivity
tf-explain 1,018 6 months ago
Themis 101 over 4 years ago
themis-ml 124 about 4 years ago
TorchUncertainty 304 8 days ago
treeinterpreter 744 over 1 year ago
TRIAGE 8 8 months ago
woe 256 about 5 years ago
xai 1,125 about 3 years ago
xdeep 42 over 4 years ago
xplique 644 about 1 month ago
ydata-profiling 12,536 8 days ago
yellowbrick 4,293 about 2 months ago
ALEPlot
arules
Causal SVM 5 over 6 years ago
DALEX 1,375 about 2 months ago
DALEXtra: Extension for 'DALEX' Package
DrWhyAI 680 over 1 year ago
elasticnet
ExplainPrediction 2 over 7 years ago
Explainable Boosting Machine (EBM)/GA2M
fairmodels 86 about 2 years ago
fairness
fastshap 116 9 months ago
featureImportance 33 over 3 years ago
flashlight 22 4 months ago
forestmodel
fscaret
gam
glm2
glmnet
H2O-3 6,922 8 days ago
iBreakDown 81 12 months ago
ICEbox: Individual Conditional Expectation Plot Toolbox
iml 492 about 1 month ago
ingredients 37 over 1 year ago
interpret: Fit Interpretable Machine Learning Models
lightgbmExplainer 23 over 5 years ago
lime 485 over 2 years ago
live
mcr 8 almost 5 years ago
modelDown
modelOriented
modelStudio 326 about 1 year ago
Monotonic
quantreg
rpart
RuleFit
Scalable Bayesian Rule Lists (SBRL)
shapFlex 71 over 4 years ago
shapleyR 25 over 5 years ago
shapper
smbinning
vip 186 about 1 year ago
xgboostExplainer 252 over 6 years ago
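To make two of the packages in the list above concrete: the shap entry (flagged earlier in this list) follows an explainer/values/plot pattern; a minimal sketch for a tree model, with an illustrative dataset and XGBoost classifier as assumptions:

```python
# SHAP sketch for a tree-based model: compute and plot SHAP values.
import shap
import xgboost
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgboost.XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # one attribution per feature per row
shap.summary_plot(shap_values, X)       # global importance/direction overview
```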
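And for the Monotonic Constraints / XGBoost entries above, constraints are declared per feature (+1 non-decreasing, -1 non-increasing, 0 unconstrained); a hedged sketch on synthetic data:

```python
# Monotonic-constraint sketch: force the model to be non-decreasing in x0
# and non-increasing in x1, while leaving x2 unconstrained.
import numpy as np
import xgboost

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.1, size=1000)

model = xgboost.XGBRegressor(
    n_estimators=200,
    monotone_constraints=(1, -1, 0),  # one entry per feature, in column order
)
model.fit(X, y)
```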
