awesome-seml

Machine Learning guidelines

A curated list of articles and resources on software engineering best practices for building machine learning applications.

GitHub: 1k stars, 45 watching, 123 forks, last commit 9 months ago, linked from 3 awesome lists

awesome, awesome-list, deep-learning, machine-learning, ml-ops, software-engineering

Awesome Software Engineering for Machine Learning / Broad Overviews

AI Engineering: 11 Foundational Practices ⭐
Best Practices for Machine Learning Applications
Engineering Best Practices for Machine Learning ⭐
Hidden Technical Debt in Machine Learning Systems 🎓⭐
Rules of Machine Learning: Best Practices for ML Engineering ⭐
Software Engineering for Machine Learning: A Case Study 🎓⭐

Awesome Software Engineering for Machine Learning / Data Management

A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective 🎓
Automating Large-Scale Data Quality Verification 🎓
Data management challenges in production machine learning
Data Validation for Machine Learning 🎓
How to organize data labelling for ML
The curse of big data labeling and three ways to solve it
The Data Linter: Lightweight, Automated Sanity Checking for ML Data Sets 🎓
The ultimate guide to data labeling for ML

Awesome Software Engineering for Machine Learning / Model Training

10 Best Practices for Deep Learning
Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement 🎓
Fairness On The Ground: Applying Algorithmic Fairness Approaches To Production Systems 🎓
How do you manage your Machine Learning Experiments?
Machine Learning Testing: Survey, Landscapes and Horizons 🎓
Nitpicking Machine Learning Technical Debt
On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach 🎓⭐
On human intellect and machine failures: Troubleshooting integrative machine learning systems 🎓
Pitfalls and Best Practices in Algorithm Configuration 🎓
Pitfalls of supervised feature selection 🎓
Preparing and Architecting for Machine Learning
Preliminary Systematic Literature Review of Machine Learning System Development Process 🎓
Software development best practices in a deep learning environment
Testing and Debugging in Machine Learning
What Went Wrong and Why? Diagnosing Situated Interaction Failures in the Wild 🎓

Awesome Software Engineering for Machine Learning / Deployment and Operation

Best Practices in Machine Learning Infrastructure
Building Continuous Integration Services for Machine Learning 🎓
Continuous Delivery for Machine Learning ⭐
Continuous Training for Production ML in the TensorFlow Extended (TFX) Platform 🎓
Fairness Indicators: Scalable Infrastructure for Fair ML Systems 🎓
Machine Learning Logistics
Machine learning: Moving from experiments to production
ML Ops: Machine Learning as an Engineering Discipline
Model Governance: Reducing the Anarchy of Production 🎓
ModelOps: Cloud-based lifecycle management for reliable and trusted AI
Operational Machine Learning
Scaling Machine Learning as a Service 🎓
TFX: A TensorFlow-Based Production-Scale ML Platform 🎓
The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction 🎓
Underspecification Presents Challenges for Credibility in Modern Machine Learning 🎓
Versioning for end-to-end machine learning pipelines 🎓

Awesome Software Engineering for Machine Learning / Social Aspects

Data Scientists in Software Teams: State of the Art and Challenges 🎓
Machine Learning Interviews (9,227 stars, last commit over 1 year ago)
Managing Machine Learning Projects
Principled Machine Learning: Practices and Tools for Efficient Collaboration

Awesome Software Engineering for Machine Learning / Governance

A Human-Centered Interpretability Framework Based on Weight of Evidence 🎓
An Architectural Risk Analysis Of Machine Learning Systems
Beyond Debiasing
Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing 🎓
Inherent trade-offs in the fair determination of risk scores 🎓
Responsible AI practices ⭐
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims
Understanding Software-2.0 🎓

Awesome Software Engineering for Machine Learning / Tooling

Aim - Aim is an open source experiment tracking tool
Airflow - Programmatically author, schedule and monitor workflows
Alibi Detect (2,262 stars, last commit 2 days ago) - Python library focused on outlier, adversarial and drift detection
Archai (468 stars, last commit about 2 months ago) - Neural architecture search
Data Version Control (DVC) - DVC is a data and ML experiments management tool
Facets Overview / Facets Dive - Robust visualizations to aid in understanding machine learning datasets
FairLearn - A toolkit to assess and improve the fairness of machine learning models
Git Large File Storage (LFS) - Replaces large files such as datasets with text pointers inside Git
Great Expectations (10,054 stars, last commit about 24 hours ago) - Data validation and testing with integration in pipelines
HParams (126 stars, last commit 29 days ago) - A thoughtful approach to configuration management for machine learning projects
Kubeflow - A platform for data scientists who want to build and experiment with ML pipelines
Label Studio (19,798 stars, last commit about 19 hours ago) - A multi-type data labeling and annotation tool with standardized output format
LiFT (167 stars, last commit over 1 year ago) - LinkedIn fairness toolkit
MLflow - Manage the ML lifecycle, including experimentation, deployment, and a central model registry
Model Card Toolkit (427 stars, last commit over 1 year ago) - Streamlines and automates the generation of model cards; for model documentation
Neptune.ai - Experiment tracking tool bringing organization and collaboration to data science projects
Neuraxle (610 stars, last commit over 1 year ago) - Sklearn-like framework for hyperparameter tuning and AutoML in deep learning projects
OpenML - An inclusive movement to build an open, organized, online ecosystem for machine learning
PyTorch Lightning (28,636 stars, last commit about 13 hours ago) - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate
REVISE: REvealing VIsual biaSEs (111 stars, last commit over 2 years ago) - Automatically detect bias in visual data sets
Robustness Metrics (466 stars, last commit 5 months ago) - Lightweight modules to evaluate the robustness of classification models
Seldon Core (4,409 stars, last commit 5 days ago) - An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models on Kubernetes
Spark Machine Learning - Spark's ML library consisting of common learning algorithms and utilities
TensorBoard - TensorFlow's Visualization Toolkit
TensorFlow Extended (TFX) - An end-to-end platform for deploying production ML pipelines
TensorFlow Data Validation (TFDV) (766 stars, last commit 28 days ago) - Library for exploring and validating machine learning data. Similar to Great Expectations, but for TensorFlow data
Weights & Biases - Experiment tracking, model optimization, and dataset versioning