awesome-mlops

MLOps toolbox

A curated collection of tools and resources supporting the development and deployment of machine learning models and workflows

A curated list of awesome MLOps tools

GitHub

4k stars
89 watching
569 forks
Language: Python
last commit: about 2 months ago
Linked from 3 awesome lists

ai, awesome, data-science, machine-learning, machine-learning-engineering, ml, mle, mlops

Awesome MLOps / AutoML

AutoGluon 8,039 6 days ago Automated machine learning for image, text, tabular, time-series, and multi-modal data
AutoKeras 9,154 16 days ago The goal of AutoKeras is to make machine learning accessible to everyone
AutoPyTorch 2,376 8 months ago Automatic architecture search and hyperparameter optimization for PyTorch
AutoSKLearn 7,632 6 days ago Automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator
EvalML 778 6 days ago A library that builds, optimizes, and evaluates ML pipelines using domain-specific functions
FLAML 3,919 8 days ago Finds accurate ML models automatically, efficiently and economically
H2O AutoML Automates ML workflow, which includes automatic training and tuning of models
MindsDB 26,793 6 days ago AI layer for databases that allows you to effortlessly develop, train and deploy ML models
MLBox 1,500 over 1 year ago MLBox is a powerful Automated Machine Learning Python library
Model Search 3,268 4 months ago Framework that implements AutoML algorithms for model architecture search at scale
NNI 14,054 5 months ago An open source AutoML toolkit to automate the machine learning lifecycle

Awesome MLOps / CI/CD for Machine Learning

ClearML 5,684 10 days ago Auto-Magical CI/CD to streamline your ML workflow
CML 4,038 20 days ago Open-source library for implementing CI/CD in machine learning projects
KitOps 495 6 days ago Open source MLOps project that eases model handoffs between data scientists and DevOps

Awesome MLOps / Cron Job Monitoring

Cronitor Monitor any cron job or scheduled task
HealthchecksIO Simple and effective cron job monitoring

Awesome MLOps / Data Catalog

Amundsen Data discovery and metadata engine for improving productivity when interacting with data
Apache Atlas Provides open metadata management and governance capabilities to build a data catalog
CKAN 4,476 6 days ago Open-source DMS (data management system) for powering data hubs and data portals
DataHub 9,916 4 days ago LinkedIn's generalized metadata search & discovery tool
Magda 514 4 days ago A federated, open-source data catalog for all your big data and small data
Metacat 1,610 13 days ago Unified metadata exploration API service for Hive, RDS, Teradata, Redshift, S3 and Cassandra
OpenMetadata A single place to discover, collaborate and get your data right

Awesome MLOps / Data Enrichment

Snorkel 5,809 7 months ago A system for quickly generating training data with weak supervision
Upgini 319 6 days ago Enriches training datasets with features from public and community shared data sources

Awesome MLOps / Data Exploration

Apache Zeppelin Enables data-driven, interactive data analytics and collaborative documents
BambooLib 939 9 months ago An intuitive GUI for Pandas DataFrames
DataPrep 2,068 5 months ago Collect, clean and visualize your data in Python
Google Colab Hosted Jupyter notebook service that requires no setup to use
Jupyter Notebook Web-based notebook environment for interactive computing
JupyterLab The next-generation user interface for Project Jupyter
Jupytext 6,649 3 months ago Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
Pandas Profiling 12,536 8 days ago Create HTML profiling reports from pandas DataFrame objects
Polynote The polyglot notebook with first-class Scala support

Awesome MLOps / Data Management

Arrikto Dead simple, ultra fast storage for the hybrid Kubernetes world
BlazingSQL 1,933 about 2 years ago A lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF
Delta Lake 7,593 6 days ago Storage layer that brings scalable, ACID transactions to Apache Spark and other engines
Dolt 17,965 6 days ago SQL database that you can fork, clone, branch, merge, push and pull just like a git repository
Dud 183 10 days ago A lightweight CLI tool for versioning data alongside source code and building data pipelines
DVC Management and versioning of datasets and machine learning models
Git LFS An open source Git extension for versioning large files
Hub 8,171 10 days ago A dataset format for creating, storing, and collaborating on AI datasets of any size
Intake 1,013 8 days ago A lightweight set of tools for loading and sharing data in data science projects
lakeFS 4,458 4 days ago Repeatable, atomic and versioned data lake on top of object storage
Marquez 1,779 8 days ago Collect, aggregate, and visualize a data ecosystem's metadata
Milvus 30,573 6 days ago An open source embedding vector similarity search engine powered by Faiss, NMSLIB and Annoy
Pinecone Managed and distributed vector similarity search used with a lightweight SDK
Qdrant 20,607 4 days ago An open source vector similarity search engine with extended filtering support
Quilt 1,330 4 days ago A self-organizing data hub with S3 support

Awesome MLOps / Data Processing

Airflow Platform to programmatically author, schedule, and monitor workflows
Azkaban 4,467 5 months ago Batch workflow job scheduler created at LinkedIn to run Hadoop jobs
Dagster 11,699 6 days ago A data orchestrator for machine learning, analytics, and ETL
Hadoop Framework that allows for the distributed processing of large data sets across clusters
OpenRefine 10,914 3 days ago Power tool for working with messy data and improving it
Spark Unified analytics engine for large-scale data processing

Awesome MLOps / Data Validation

Cerberus 3,168 3 months ago Lightweight, extensible data validation library for Python
Cleanlab 9,756 29 days ago Python library for data-centric AI and machine learning with messy, real-world data and labels
Great Expectations A Python data validation framework that lets you test your data against datasets
JSON Schema A vocabulary that allows you to annotate and validate JSON documents
TFDV 765 20 days ago A library for exploring and validating machine learning data

Awesome MLOps / Data Visualization

Count SQL/drag-and-drop querying and visualisation tool based on notebooks
Dash 21,482 9 days ago Analytical Web Apps for Python, R, Julia, and Jupyter
Data Studio Reporting solution for power users who want to go beyond the data and dashboards of GA
Facets 7,357 over 1 year ago Visualizations for understanding and analyzing machine learning datasets
Grafana Multi-platform open source analytics and interactive visualization web application
Lux 5,210 8 months ago Fast and easy data exploration by automating the visualization and data analysis process
Metabase The simplest, fastest way to get business intelligence and analytics to everyone
Redash Connect to any data source, easily visualize, dashboard and share your data
SolidUI 579 10 months ago AI-generated visualization prototyping and editing platform that supports 2D and 3D models
Superset Modern, enterprise-ready business intelligence web application
Tableau Powerful and fastest growing data visualization tool used in the business intelligence industry

Awesome MLOps / Drift Detection

Alibi Detect 2,247 24 days ago An open source Python library focused on outlier, adversarial and drift detection
Frouros 193 6 days ago An open source Python library for drift detection in machine learning systems
TorchDrift 312 about 2 years ago A data and concept drift library for PyTorch

Awesome MLOps / Feature Engineering

Feature Engine 1,926 13 days ago Feature engineering package with scikit-learn-like functionality
Featuretools 7,270 8 days ago Python library for automated feature engineering
TSFresh 8,435 7 days ago Python library for automatic extraction of relevant features from time series

Awesome MLOps / Feature Store

Butterfree 283 about 1 month ago A tool for building feature stores. Transform your raw data into beautiful features
ByteHub 58 over 3 years ago An easy-to-use feature store. Optimized for time-series data
Feast End-to-end open source feature store for machine learning
Feathr 1,986 8 months ago An enterprise-grade, high performance feature store
Featureform 1,818 9 days ago A Virtual Feature Store. Turn your existing data infrastructure into a feature store
Tecton A fully-managed feature platform built to orchestrate the complete lifecycle of features

Awesome MLOps / Hyperparameter Tuning

Advisor 1,547 about 5 years ago Open-source implementation of Google Vizier for hyperparameter tuning
Hyperas 2,178 almost 2 years ago A very simple wrapper for convenient hyperparameter optimization
Hyperopt 7,258 24 days ago Distributed Asynchronous Hyperparameter Optimization in Python
Katib 1,509 16 days ago Kubernetes-based system for hyperparameter tuning and neural architecture search
KerasTuner 2,862 4 months ago Easy-to-use, scalable hyperparameter optimization framework
Optuna Open source hyperparameter optimization framework to automate hyperparameter search
Scikit Optimize 2,744 9 months ago Simple and efficient library to minimize expensive and noisy black-box functions
Talos 1,625 7 months ago Hyperparameter Optimization for TensorFlow, Keras and PyTorch
Tune Python library for experiment execution and hyperparameter tuning at any scale

Awesome MLOps / Knowledge Sharing

Knowledge Repo 5,482 3 months ago Knowledge sharing platform for data scientists and other technical professions
Kyso One place for data insights so your entire team can learn from your data

Awesome MLOps / Machine Learning Platform

aiWARE aiWARE helps MLOps teams evaluate, deploy, integrate, scale & monitor ML models
Algorithmia Securely govern your machine learning operations with a healthy ML lifecycle
Allegro AI Transform ML/DL research into products. Faster
Bodywork Deploys machine learning projects developed in Python, to Kubernetes
CNVRG An end-to-end machine learning platform to build and deploy AI models at scale
DAGsHub A platform built on open source tools for data, model and pipeline management
Dataiku Platform democratizing access to data and enabling enterprises to build their own path to AI
DataRobot AI platform that democratizes data science and automates the end-to-end ML at scale
Domino One place for your data science tools, apps, results, models, and knowledge
Edge Impulse Platform for creating, optimizing, and deploying AI/ML algorithms for edge devices
envd 2,038 about 2 months ago Machine learning development environment for data science and AI/ML engineering teams
FedML Simplifies the workflow of federated learning anywhere at any scale
Gradient Multicloud CI/CD and MLOps platform for machine learning teams
H2O Open source leader in AI with a mission to democratize AI for everyone
Hopsworks Open-source platform for developing and operating machine learning models at scale
Iguazio Data science platform that automates MLOps with end-to-end machine learning pipelines
Katonic Automate your cycle of intelligence with Katonic MLOps Platform
Knime Create and productionize data science using one easy and intuitive environment
Kubeflow Making deployments of ML workflows on Kubernetes simple, portable and scalable
LynxKite A complete graph data science platform for very large graphs and other datasets
ML Workspace 3,434 4 months ago All-in-one web-based IDE specialized for machine learning and data science
MLReef 1,442 about 2 years ago Open source MLOps platform that helps you collaborate, reproduce and share your ML work
Modzy Deploy, connect, run, and monitor machine learning (ML) models in the enterprise and at the edge
Neu.ro MLOps platform that integrates open-source and proprietary tools into client-oriented systems
Omnimizer Simplifies and accelerates MLOps by bridging the gap between ML models and edge hardware
Pachyderm Combines data lineage with end-to-end pipelines on Kubernetes, engineered for the enterprise
Polyaxon A platform for reproducible and scalable machine learning and deep learning on Kubernetes
Sagemaker Fully managed service that provides the ability to build, train, and deploy ML models quickly
SAS Viya Cloud native AI, analytic and data management platform that supports the analytics life cycle
Sematic An open-source end-to-end pipelining tool to go from laptop prototype to cloud in no time
SigOpt A platform that makes it easy to track runs, visualize training, and scale hyperparameter tuning
TrueFoundry A Cloud-native MLOps Platform over Kubernetes to simplify training and serving of ML Models
Valohai Takes you from POC to production while managing the whole model lifecycle

Awesome MLOps / Model Fairness and Privacy

AIF360 2,457 5 months ago A comprehensive set of fairness metrics for datasets and machine learning models
Fairlearn 1,948 5 days ago A Python package to assess and improve fairness of machine learning models
Opacus 1,716 17 days ago A library that enables training PyTorch models with differential privacy
TensorFlow Privacy 1,943 17 days ago Library for training machine learning models with privacy for training data

Awesome MLOps / Model Interpretability

Alibi 2,414 4 months ago Open-source Python library enabling ML model inspection and interpretation
Captum 4,931 6 days ago Model interpretability and understanding library for PyTorch
ELI5 262 5 months ago Python package which helps to debug machine learning classifiers and explain their predictions
InterpretML 6,296 3 days ago A toolkit to help understand models and enable responsible machine learning
LIME 11,615 4 months ago Explaining the predictions of any machine learning classifier
Lucid 4,673 almost 2 years ago Collection of infrastructure and tools for research in neural network interpretability
SAGE 253 10 days ago For calculating global feature importance using Shapley values
SHAP 22,876 12 days ago A game theoretic approach to explain the output of any machine learning model

Awesome MLOps / Model Lifecycle

Aeromancy 10 about 2 months ago A framework for performing reproducible AI and ML for Weights and Biases
Aim 5,220 6 days ago A super-easy way to record, search and compare 1000s of ML training runs
Cascade 22 4 days ago Library of ML-Engineering tools for rapid prototyping and experiment management
Comet 151 9 days ago Track your datasets, code changes, experimentation history, and models
Guild AI Open source experiment tracking, pipeline automation, and hyperparameter tuning
Keepsake 1,650 3 months ago Version control for machine learning with support for Amazon S3 and Google Cloud Storage
Losswise Makes it easy to track the progress of a machine learning project
MLflow Open source platform for the machine learning lifecycle
ModelDB 1,702 4 months ago Open source ML model versioning, metadata, and experiment management
Neptune AI The most lightweight experiment management tool that fits any workflow
Sacred 4,254 about 1 month ago A tool to help you configure, organize, log and reproduce experiments
Weights and Biases 9,152 3 days ago A tool for visualizing and tracking your machine learning experiments

Awesome MLOps / Model Serving

Banana Host your ML inference code on serverless GPUs and integrate it into your app with one line of code
Beam Develop on serverless GPUs, deploy highly performant APIs, and rapidly prototype ML models
BentoML 7,153 6 days ago Open-source platform for high-performance ML model serving
BudgetML 1,338 9 months ago Deploy an ML inference service on a budget in less than 10 lines of code
Cog 8,081 6 days ago Open-source tool that lets you package ML models in a standard, production-ready container
Cortex Machine learning model serving infrastructure
Geniusrise Host inference APIs, bulk inference and fine tune text, vision, audio and multi-modal models
Gradio 33,962 6 days ago Create customizable UI components around your models
GraphPipe Machine learning model deployment made simple
Hydrosphere 271 24 days ago Platform for deploying your Machine Learning to production
KFServing 3,637 5 days ago Kubernetes custom resource definition for serving ML models on arbitrary frameworks
LocalAI 25,827 4 days ago Drop-in replacement REST API that’s compatible with OpenAI API specifications for inferencing
Merlin 167 5 days ago A platform for deploying and serving machine learning models
MLEM 716 about 1 year ago Version and deploy your ML models following GitOps principles
Opyrator 3,102 29 days ago Turns your ML code into microservices with web API, interactive GUI, and more
PredictionIO 12,545 almost 4 years ago Event collection, deployment of algorithms, evaluation, querying predictive results via APIs
Quix Serverless platform for processing data streams in real-time with machine learning models
Rune 136 over 2 years ago Provides containers to encapsulate and deploy EdgeML pipelines and applications
Seldon Take your ML projects from POC to production with maximum efficiency and minimal risk
Streamlit 35,705 6 days ago Lets you create apps for your ML projects with deceptively simple Python scripts
TensorFlow Serving Flexible, high-performance serving system for ML models, designed for production
TorchServe 4,217 29 days ago A flexible and easy to use tool for serving PyTorch models
Triton Inference Server 8,342 3 days ago Provides an optimized cloud and edge inferencing solution
Vespa 5,812 4 days ago Store, search, organize and make machine-learned inferences over big data at serving time
Wallaroo.AI A platform for deploying, serving, and optimizing ML models in both cloud and edge environments

Awesome MLOps / Model Testing & Validation

Deepchecks 3,623 8 days ago Open-source package for validating ML models & data, with various checks and suites
Starwhale 208 4 months ago An MLOps/LLMOps platform for model building, evaluation, and fine-tuning
Trubrics 133 6 months ago Validate machine learning with data science and domain expert feedback

Awesome MLOps / Optimization Tools

Accelerate 7,947 6 days ago A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
Dask Provides advanced parallelism for analytics, enabling performance at scale for the tools you love
DeepSpeed 35,463 7 days ago Deep learning optimization library that makes distributed training easy, efficient, and effective
Fiber Python distributed computing library for modern computer clusters
Horovod 14,265 3 months ago Distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet
Mahout Distributed linear algebra framework and mathematically expressive Scala DSL
MLlib Apache Spark's scalable machine learning library
Modin 9,892 2 months ago Speed up your Pandas workflows by changing a single line of code
Nebullvm 8,375 4 months ago Easy-to-use library to boost AI inference
Nos 630 7 months ago Open-source module for running AI workloads on Kubernetes in an optimized way
Petastorm 1,799 12 months ago Enables single machine or distributed training and evaluation of deep learning models
Rapids Gives the ability to execute end-to-end data science and analytics pipelines entirely on GPUs
Ray 33,994 6 days ago Fast and simple framework for building and running distributed applications
Singa Apache top level project, focusing on distributed training of DL and ML models
Tpot 9,736 4 months ago Automated ML tool that optimizes machine learning pipelines using genetic programming

Awesome MLOps / Simplification Tools

Chassis Turns models into ML-friendly containers that run just about anywhere
Hermione 207 over 1 year ago Helps data scientists set up more organized code in a quicker and simpler way
Hydra 8,812 5 days ago A framework for elegantly configuring complex applications
Koalas 3,336 8 months ago Pandas API on Apache Spark. Makes data scientists more productive when interacting with big data
Ludwig 11,189 24 days ago Allows users to train and test deep learning models without the need to write code
MLNotify 343 over 2 years ago No need to keep checking your training, just one import line and you'll know the second it's done
PyCaret Open source, low-code machine learning library in Python
Sagify 435 9 months ago A CLI utility to train and deploy ML/DL models on AWS SageMaker
Soopervisor 45 2 months ago Export ML projects to Kubernetes (Argo workflows), Airflow, AWS Batch, and SLURM
Soorgeon 78 2 months ago Convert monolithic Jupyter notebooks into maintainable pipelines
TrainGenerator 1,368 about 1 year ago A web app to generate template code for machine learning
Turi Create 11,202 about 1 year ago Simplifies the development of custom machine learning models

Awesome MLOps / Visual Analysis and Debugging

Aporia Observability with customized monitoring and explainability for ML models
Arize A free end-to-end ML observability and model monitoring platform
Evidently 5,391 7 days ago Interactive reports to analyze ML models during validation or production monitoring
Fiddler Monitor, explain, and analyze your AI in production
Manifold 1,651 over 1 year ago A model-agnostic visual debugging tool for machine learning
NannyML 1,971 17 days ago Algorithm capable of fully capturing the impact of data drift on performance
Netron 28,134 6 days ago Visualizer for neural network, deep learning, and machine learning models
Opik 2,121 6 days ago Evaluate, test, and ship LLM applications with a suite of observability tools
Phoenix MLOps in a Notebook for troubleshooting and fine-tuning generative LLM, CV, and tabular models
Radicalbit 71 6 days ago The open source solution for monitoring your AI models in production
Superwise Fully automated, enterprise-grade model observability in a self-service SaaS platform
Whylogs 2,653 3 days ago The open source standard for data logging. Enables ML monitoring and observability
Yellowbrick 4,293 about 2 months ago Visual analysis and diagnostic tools to facilitate machine learning model selection

Awesome MLOps / Workflow Tools

Argo 15,082 7 days ago Open source container-native workflow engine for orchestrating parallel jobs on Kubernetes
Automate Studio Rapidly build & deploy AI-powered workflows
Couler 915 about 1 month ago Unified interface for constructing and managing workflows on different workflow engines
dstack 1,533 4 days ago An open-core tool to automate data and training workflows
Flyte Easy to create concurrent, scalable, and maintainable workflows for machine learning
Hamilton 1,861 7 days ago A scalable general purpose micro-framework for defining dataflows
Kale 632 almost 2 years ago Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows
Kedro 10,004 6 days ago Library that implements software engineering best-practice for data and ML pipelines
Luigi 17,869 9 days ago Python module that helps you build complex pipelines of batch jobs
Metaflow Human-friendly lib that helps scientists and engineers build and manage data science projects
MLRun 1,446 4 days ago Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines
Orchest 4,079 over 1 year ago Visual pipeline editor and workflow orchestrator with an easy to use UI and based on Kubernetes
Ploomber 3,510 2 months ago Write maintainable, production-ready pipelines. Develop locally, deploy to the cloud
Prefect A workflow management system, designed for modern infrastructure
VDP 2,147 7 days ago An open-source tool to seamlessly integrate AI for unstructured data into the modern data stack
Wordware A web-hosted IDE where non-technical domain experts can build task-specific AI agents
ZenML 4,071 6 days ago An extensible open-source MLOps framework to create reproducible pipelines

Resources / Articles

A Tour of End-to-End Machine Learning Platforms (Databaseline)
Continuous Delivery for Machine Learning (Martin Fowler)
Delivering on the Vision of MLOps: A maturity-based approach (GigaOm)
Machine Learning Operations (MLOps): Overview, Definition, and Architecture (arXiv)
MLOps Roadmap: A Complete MLOps Career Guide (Scaler Blogs)
MLOps: Continuous delivery and automation pipelines in machine learning (Google)
MLOps: Machine Learning as an Engineering Discipline (Medium)
Rules of Machine Learning: Best Practices for ML Engineering (Google)
The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction (Google)
What Is MLOps? (NVIDIA)

Resources / Books

Beginning MLOps with MLFlow (Apress)
Building Machine Learning Pipelines (O'Reilly)
Building Machine Learning Powered Applications (O'Reilly)
Deep Learning in Production (AI Summer)
Designing Machine Learning Systems (O'Reilly)
Engineering MLOps (Packt)
Implementing MLOps in the Enterprise (O'Reilly)
Introducing MLOps (O'Reilly)
Kubeflow for Machine Learning (O'Reilly)
Kubeflow Operations Guide (O'Reilly)
Machine Learning Design Patterns (O'Reilly)
Machine Learning Engineering in Action (Manning)
ML Ops: Operationalizing Data Science (O'Reilly)
MLOps Engineering at Scale (Manning)
MLOps Lifecycle Toolkit (Apress)
Practical Deep Learning at Scale with MLflow (Packt)
Practical MLOps (O'Reilly)
Production-Ready Applied Deep Learning (Packt)
Reliable Machine Learning (O'Reilly)
The Machine Learning Solutions Architect Handbook (Packt)

Resources / Events

apply() - The ML data engineering conference
MLOps Conference - Keynotes and Panels
MLOps World: Machine Learning in Production Conference
NormConf - The Normcore Tech Conference
Stanford MLSys Seminar Series

Resources / Other Lists

Applied ML 27,322 4 months ago
Awesome AutoML Papers 4,023 5 months ago
Awesome AutoML 865 about 2 months ago
Awesome Data Science 25,157 14 days ago
Awesome DataOps 158 about 1 month ago
Awesome Deep Learning 24,271 7 months ago
Awesome Game Datasets 748 11 days ago (includes AI content)
Awesome Machine Learning 66,046 10 days ago
Awesome MLOps 12,623 5 months ago
Awesome Production Machine Learning 17,606 4 days ago
Awesome Python 225,227 3 months ago
Deep Learning in Production 4,306 12 days ago

Resources / Podcasts

How AI Built This
Kubernetes Podcast from Google
Machine Learning – Software Engineering Daily
MLOps.community
Pipeline Conversation
Practical AI: Machine Learning, Data Science
This Week in Machine Learning & AI
True ML Talks

Resources / Slack

Kubeflow Workspace
MLOps Community Workspace

Resources / Websites

Feature Stores for ML
Made with ML 37,603 3 months ago
ML-Ops
MLOps Community
MLOps Guide
MLOps Now
