Tutorial "Introduction to MLOps with MLflow"

Talk page: https://2022.pycon.de/program/DV8PJT/
Source code for tutorial: https://github.com/tsterbak/pydataberlin-2022

Author:

Tobias Sterbak
tobiassterbak.com
Blog: https://www.depends-on-the-definition.com/ (the tutorial is also available on the blog)

Agenda

Introduction to MLOps
MLflow components
Tracking ML experiments
Deployment and management
Tips and tricks

What is MLOps and why care?

A set of tools to:
Bring ML models to production
Maintain them
Monitor them
Reduce technical debt
Manage the model lifecycle

Exploratory data analysis → Data preparation / feature engineering → Model training / tuning → Model review and governance (MLflow) → [Something] → Monitoring → Automated model retraining

MLflow components

Tracking
Record and query experiments
Projects
Models
Registry

Hands-on!

https://github.com/tsterbak/pydataberlin-2022

MLflow store

Tracking stores:
File store or db store
Artifact stores
Store models, source code, plots etc.
S3, GCS etc.

Set up and configure a tracking server

Tracking

Scikit-learn: autologging is also available

import mlflow
import mlflow.sklearn

Basic things to track:
Parameters: Key-value input parameters: mlflow.log_param, mlflow.log_params
Metrics: Key-value metrics, where the value is numeric (can be updated over the run): mlflow.log_metric, mlflow.log_metrics

mlflow.log_metric("test_accuracy", test_score)  # <-- Track metrics
mlflow.log_param("num_samples", data.shape[0])  # <-- ADDED: track the number of samples in the dataset
mlflow.log_artifact("/path/to/file/1_Run and track experiments.ipynb")

Log the model

mlflow.sklearn.log_model(tree, "model")

Compare models in the UI

Add a signature to the model

2. How to deploy from MLflow with Python

Models
Registered Models