Lesson 43 • Advanced
Monitoring Models in Production
Deploying a model is just the beginning. Learn to detect data drift, outliers, bias, and performance degradation before they cause real damage.
What You'll Learn in This Lesson
- How to detect data drift using PSI (Population Stability Index)
- Monitoring for concept drift with accuracy tracking over time
- Catching anomalous inputs before they corrupt predictions
- Fairness monitoring with disparate impact and equal opportunity metrics
- Building automated alerting pipelines for production ML
1️⃣ Why Models Degrade
A model that was 95% accurate at deployment can silently drop to 70% within weeks. The three main causes:
| Type | What Changes | Example |
|---|---|---|
| Data Drift | Input distribution shifts | New customer demographics |
| Concept Drift | Input→output relationship changes | "Good credit" threshold shifts |
| Feature Drift | Feature pipeline breaks | API returns null for a column |
💡 Pro Tip: Monitor all three – most teams only check accuracy and miss silent data issues.
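Concept drift is the hardest of the three to spot because the inputs still look normal; only accuracy decays. A minimal sketch of weekly accuracy tracking against a stored baseline (the decay curve, window sizes, and 5-point alert threshold here are illustrative, not from the lesson's examples):

```python
import numpy as np

np.random.seed(0)

# Simulate 10 weeks of production: true accuracy decays from ~0.95 to ~0.75
weeks = 10
true_acc = np.linspace(0.95, 0.75, weeks)
# Each week we grade 1000 predictions once ground-truth labels arrive
correct = [np.random.binomial(1000, p) / 1000 for p in true_acc]

BASELINE = 0.95    # accuracy recorded at deployment
ALERT_DROP = 0.05  # alert once accuracy falls 5 points below baseline

for week, acc in enumerate(correct, start=1):
    status = "ALERT" if acc < BASELINE - ALERT_DROP else "ok"
    print(f"Week {week:2d}: accuracy={acc:.3f} [{status}]")
```

The key design point is the stored baseline: without a snapshot of deployment-time accuracy, there is nothing to alert against.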
Try It: Data Drift Detection
Calculate PSI to detect when your input data shifts from the training distribution
import numpy as np
# ============================================
# DATA DRIFT DETECTION
# ============================================
# Data drift = input distribution changes over time
# Model was trained on summer data, now it's winter
np.random.seed(42)
print("=== Population Stability Index (PSI) ===")
print()
print("PSI measures how much the input distribution shifted.")
print("Think of it like checking if your customers changed.")
print()
def calculate_psi(reference, current, bins=10):
    """PSI = sum((actual% - expected%) * ln(actual% / expected%))."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    expected = np.histogram(reference, bins=edges)[0] / len(reference)
    actual = np.histogram(current, bins=edges)[0] / len(current)
    expected = np.clip(expected, 1e-6, None)  # avoid log(0) in empty bins
    actual = np.clip(actual, 1e-6, None)
    return np.sum((actual - expected) * np.log(actual / expected))

reference = np.random.normal(50, 10, 10_000)   # training distribution
current = np.random.normal(55, 12, 10_000)     # shifted production data
print(f"PSI = {calculate_psi(reference, current):.3f}")
print("Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 retrain")
2️⃣ Monitoring Architecture
A production monitoring system has these layers:
┌──────────────────────────────────────┐
│  Alerting (PagerDuty/Slack)          │
├──────────────────────────────────────┤
│  Dashboard (Grafana/DataDog)         │
├──────────────────────────────────────┤
│  Metric Store (Prometheus/BigQuery)  │
├──────────────────────────────────────┤
│  Collectors (input/output loggers)   │
├──────────────────────────────────────┤
│  Model Inference Service             │
└──────────────────────────────────────┘
Key metrics to track:
- Latency: p50, p95, p99 response times
- Throughput: Requests per second
- Error rate: Failed predictions / total
- Feature distributions: Mean, std, min, max per feature
- Prediction distribution: Class balance or output range
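The latency metrics above can be computed directly from a log of response times. A minimal sketch, where the simulated request log and the 200ms budget are illustrative assumptions:

```python
import numpy as np

np.random.seed(42)

# Illustrative request log: mostly fast responses plus a slow tail
latencies_ms = np.concatenate([
    np.random.gamma(shape=2.0, scale=20.0, size=950),  # typical traffic
    np.random.uniform(200, 500, size=50),              # slow outliers
])

p50, p95, p99 = np.percentile(latencies_ms, [50, 95, 99])
error_rate = 12 / len(latencies_ms)  # e.g. 12 failed predictions logged

print(f"p50={p50:.0f}ms  p95={p95:.0f}ms  p99={p99:.0f}ms")
print(f"error rate: {error_rate:.1%}")
if p99 > 200:
    print("ALERT: p99 latency above the 200ms budget")
```

Note why p99 matters: a healthy median can hide a tail of slow requests, and it is the tail that users (and timeouts) feel.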
Try It: Anomaly Detection in Production
Catch outlier inputs using z-score monitoring before they corrupt predictions
import numpy as np
# ============================================
# OUTLIER & ANOMALY DETECTION IN PRODUCTION
# ============================================
np.random.seed(42)
print("=== Production Anomaly Detection ===")
print()
print("Your model expects certain input ranges.")
print("Outliers can cause silent failures or wrong predictions.")
print()
# Simulate normal feature distributions from training
feature_names = ["age", "income", "credit_score", "loan_amount"]
train_means = [35, 55000, 680, 15000]
train_stds = [10, 20000, 80, 8000]

def check_outliers(sample, z_threshold=3.0):
    """Flag any feature whose z-score exceeds the threshold."""
    for name, value, mu, sigma in zip(feature_names, sample, train_means, train_stds):
        z = abs(value - mu) / sigma
        if z > z_threshold:
            print(f"ALERT: {name}={value} has z-score {z:.1f} (> {z_threshold})")

# A suspicious production request: impossible age, otherwise normal
check_outliers([200, 60000, 700, 14000])
Try It: Bias Monitoring
Track fairness metrics across demographic groups with disparate impact analysis
import numpy as np
# ============================================
# BIAS MONITORING IN PRODUCTION
# ============================================
np.random.seed(42)
print("=== Fairness Metrics Monitoring ===")
print()
print("Even a fair model at training can become biased in production")
print("as user demographics shift over time.")
print()
# Simulate loan approval predictions across groups
groups = {
"Group A (majority)": {"total": 5000, "approved": 3800, "default_rate": 0.05},
"Grou
...3οΈβ£ Common Mistakes
📌 Quick Reference – Model Monitoring
| Metric | Tool | Alert Threshold |
|---|---|---|
| PSI | Evidently, WhyLabs | > 0.25 → retrain |
| Accuracy | MLflow, Prometheus | < baseline - 5% |
| Latency p99 | Grafana, DataDog | > 200ms |
| Disparate Impact | Fairlearn, AIF360 | < 0.8 ratio |
| Feature Null % | Great Expectations | > 5% nulls |
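The thresholds in the quick reference can be wired into a simple automated check that runs after each monitoring window. A minimal sketch: the metric values are illustrative, and `send_alert` is a stub standing in for a real Slack/PagerDuty integration, not an actual client API:

```python
def send_alert(message):
    # Stub: in production this would call the Slack or PagerDuty API
    print(f"[ALERT] {message}")

# Checks mirroring the quick-reference table (baseline accuracy assumed 0.95)
CHECKS = [
    ("psi",              lambda v: v > 0.25, "PSI > 0.25: consider retraining"),
    ("accuracy",         lambda v: v < 0.90, "Accuracy below baseline - 5%"),
    ("latency_p99_ms",   lambda v: v > 200,  "p99 latency over 200ms"),
    ("disparate_impact", lambda v: v < 0.80, "Disparate impact ratio < 0.8"),
    ("null_pct",         lambda v: v > 0.05, "More than 5% null features"),
]

# Illustrative metrics from the latest monitoring window
metrics = {"psi": 0.31, "accuracy": 0.93, "latency_p99_ms": 180,
           "disparate_impact": 0.76, "null_pct": 0.01}

fired = [msg for name, breached, msg in CHECKS if breached(metrics[name])]
for msg in fired:
    send_alert(msg)
print(f"{len(fired)} of {len(CHECKS)} checks fired")
```

In a real pipeline this loop would run on a schedule (cron, Airflow, or the metric store's own alert rules) rather than inline.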
🎉 Lesson Complete!
You can now monitor ML models in production! Next, learn how to automate the entire ML lifecycle with MLOps pipelines.