Coursera - Machine Learning in Production - Week 1 - Section 3 - Deployment
January 3, 2025
Week 1: Overview of the ML Lifecycle and Deployment
Section 3: Deployment
1. Key challenges
Software engineering issues
Checklist of questions
- Realtime or Batch
- Cloud vs. Edge/Browser
- Compute resources (CPU/GPU/memory)
- Latency, throughput (QPS)
- Logging
- Security and privacy
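A minimal sketch of how the checklist answers might be recorded as a deployment config; the class and field names below are illustrative, not from the course.

```python
from dataclasses import dataclass

@dataclass
class DeploymentConfig:
    """Hypothetical record of the checklist decisions above."""
    realtime: bool            # realtime serving vs. batch jobs
    target: str               # "cloud", "edge", or "browser"
    needs_gpu: bool           # compute resources: CPU-only vs. GPU
    max_latency_ms: int       # latency budget per request
    target_qps: int           # required throughput (queries per second)
    log_inputs: bool          # whether raw inputs may be logged
    inputs_contain_pii: bool  # security/privacy consideration

# Example: a cloud-served speech system with a 500 ms latency budget.
config = DeploymentConfig(
    realtime=True, target="cloud", needs_gpu=True,
    max_latency_ms=500, target_qps=1000,
    log_inputs=True, inputs_contain_pii=True,
)
print(config)
```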
2. Deployment patterns
Common deployment cases
1. New product/capability
2. Automate/assist with manual task
3. Replace previous ML system
Key ideas:
- Gradual ramp up with monitoring
- Rollback
Shadow mode deployment
- The ML system shadows the human and runs in parallel.
- The ML system's output is not used for any decisions during this phase.
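A minimal sketch of shadow mode, with hypothetical stand-ins for the human inspector and the model: the model runs on every example and its prediction is logged for comparison, but the returned decision is always the human's.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)

def human_decision(example):
    # Stand-in for the human inspector's judgment (hypothetical).
    return example["defect_score"] > 0.7

def shadow_model(example):
    # Stand-in for the ML model running in shadow mode (hypothetical).
    return example["defect_score"] > 0.5

def inspect(example):
    human = human_decision(example)
    shadow = shadow_model(example)   # runs in parallel with the human
    logging.info(json.dumps(
        {"human": human, "shadow": shadow, "agree": human == shadow}))
    return human                     # only the human's output drives decisions

inspect({"defect_score": 0.6})  # logs a disagreement; the decision stays human
```

Comparing the logged shadow predictions against the human labels is what lets you judge whether the model is good enough to start serving real traffic.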
Canary deployment
- Roll out to small fraction (say 5%) of traffic initially.
- Monitor system and ramp up traffic gradually.
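A minimal sketch of canary routing with hypothetical old/new model handlers: a small fraction of requests goes to the new version, and the fraction is raised as the monitored metrics stay healthy.

```python
import random

CANARY_FRACTION = 0.05   # start with ~5% of traffic

def old_model(x):
    return "old:" + x    # placeholder for the current production model

def new_model(x):
    return "new:" + x    # placeholder for the candidate model

def serve(x):
    # Send a small random fraction of requests to the canary.
    if random.random() < CANARY_FRACTION:
        return new_model(x)
    return old_model(x)

# Ramp up gradually (5% -> 20% -> 50% -> 100%) while monitoring;
# on a bad metric, set CANARY_FRACTION back to 0 to roll back.
print(serve("query"))
```

In practice the routing decision is usually keyed on a stable hash of a user or request ID rather than a fresh random draw, so each user consistently sees one version.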
Blue/green deployment
- Old: blue version; new: green version.
- A router sends traffic to the blue version; deploying means switching the router over to green. Rollback is easy: just switch traffic back to blue.
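A minimal sketch of the blue/green switch (names illustrative): both versions stay running, the router reads a single pointer, and rollback is flipping the pointer back.

```python
# Both versions stay deployed; only the router's pointer changes.
deployments = {
    "blue": lambda x: "blue prediction for " + x,    # old version, kept warm
    "green": lambda x: "green prediction for " + x,  # new version
}
live = "blue"

def route(x):
    return deployments[live](x)

def switch_to(env):
    global live
    live = env               # instant cutover; switch_to("blue") rolls back

print(route("img.png"))      # served by blue
switch_to("green")           # cut traffic over to the new version
print(route("img.png"))      # served by green
```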
3. Monitoring
Metric types:
- Software metrics
- Input metrics
- Output metrics
Examples of metrics to track
Software metrics: Memory, compute, latency, throughput, server load
Input metrics:
- Avg input length
- Avg input volume
- Num missing values
- Avg image brightness
Output metrics:
- # times return "" (null)
- # times user redoes search
- # times user switches from speech to typing
- CTR (click-through rate)
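A minimal sketch of turning a few of these metrics into alarms; the metric names, record fields, and thresholds are illustrative, chosen from a healthy baseline period.

```python
from statistics import mean

# Hypothetical acceptable ranges, set by observing a healthy baseline.
THRESHOLDS = {
    "avg_input_length": (1.0, 30.0),    # e.g. seconds of audio
    "null_return_rate": (0.0, 0.05),    # fraction of "" responses
    "missing_value_rate": (0.0, 0.10),
}

def check_metrics(window):
    """Compare one window of logged requests against the thresholds."""
    metrics = {
        "avg_input_length": mean(r["input_len"] for r in window),
        "null_return_rate": mean(r["output"] == "" for r in window),
        "missing_value_rate": mean(r["missing"] for r in window),
    }
    for name, value in metrics.items():
        lo, hi = THRESHOLDS[name]
        if not lo <= value <= hi:
            print(f"ALERT: {name}={value:.3f} outside [{lo}, {hi}]")

# One logged request whose input is far longer than the baseline.
check_metrics([{"input_len": 45.0, "output": "", "missing": False}])
```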
Model maintenance
- Manual retraining (far more common in practice)
- Automatic retraining
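A minimal sketch of the distinction, assuming a hypothetical drift check: with automatic retraining the system kicks off a retrain job itself, while manual retraining only raises an alert for an engineer to review.

```python
def drift_detected(metrics):
    # Hypothetical check: has the input distribution shifted vs. baseline?
    return metrics["avg_brightness_delta"] > 0.2

def retrain_and_deploy():
    print("retraining on recent data and redeploying...")  # placeholder

def maintain(metrics, automatic=False):
    if not drift_detected(metrics):
        return
    if automatic:
        retrain_and_deploy()   # automatic retraining path
    else:
        print("ALERT: drift detected; queue a manual retrain and review")

maintain({"avg_brightness_delta": 0.3})  # manual path: alert only
```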
4. Pipeline monitoring
Many ML systems are pipelines of several modules (e.g., a VAD module that clips audio before it reaches a speech recognizer). A change in one module's output shifts the next module's input distribution, so track metrics at each stage of the pipeline, not just end to end.
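A minimal sketch of per-stage monitoring for a two-step pipeline; the modules and metrics below are illustrative stand-ins.

```python
def vad(audio):
    """Voice activity detection stand-in: keep only loud-enough segments."""
    return [seg for seg in audio if seg > 0.1]

def speech_recognizer(segments):
    """Recognizer stand-in: returns "" when nothing survives the VAD."""
    return "transcript" if segments else ""

def run_pipeline(audio):
    segments = vad(audio)
    text = speech_recognizer(segments)
    # One metric per stage: a drop in the VAD's kept fraction upstream
    # explains a rise in null transcripts downstream.
    print({"vad_kept_fraction": len(segments) / max(len(audio), 1),
           "null_transcript": text == ""})
    return text

run_pipeline([0.05, 0.2, 0.4])   # VAD keeps 2 of 3 segments
```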
5. Week 1 Optional References
If you wish to dive more deeply into the topics covered this week, feel free to check out these optional references. You won’t have to read these to complete this week’s practice quizzes.
Concept and Data Drift
Monitoring ML Models
A Chat with Andrew on MLOps: From Model-centric to Data-centric
Papers
Katsiapis, K., Karmarkar, A., Altay, A., Zaks, A., Polyzotis, N., … Li, Z. (2020). Towards ML Engineering: A brief history of TensorFlow Extended (TFX). http://arxiv.org/abs/2010.02013
Paleyes, A., Urma, R.-G., & Lawrence, N. D. (2020). Challenges in deploying machine learning: A survey of case studies. http://arxiv.org/abs/2011.09926
Sculley, D., Holt, G., Golovin, D., Davydov, E., & Phillips, T. (2015). Hidden technical debt in machine learning systems. Retrieved April 28, 2021, from https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf