Coursera - Machine Learning in Production - Week 1 - Section 3 - Deployment

January 3, 2025


Week 1: Overview of the ML Lifecycle and Deployment


Section 3: Deployment


1. Key challenges


Software engineering issues

Checklist of questions
  • Real-time or batch
  • Cloud vs. Edge/Browser
  • Compute resources (CPU/GPU/memory)
  • Latency, throughput (QPS)
  • Logging
  • Security and privacy

2. Deployment patterns


Common deployment cases
1. New product/capability
2. Automate/assist with manual task
3. Replace previous ML system

Key ideas:
  • Gradual ramp up with monitoring
  • Rollback

Shadow mode deployment


The ML system shadows the human and runs in parallel.
The ML system's output is not used for any decisions during this phase.
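
A minimal sketch of shadow mode, assuming a hypothetical `model` callable and a human decision supplied with each item (neither name comes from the course). The prediction is only logged, so it can later be compared with what the human decided:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ShadowRunner:
    """Runs a model next to a human decision without acting on its output."""
    model: Callable[[str], str]  # hypothetical model under evaluation
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def decide(self, item: str, human_decision: str) -> str:
        # The prediction is recorded for later comparison only.
        prediction = self.model(item)
        self.log.append((item, human_decision, prediction))
        # The human decision is what the system actually acts on.
        return human_decision

    def agreement_rate(self) -> float:
        # Fraction of logged items where the model matched the human.
        if not self.log:
            return 0.0
        matches = sum(1 for _, human, pred in self.log if human == pred)
        return matches / len(self.log)
```

The agreement rate collected this way is one possible input to the decision of whether the model is ready to take over any real decisions.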

Canary deployment

  • Roll out to small fraction (say 5%) of traffic initially.
  • Monitor system and ramp up traffic gradually.
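
One way to sketch the traffic split, assuming each request carries a string ID (the function name `route_to_canary` and the hashing scheme are illustrative choices, not something the course prescribes). Hashing the ID keeps routing deterministic, so raising the fraction only adds requests to the canary group:

```python
import hashlib


def route_to_canary(request_id: str, canary_fraction: float) -> bool:
    """Return True if this request should be served by the new model."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < canary_fraction


# Start with roughly 5% of traffic, then ramp up while monitoring.
ids = [f"req-{i}" for i in range(10_000)]
share = sum(route_to_canary(r, 0.05) for r in ids) / len(ids)
print(f"share of traffic on canary: {share:.3f}")
```

Setting the fraction back to 0 sends all traffic to the old system again, which is the rollback path.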

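Blue-green deployment
Old: Blue version
New: Green version
A router switches traffic from the blue version to the green version. Keeping the blue version running makes rollback easy.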
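
A minimal sketch of a blue-green switch, assuming both versions are plain callables held in memory (the class `BlueGreenRouter` and its method names are hypothetical). The old version stays loaded, so rolling back only changes which one is active:

```python
from typing import Callable, Dict


class BlueGreenRouter:
    """Sends all traffic to one of two versions and can switch back."""

    def __init__(self, blue: Callable[[str], str], green: Callable[[str], str]):
        self.versions: Dict[str, Callable[[str], str]] = {
            "blue": blue,    # old version
            "green": green,  # new version
        }
        self.active = "blue"

    def handle(self, request: str) -> str:
        # Every request goes to whichever version is currently active.
        return self.versions[self.active](request)

    def switch_to_green(self) -> None:
        self.active = "green"

    def rollback(self) -> None:
        # The blue version was never shut down, so this is immediate.
        self.active = "blue"
```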



3. Monitoring


Types of metrics:
  • Software metrics
  • Input metrics
  • Output metrics

Examples of metrics to track
Software metrics: Memory, compute, latency, throughput, server load

Input metrics:
  • Avg input length
  • Avg input volume
  • Num missing values
  • Avg image brightness
Output metrics:
  • # times the system returns "" (null)
  • # times user redoes search
  • # times user switches to typing
  • CTR (click-through rate)
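
A rough sketch of how a few of these metrics might be computed per batch and checked against alert ranges. The function names, the choice of text inputs with `None` for missing values, and the threshold values are all assumptions made for illustration:

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


def input_metrics(batch: List[Optional[str]]) -> Dict[str, float]:
    """Average input length and number of missing values in a batch."""
    present = [x for x in batch if x is not None]
    return {
        "avg_input_length": mean(len(x) for x in present) if present else 0.0,
        "num_missing": float(len(batch) - len(present)),
    }


def output_metrics(outputs: List[str]) -> Dict[str, float]:
    """Number of times the system returned an empty string."""
    return {"num_empty_outputs": float(sum(1 for o in outputs if o == ""))}


def out_of_range(metrics: Dict[str, float],
                 thresholds: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of metrics that fall outside their (low, high) range."""
    return sorted(
        name for name, (low, high) in thresholds.items()
        if name in metrics and not (low <= metrics[name] <= high)
    )


metrics = {**input_metrics(["hello", None, "hi!"]),
           **output_metrics(["", "play music", ""])}
# Hypothetical alert ranges; real ones would come from observed traffic.
alerts = out_of_range(metrics, {"num_missing": (0, 0),
                                "num_empty_outputs": (0, 5)})
print(metrics, alerts)
```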

Model maintenance

  • Manual retraining
  • Automatic retraining

4. Pipeline monitoring



5. Week 1 Optional References


Week 1: Overview of the ML Lifecycle and Deployment
If you wish to dive more deeply into the topics covered this week, feel free to check out these optional references. You won’t have to read these to complete this week’s practice quizzes.

Concept and Data Drift
Monitoring ML Models
A Chat with Andrew on MLOps: From Model-centric to Data-centric

Papers
Katsiapis, K., Karmarkar, A., Altay, A., Zaks, A., Polyzotis, N., … Li, Z. (2020). Towards ML Engineering: A brief history of TensorFlow Extended (TFX). http://arxiv.org/abs/2010.02013
Paleyes, A., Urma, R.-G., & Lawrence, N. D. (2020). Challenges in deploying machine learning: A survey of case studies. http://arxiv.org/abs/2011.09926
Sculley, D., Holt, G., Golovin, D., Davydov, E., & Phillips, T. (n.d.). Hidden technical debt in machine learning systems. Retrieved April 28, 2021, from https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf


Category: AI Tags: public
