Last-mile ML-PRO review: feature pipeline patterns, MLflow registry and promotion workflows, batch vs online deployment pickers, monitoring/drift decision rules, and governance essentials.
Use this for last‑mile review. Pair it with the Syllabus for coverage and Practice to validate production judgment.
```mermaid
flowchart LR
  FE["Feature pipeline"] --> TR["Train + evaluate"]
  TR --> RUN["MLflow run (params/metrics/artifacts)"]
  RUN --> REG["Registry version"]
  REG --> DEP["Deploy (batch/online)"]
  DEP --> MON["Monitor + drift"]
  MON -->|retrain| FE
```
Exam rule: if a solution lacks versioning, lineage, or rollback, it’s rarely correct.
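A minimal sketch of the run-to-registry handoff above, assuming scikit-learn and a reachable MLflow tracking server; the model name `churn_classifier`, the hyperparameters, and the synthetic data are placeholders standing in for the feature pipeline output.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the feature pipeline output.
X, y = make_classification(n_samples=2_000, n_features=20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

with mlflow.start_run() as run:
    params = {"n_estimators": 200, "max_depth": 8}
    model = RandomForestClassifier(**params, random_state=42).fit(X_train, y_train)
    mlflow.log_params(params)                                      # lineage: params
    mlflow.log_metric("f1", f1_score(y_val, model.predict(X_val)))  # lineage: metrics
    mlflow.sklearn.log_model(model, artifact_path="model")          # lineage: artifact

# Runs are for experiments; the registry holds the versioned release.
version = mlflow.register_model(f"runs:/{run.info.run_id}/model", "churn_classifier")
print(version.name, version.version)
```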
| Risk | Symptom | Mitigation |
|---|---|---|
| Training/serving skew | production metrics collapse | shared transforms; enforce schema |
| Leakage | unrealistically good offline metrics | time-aware splits; careful feature design |
| Drift | model degrades over time | monitor distributions and outcomes |
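A minimal sketch of the leakage and skew mitigations in the table above, assuming a pandas DataFrame with an `event_time` column; the toy data and cutoff date are illustrative.

```python
import pandas as pd

# Toy event log standing in for real feature data.
df = pd.DataFrame({
    "event_time": pd.date_range("2023-11-01", periods=120, freq="D"),
    "feature_a": range(120),
    "label": [i % 2 for i in range(120)],
})

cutoff = pd.Timestamp("2024-01-01")        # illustrative cutoff
train = df[df["event_time"] < cutoff]      # fit only on the past
valid = df[df["event_time"] >= cutoff]     # evaluate on strictly later data

# Fit transforms (scalers, encoders) on `train` only, then reuse the fitted
# objects for `valid` and for serving, so offline and online paths share code.
```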
| Concept | Why it matters |
|---|---|
| Registry versions | stable, auditable artifacts |
| Stage transitions | controlled promotion and rollback |
| Approval gates | reduce “accidental production” |
One-sentence heuristic: runs are for experiments, the registry is for releases.
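A minimal sketch of controlled promotion and rollback through the registry, using the classic stage-based API (newer MLflow versions favor model aliases instead); the model name and version numbers are placeholders.

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Promote version 7 to Production, archiving whatever was serving before.
client.transition_model_version_stage(
    name="churn_classifier",
    version=7,
    stage="Production",
    archive_existing_versions=True,
)

# Rollback is just another transition back to a known-good version.
client.transition_model_version_stage(
    name="churn_classifier",
    version=6,
    stage="Production",
)
```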
| Requirement | Prefer | Why |
|---|---|---|
| Low latency per request | Online serving | synchronous request/response path |
| High throughput scoring | Batch inference | cost-efficient |
| Model updates frequently | Managed rollout + rollback | reduce risk |
| Strict governance | Versioned registry releases | auditability |
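A minimal batch-scoring sketch, assuming a Databricks/Spark environment where `spark` is available and a registered `churn_classifier` model in the Production stage; table and column names are placeholders.

```python
import mlflow.pyfunc
from pyspark.sql import functions as F

# Load the registry release as a Spark UDF for high-throughput scoring.
predict = mlflow.pyfunc.spark_udf(spark, "models:/churn_classifier/Production")

feature_cols = ["feature_a", "feature_b"]  # placeholder feature column names

scored = (
    spark.table("ml.features_daily")
    .withColumn("prediction", predict(F.struct(*feature_cols)))
    .withColumn("scored_at", F.current_timestamp())
)
scored.write.mode("overwrite").saveAsTable("ml.churn_scores")
```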
| Observation | First question | Likely action |
|---|---|---|
| Gradual degradation | data drift? seasonality? | retrain / update features |
| Sudden drop | pipeline break? schema change? | rollback or fix upstream |
| Only one segment affected | sampling bias? | segment monitoring + targeted fix |
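A minimal per-feature drift check to back the decision rules above, assuming a stored training reference sample and a recent production sample as pandas DataFrames; the KS test, threshold, and synthetic data are illustrative.

```python
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp

ALERT_P_VALUE = 0.01  # illustrative threshold; tune per feature and traffic volume

def drifted_features(reference_df, current_df, columns):
    """Return (column, KS statistic) pairs whose distributions differ significantly."""
    flagged = []
    for col in columns:
        stat, p_value = ks_2samp(reference_df[col].dropna(), current_df[col].dropna())
        if p_value < ALERT_P_VALUE:
            flagged.append((col, stat))
    return flagged

# Synthetic example: the "amount" feature has shifted, "age" has not.
rng = np.random.default_rng(0)
reference = pd.DataFrame({"amount": rng.normal(50, 5, 5_000), "age": rng.normal(40, 10, 5_000)})
current = pd.DataFrame({"amount": rng.normal(60, 5, 5_000), "age": rng.normal(40, 10, 5_000)})
print(drifted_features(reference, current, ["amount", "age"]))
```

Pair distribution checks like this with outcome metrics and per-segment views before choosing between retraining and an upstream fix.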