Design AI model routing infrastructure.
### Signal to interviewer
I can design routing infrastructure that balances cost, quality, and safety through explicit policy governance.
### Clarify
I would clarify workload taxonomy, model inventory, compliance constraints, and SLO priorities across products.
### Approach
Use a policy-driven router with three planes: request classification, route selection, and post-route evaluation feedback. Support fallback and override paths.
### Metrics & instrumentation
Primary metric: quality-adjusted cost per successful request. Secondary metrics: route accuracy, fallback frequency, and model utilization efficiency. Guardrails: latency breaches, policy violations, and degraded-user-impact rate.
### Tradeoffs
Fine-grained routing improves optimization but increases debugging burden. Conservative routing simplifies reliability but may waste compute on easy tasks.
### Risks & mitigations
Risk: policy drift causing silent quality drops; mitigate with route-level canaries. Risk: black-box decisions reduce trust internally; mitigate with route explanations. Risk: rollback delays during incidents; mitigate with one-click policy rollback.
### Example
In a multilingual assistant, low-risk translation routes to efficient models, while regulated finance prompts route to higher-governance models with stricter checks.
### 90-second version
Design routing as a policy system, not hardcoded if-else logic. Optimize quality-adjusted cost, keep route decisions observable, and make fallback/rollback safe and fast.
- Which request attributes are mandatory for route selection at launch?
- How do you define quality-adjusted success for routing decisions?
- How would you implement route explainability for debugging and audits?
- What policy rollback mechanism would you use during active incidents?