Design the architecture for ChatGPT.
### Signal to interviewer
I can design a ChatGPT architecture with clear layer boundaries, so that capability growth does not break reliability or safety.
### Clarify
I would clarify workload mix, multimodal scope, enterprise requirements, and global latency targets.
### Approach
Use a layered capability architecture with four layers: interaction, orchestration, model intelligence, and governance. Each layer has an explicit SLA and defined failure behavior.
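
One way to make "explicit SLAs and failure behavior" concrete is to write each layer's contract down as data. The sketch below is illustrative only: `LayerContract`, `FailureMode`, the latency budgets, and the choice of which layers fail open are all assumptions, not figures from any real system.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    FAIL_OPEN = "fail_open"      # request may continue without this layer's result
    FAIL_CLOSED = "fail_closed"  # request stops if this layer fails


@dataclass(frozen=True)
class LayerContract:
    name: str
    latency_budget_ms: int       # hypothetical per-layer latency SLA
    failure_mode: FailureMode


# Illustrative values only; every number and mode here is an assumption.
CONTRACTS = [
    LayerContract("interaction", 100, FailureMode.FAIL_CLOSED),
    LayerContract("orchestration", 300, FailureMode.FAIL_OPEN),
    LayerContract("model_intelligence", 2000, FailureMode.FAIL_CLOSED),
    LayerContract("governance", 150, FailureMode.FAIL_CLOSED),
]


def total_latency_budget_ms(contracts):
    """Sum of per-layer budgets, assuming layers run sequentially."""
    return sum(c.latency_budget_ms for c in contracts)


def may_continue_after_failure(contract):
    """True if a failure in this layer should not abort the request."""
    return contract.failure_mode is FailureMode.FAIL_OPEN
```

Keeping the contracts in one table makes it easy to check in review that the per-layer budgets add up to the end-to-end target, and that governance is never silently bypassed.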
### Metrics & instrumentation
Primary metric: successful user task completion. Secondary metrics: tail latency (for example, p95 and p99), tool-call success rate, and retrieval usefulness. Guardrails: policy violation rate, outage impact, and abuse detection precision.
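
As a rough sketch of how some of these metrics could be computed from per-request records, assuming a hypothetical `RequestRecord` shape (the field names are invented for illustration) and a nearest-rank percentile for tail latency:

```python
import math
from dataclasses import dataclass


@dataclass
class RequestRecord:
    # Hypothetical per-request log record; field names are assumptions.
    task_completed: bool
    latency_ms: float
    tool_calls: int
    tool_calls_succeeded: int
    policy_violation: bool


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Aggregate primary, secondary, and guardrail metrics for a batch."""
    n = len(records)
    total_tool_calls = sum(r.tool_calls for r in records)
    succeeded = sum(r.tool_calls_succeeded for r in records)
    return {
        "task_completion_rate": sum(r.task_completed for r in records) / n,
        "p99_latency_ms": percentile([r.latency_ms for r in records], 99),
        # None when no request in the batch used a tool.
        "tool_call_success_rate": succeeded / total_tool_calls if total_tool_calls else None,
        "policy_violation_rate": sum(r.policy_violation for r in records) / n,
    }
```

Retrieval usefulness and abuse detection precision are left out because they need labeled judgments rather than simple counters.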
### Tradeoffs
Deep orchestration improves answer quality but increases latency and complexity. Unified models simplify operations but may underperform on specialized workloads.
### Risks & mitigations
Risk: cascading failures across tools; mitigate with circuit breakers. Risk: routing drift, where routing quality degrades as traffic and models change; mitigate with continuous evaluation. Risk: policy inconsistency across regions; mitigate with centralized policy config and localized enforcement checks.
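
A minimal sketch of the circuit-breaker mitigation, assuming a simple consecutive-failure threshold and a fixed reset timeout. The class name, thresholds, and fallback convention are illustrative; a production system would more likely use an established resilience library than hand-rolled code.

```python
import time


class CircuitBreaker:
    """Stops calling a failing tool until a cool-down period has passed."""

    def __init__(self, failure_threshold=3, reset_timeout_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock          # injectable so tests can control time
        self._failures = 0
        self._opened_at = None

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout_s:
            return "half_open"       # allow one trial call through
        return "open"

    def call(self, fn, fallback):
        if self.state == "open":
            return fallback()        # short-circuit: do not touch the tool
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        # Success closes the breaker and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping each tool call in its own breaker keeps one failing dependency from consuming the latency budget of every request that touches it.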
### Example
A complex legal query routes through retrieval + reasoning + policy checks, while a simple definition request uses a fast direct model path.
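
The routing decision in this example could be sketched as follows. The keyword list and word-count threshold are stand-ins chosen for illustration; a real router would more plausibly use a learned classifier, and `Route`, `choose_route`, and `HIGH_STAKES_TERMS` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    stages: tuple


# Stage lists mirror the example: a direct model path versus
# retrieval + reasoning + policy checks.
FAST_PATH = Route("fast_direct", ("model",))
DEEP_PATH = Route("deep", ("retrieval", "reasoning", "policy_check"))

# Hypothetical keyword heuristic standing in for a real complexity classifier.
HIGH_STAKES_TERMS = {"contract", "liability", "lawsuit", "statute", "compliance"}


def choose_route(query: str, max_fast_words: int = 12) -> Route:
    """Pick the deep path for high-stakes or long queries, else the fast path."""
    words = [w.strip(".,?!").lower() for w in query.split()]
    if any(w in HIGH_STAKES_TERMS for w in words):
        return DEEP_PATH
    if len(words) > max_fast_words:
        return DEEP_PATH
    return FAST_PATH
```

For instance, `choose_route("Can my landlord terminate the contract early without liability?")` selects the deep path, while a short definition request falls through to the fast path.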
### 90-second version
Design ChatGPT in layers with explicit contracts. Route adaptively by task complexity, instrument every stage, and keep governance first-class so performance and safety scale together.
### Follow-up questions
- What workload mix assumptions drive your initial routing strategy?
- Which capabilities must be globally consistent versus region-specific?
- How would you implement fail-open versus fail-closed behavior per layer?
- What telemetry schema would you use to trace end-to-end request outcomes?