system-design-interrogation


System Design Interrogation

The Problem

Rushing to implementation without systematic design thinking leads to:
  • Scalability issues discovered too late
  • Security holes from missing tenant isolation
  • Data model mismatches
  • Frontend/backend contract conflicts
  • Poor user experience

The Solution: Question Before Implementing

┌────────────────────────────────────────────────────────────────────────────┐
│                    SYSTEM DESIGN INTERROGATION                             │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                            │
│                        ┌─────────────┐                                     │
│                        │   FEATURE   │                                     │
│                        │   REQUEST   │                                     │
│                        └──────┬──────┘                                     │
│                               │                                            │
│    ┌──────────────────────────┼──────────────────────────┐                │
│    │                          │                          │                │
│    ▼                          ▼                          ▼                │
│  ┌────────┐             ┌────────┐              ┌────────┐               │
│  │ SCALE  │             │  DATA  │              │SECURITY│               │
│  └───┬────┘             └───┬────┘              └───┬────┘               │
│      │                      │                       │                     │
│  • Users?               • Where?               • Who access?              │
│  • Volume?              • Pattern?             • Isolation?               │
│  • Growth?              • Search?              • Attacks?                 │
│      │                      │                       │                     │
│      └──────────────────────┼───────────────────────┘                     │
│                             │                                             │
│    ┌────────────────────────┼────────────────────────┐                   │
│    │                        │                        │                   │
│    ▼                        ▼                        ▼                   │
│  ┌────────┐           ┌──────────┐            ┌────────┐                │
│  │   UX   │           │COHERENCE │            │ TRADE- │                │
│  └───┬────┘           └────┬─────┘            │  OFFS  │                │
│      │                     │                  └───┬────┘                │
│  • Latency?           • Contracts?           • Speed?                    │
│  • Feedback?          • Types?               • Quality?                  │
│  • Errors?            • API?                 • Cost?                     │
│      │                     │                      │                      │
│      └─────────────────────┴──────────────────────┘                      │
│                             │                                             │
│                             ▼                                             │
│                     ┌───────────────┐                                    │
│                     │ IMPLEMENTATION│                                    │
│                     │    READY      │                                    │
│                     └───────────────┘                                    │
│                                                                            │
└────────────────────────────────────────────────────────────────────────────┘

The Five Dimensions

1. Scale

Key Questions:
  • How many users/tenants will use this?
  • What's the expected data volume (now and in 1 year)?
  • What's the request rate? Read-heavy or write-heavy?
  • Does complexity grow linearly or exponentially with data?
  • What happens at 10x current load? 100x?
OrchestKit Example:
Feature: "Add document tagging"
- Users: 1000 active users
- Documents per user: ~50 average
- Tags per document: 3-5
- Total tags: 50,000 → 500,000
- Access: Read-heavy (10:1 read:write)
- Search: Need tag autocomplete (prefix search)
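The numbers in this example can be turned into a quick back-of-the-envelope estimate. The sketch below is illustrative only: the `ScaleInputs` and `estimateScale` names are hypothetical, and the inputs are copied from the example. Under these inputs, the 50,000 → 500,000 figure matches the document count at 1x and 10x load, while the number of document-tag links would be 3 to 5 times higher.

```typescript
// Inputs come from the tagging example; the field names are illustrative.
interface ScaleInputs {
  activeUsers: number;
  docsPerUser: number;
  tagsPerDocMin: number;
  tagsPerDocMax: number;
}

interface ScaleEstimate {
  documents: number;
  tagLinksMin: number;
  tagLinksMax: number;
}

// Estimate row counts at a given load multiplier (1 = today, 10 = "10x load").
function estimateScale(inputs: ScaleInputs, loadMultiplier: number = 1): ScaleEstimate {
  const documents = inputs.activeUsers * inputs.docsPerUser * loadMultiplier;
  return {
    documents,
    tagLinksMin: documents * inputs.tagsPerDocMin,
    tagLinksMax: documents * inputs.tagsPerDocMax,
  };
}

const taggingInputs: ScaleInputs = {
  activeUsers: 1000,
  docsPerUser: 50,
  tagsPerDocMin: 3,
  tagsPerDocMax: 5,
};

const today = estimateScale(taggingInputs);
const atTenTimes = estimateScale(taggingInputs, 10);

console.log(today);      // documents: 50000, tagLinksMin: 150000, tagLinksMax: 250000
console.log(atTenTimes); // documents: 500000, tagLinksMin: 1500000, tagLinksMax: 2500000
```

Writing the estimate down this way makes the "What happens at 10x?" question a one-argument change rather than a guess.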

2. Data

Key Questions:
  • Where does this data naturally belong?
  • What's the primary access pattern?
  • Is it master data or transactional?
  • What's the retention policy?
  • Does it need to be searchable? How?
OrchestKit Example:
Feature: "Add document tagging"
- Data: Tags belong WITH documents (denormalized) or separate table?
- Pattern: Get tags for document (by doc_id), get documents by tag
- Storage: PostgreSQL (relational) or add to document JSON?
- Search: Full-text for tag names, filter by tag for documents
- Decision: Separate `tags` table with many-to-many join
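To make the decision concrete, here is a minimal in-memory sketch of a separate tag collection with a many-to-many link to documents. It is not the OrchestKit schema: `TagIndex` and its method names are hypothetical, and a real implementation would presumably be PostgreSQL tables with a join table. The point is only to show that both access patterns from the example (tags for a document, documents for a tag) are served by keeping the link in both directions.

```typescript
type UUID = string;

interface Tag {
  id: UUID;
  name: string;
}

// Return the set stored under `key`, creating it on first use.
function bucket(map: Map<UUID, Set<UUID>>, key: UUID): Set<UUID> {
  let set = map.get(key);
  if (!set) {
    set = new Set<UUID>();
    map.set(key, set);
  }
  return set;
}

// In-memory stand-in for a `tags` table plus a document/tag join table.
class TagIndex {
  private readonly tags = new Map<UUID, Tag>();
  private readonly tagIdsByDocument = new Map<UUID, Set<UUID>>();
  private readonly documentIdsByTag = new Map<UUID, Set<UUID>>();

  addTag(tag: Tag): void {
    this.tags.set(tag.id, tag);
  }

  // Equivalent to inserting one join-table row; indexed in both directions.
  link(documentId: UUID, tagId: UUID): void {
    if (!this.tags.has(tagId)) throw new Error(`unknown tag: ${tagId}`);
    bucket(this.tagIdsByDocument, documentId).add(tagId);
    bucket(this.documentIdsByTag, tagId).add(documentId);
  }

  // Access pattern 1: get tags for a document.
  tagsForDocument(documentId: UUID): Tag[] {
    const ids = this.tagIdsByDocument.get(documentId);
    if (!ids) return [];
    const result: Tag[] = [];
    ids.forEach((id) => {
      const tag = this.tags.get(id);
      if (tag) result.push(tag);
    });
    return result;
  }

  // Access pattern 2: get documents by tag.
  documentsWithTag(tagId: UUID): UUID[] {
    const ids = this.documentIdsByTag.get(tagId);
    return ids ? Array.from(ids) : [];
  }
}
```

In a relational store the two maps would correspond to two indexes on the join table, one per lookup direction.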

3. Security

Key Questions:
  • Who can access this data/feature?
  • How is tenant isolation enforced?
  • What happens if authorization fails?
  • What attack vectors does this introduce?
  • Is there PII involved?
OrchestKit Example:
Feature: "Add document tagging"
- Access: User can only see/manage their own tags
- Isolation: All tag queries MUST include tenant_id filter
- AuthZ: Check user owns document before tagging
- Attacks: Tag injection? Limit tag length, sanitize input
- PII: Tags might contain PII → treat as sensitive
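One way to honor the "MUST include tenant_id filter" rule is to make the unfiltered data unreachable from feature code. The sketch below assumes an in-memory array standing in for the database; `TenantScopedTags`, `normalizeTagName`, and the 50-character limit are hypothetical choices for illustration, not values from the source.

```typescript
interface TagRow {
  id: string;
  tenantId: string;
  ownerId: string;
  name: string;
}

// Assumed limit; the example only says "limit tag length".
const MAX_TAG_LENGTH = 50;

// Trim, collapse whitespace, and reject names that are empty, too long,
// or contain control characters or angle brackets.
function normalizeTagName(raw: string): string {
  const name = raw.trim().replace(/\s+/g, " ");
  if (name.length === 0) throw new Error("tag name is empty");
  if (name.length > MAX_TAG_LENGTH) throw new Error("tag name is too long");
  if (/[\u0000-\u001f\u007f<>]/.test(name)) {
    throw new Error("tag name contains unsupported characters");
  }
  return name;
}

// Every read goes through list(), so the tenant filter cannot be forgotten.
class TenantScopedTags {
  private readonly rows: TagRow[];
  private readonly tenantId: string;

  constructor(rows: TagRow[], tenantId: string) {
    this.rows = rows;
    this.tenantId = tenantId;
  }

  list(): TagRow[] {
    return this.rows.filter((row) => row.tenantId === this.tenantId);
  }

  // Returns undefined for ids that exist only in another tenant.
  findById(id: string): TagRow | undefined {
    return this.list().filter((row) => row.id === id)[0];
  }
}
```

A tenant isolation test then reduces to asserting that `findById` returns `undefined` for another tenant's tag id.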

4. UX Impact

Key Questions:
  • What's the expected latency for this operation?
  • What feedback does the user get during the operation?
  • What happens on failure? Can they retry?
  • Is optimistic UI possible?
  • How does this affect the overall workflow?
OrchestKit Example:
Feature: "Add document tagging"
- Latency: < 100ms for add/remove tag
- Feedback: Optimistic update, show tag immediately
- Failure: Rollback tag, show error toast
- Optimistic: Yes - add tag to UI before server confirms
- Workflow: Tags should be inline editable, no modal
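The optimistic update and rollback described here can be sketched as a small function. This is a framework-agnostic illustration: `TagUiState`, `addTagOptimistically`, and the injected `saveTag` callback are hypothetical names, and a real UI would route the error message to a toast component.

```typescript
interface Tag {
  id: string;
  name: string;
}

interface TagUiState {
  tags: Tag[];
  error: string | null;
}

// Show the tag immediately, then remove it again if the server call fails.
async function addTagOptimistically(
  state: TagUiState,
  tag: Tag,
  saveTag: (tag: Tag) => Promise<void>,
): Promise<void> {
  state.tags = state.tags.concat(tag); // optimistic: update before the server confirms
  state.error = null;
  try {
    await saveTag(tag);
  } catch (err) {
    // Roll back only this tag, so other updates made in the meantime survive.
    state.tags = state.tags.filter((t) => t.id !== tag.id);
    state.error = err instanceof Error ? err.message : "Could not add tag";
  }
}
```

Removing the tag by id, rather than restoring a snapshot of the whole list, is a deliberate choice here: a snapshot restore could discard tags added while the request was in flight.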

5. Coherence

Key Questions:
  • Which layers does this touch?
  • What contracts/interfaces change?
  • Are types consistent frontend ↔ backend?
  • Does this break existing clients?
  • How does this affect the API?
OrchestKit Example:
Feature: "Add document tagging"
- Layers: DB → Backend API → Frontend UI → State
- Contracts: Document type needs `tags: Tag[]` field
- Types: Tag = { id: UUID, name: string, color?: string }
- Breaking: No - additive change to Document response
- API: POST /documents/{id}/tags, DELETE /documents/{id}/tags/{tag_id}
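The contract in this example can be written down as shared types plus a runtime check at the API boundary. The `Tag` shape is taken from the example; `DocumentResponse`, its `title` field, `isTag`, and `parseTags` are hypothetical names added for illustration.

```typescript
type UUID = string;

// Shape taken from the example: Tag = { id: UUID, name: string, color?: string }
interface Tag {
  id: UUID;
  name: string;
  color?: string;
}

// Hypothetical response type; `tags` is the additive field, `title` is illustrative.
interface DocumentResponse {
  id: UUID;
  title: string;
  tags: Tag[];
}

// Runtime guard so the frontend does not trust an arbitrary response shape.
function isTag(value: unknown): value is Tag {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["id"] === "string" &&
    typeof v["name"] === "string" &&
    (v["color"] === undefined || typeof v["color"] === "string")
  );
}

// Validate a `tags` payload, failing loudly on the first bad entry.
function parseTags(payload: unknown): Tag[] {
  if (!Array.isArray(payload)) throw new Error("expected an array of tags");
  return payload.map((item: unknown, i: number) => {
    if (!isTag(item)) throw new Error(`tags[${i}] is not a valid Tag`);
    return item;
  });
}
```

Because `color` is optional and `tags` is a new field on an existing response, older clients that ignore unknown fields should keep working, which is what makes the change additive.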

The Process

Before Writing Any Code

  1. State the Feature - One sentence description
  2. Run Through 5 Dimensions - Answer key questions for each
  3. Identify Trade-offs - Speed vs quality, complexity vs flexibility
  4. Document Decisions - Record answers in design doc or issue
  5. Review with Team - Get alignment before implementing

Quick Assessment Template

Feature: [Name]

Scale

  • Users:
  • Data volume:
  • Access pattern:
  • Growth projection:

Data

  • Storage location:
  • Schema changes:
  • Search requirements:
  • Retention:

Security

  • Authorization:
  • Tenant isolation:
  • Attack surface:
  • PII handling:

UX

  • Target latency:
  • Feedback mechanism:
  • Error handling:
  • Optimistic updates:

Coherence

  • Affected layers:
  • Type changes:
  • API changes:
  • Breaking changes:

Decision

[Final approach with rationale]
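If the template is stored as structured data rather than free text, a small check can flag fields that were left blank before review. This is an optional sketch, not part of the skill: the `Assessment` type and `unansweredFields` function are hypothetical, and the field keys are whatever the team chooses.

```typescript
type Dimension = "scale" | "data" | "security" | "ux" | "coherence";

// Hypothetical structured form of the template.
interface Assessment {
  feature: string;
  dimensions: Record<Dimension, Record<string, string>>;
  decision: string;
}

// List every field that is still blank, as "dimension.field" paths.
function unansweredFields(assessment: Assessment): string[] {
  const missing: string[] = [];
  if (assessment.feature.trim() === "") missing.push("feature");
  for (const dimension of Object.keys(assessment.dimensions) as Dimension[]) {
    const answers = assessment.dimensions[dimension];
    for (const field of Object.keys(answers)) {
      if ((answers[field] ?? "").trim() === "") missing.push(`${dimension}.${field}`);
    }
  }
  if (assessment.decision.trim() === "") missing.push("decision");
  return missing;
}
```

An empty result means every question has at least some answer; it says nothing about whether the answers are good, which is what the team review is for.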

Integration with OrchestKit Workflow

In Brainstorming Phase

Before implementation, run the system design interrogation:
/brainstorm → System Design Questions → Implementation Plan

In Code Review

Reviewer should verify:
  • Scale considerations documented
  • Security layer covered
  • Types consistent across stack
  • UX states handled

In Testing

Tests should cover:
  • Scale: Load tests for expected volume
  • Security: Tenant isolation tests
  • Coherence: Integration tests across layers
  • UX: Error state tests

Anti-Patterns

❌ "I'll add an index later if it's slow"
   → Ask: What's the expected query pattern NOW?

❌ "We can add tenant filtering in a future PR"
   → Ask: How is isolation enforced from DAY ONE?

❌ "The frontend can handle any response shape"
   → Ask: What's the TypeScript type for this?

❌ "Users won't do that"
   → Ask: What's the attack vector? What if they DO?

❌ "It's just a small feature"
   → Ask: How does this grow with 100x users?

Quick Reference Card

| Dimension | Key Question | Red Flag |
|-----------|--------------|----------|
| Scale | How many? | "All users" |
| Data | Where stored? | "I'll figure it out" |
| Security | Who can access? | "Everyone" |
| UX | What's the latency? | "It'll be fast" |
| Coherence | What types change? | "No changes needed" |

Version: 1.0.0 (December 2025)

Related Skills

  • brainstorming
    - Transform rough ideas into designs before applying system design interrogation
  • architecture-decision-record
    - Document key decisions discovered during interrogation
  • explore
    - Deep codebase exploration to understand existing architecture before planning
  • verify
    - Comprehensive feature verification after implementation

Key Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Dimensions count | Five (Scale, Data, Security, UX, Coherence) | Covers all critical architectural concerns without overlap |
| Process timing | Before any code | Prevents costly rework from missed requirements |
| Question format | Structured templates | Ensures consistent coverage, prevents omissions |
| Documentation | Markdown template | Portable, version-controlled, reviewable |
| Integration | Pairs with brainstorming | Brainstorming explores options, interrogation validates choice |

Capability Details

scale-assessment

Keywords: scale, load, traffic, users, concurrent, throughput
Solves:
  • How many users will this feature serve?
  • What's the expected request rate?
  • How does this scale with data growth?

data-architecture

Keywords: data, storage, database, schema, migration, structure
Solves:
  • Where should this data live?
  • What's the access pattern?
  • How does this affect existing schemas?

security-considerations

Keywords: security, auth, permission, tenant, isolation, attack
Solves:
  • What are the security implications?
  • How is tenant isolation maintained?
  • What attack vectors exist?

coherence-validation

Keywords: coherence, consistency, contract, interface, integration
Solves:
  • How does this fit the existing architecture?
  • What contracts need updating?
  • Are frontend/backend aligned?

ux-impact

Keywords: ux, user experience, latency, feedback, error
Solves:
  • What's the user experience impact?
  • How long will users wait?
  • What feedback do they get?