senior-system-architecture
Senior System Architecture
When to Use
- Define or review cross-service/system boundaries and contracts
- Compare architectural options with explicit trade-offs and NFR impact
- Author or review ADRs for decisions that are hard to reverse
- Run architecture review before major launches or vendor commitments
- Plan migration off legacy systems (strangler, parallel run, cutover)
- Establish architecture principles, standards, and exception process
When NOT to Use
- Single-team service RFC and implementation slices → senior-software-engineer
- Lakehouse, mesh, or enterprise data modeling → data-architect
- VPC, Kubernetes, Terraform, or CI/CD build → infrastructure-engineer, devops
- IDP, golden paths, Backstage → platform-engineer
- Milestones, RAID, steering status → technical-program-manager
- Release cutover tactics only → deployment-strategist
- Security control catalog and enterprise GRC → cybersecurity
- LLM/RAG system design → ai-engineer
- Business strategy, issue trees, steerCo cases → business-consultant
- Applied AI / LLM solution architecture → applied-ai-architect-commercial-enterprise
- Customer-facing solution design, RFP/RFI, PoC scope, deal integration → solutions-architect
Related skills
| Need | Skill |
|---|---|
| Service-level RFC and code review | |
| Data domain and governance architecture | |
| Cloud solution and migration architecture | |
| Enterprise cloud governance and landing zones | |
| Cloud/network/IaC delivery | |
| Platform-as-product and golden paths | |
| Multi-team launch coordination | |
| Rollout and rollback planning | |
| Security architecture and controls | |
| Requirements and business constraints | |
| Strategy, business case, operating model | |
| AI/LLM solution patterns | |
| Commercial/enterprise applied AI architecture | |
| Customer deal solution, RFP, PoC handoff | |
Core Workflows
1. Frame the decision
Capture before drawing boxes:
- Business outcome and measurable success criteria
- Constraints: budget, timeline, compliance, existing estate
- Non-goals (explicit scope cuts)
- Stakeholders and decision owner
- Reversibility: one-way door vs two-way door
One-way doors require ADR + architecture review. Two-way doors can stay in team RFC.
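The framing checklist and the door rule above can be sketched as a small record that reports what is still missing and which process applies. This is a minimal illustration, not part of the skill itself: the class name `DecisionFrame`, its field names, and the choice to default to the stricter one-way-door path are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionFrame:
    """Hypothetical record of the items to capture before drawing boxes."""
    outcome: str = ""
    success_criteria: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    non_goals: list[str] = field(default_factory=list)
    stakeholders: list[str] = field(default_factory=list)
    decision_owner: str = ""
    # Assumption: treat a decision as a one-way door until shown otherwise.
    one_way_door: bool = True

    def gaps(self) -> list[str]:
        """Return the names of framing items that are still empty."""
        checked = (
            "outcome", "success_criteria", "constraints",
            "non_goals", "stakeholders", "decision_owner",
        )
        return [name for name in checked if not getattr(self, name)]

    def required_process(self) -> str:
        """One-way doors need ADR + architecture review; two-way doors stay in team RFC."""
        return "ADR + architecture review" if self.one_way_door else "team RFC"


frame = DecisionFrame(outcome="Cut checkout latency", decision_owner="payments lead")
print(frame.gaps())              # framing items still to fill in
print(frame.required_process())  # process implied by reversibility
```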
2. Model the system (C4-lite)
Minimum views for reviewers:
- Context — actors, external systems, trust zones
- Containers — deployable units, data stores, queues, who owns each
- Critical path — sequence diagram for highest-risk flows only
Label every arrow: sync/async, protocol, auth model, and failure behavior.
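One way to enforce the labeling rule is to keep arrows as data and flag any that lack one of the four labels. The sketch below assumes a hypothetical `Arrow` record; the service names, protocols, and failure descriptions are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The four labels every arrow should carry, per the rule above.
REQUIRED_LABELS = ("mode", "protocol", "auth", "on_failure")


@dataclass
class Arrow:
    """Hypothetical record for one arrow on a container diagram."""
    source: str
    target: str
    mode: Optional[str] = None        # "sync" or "async"
    protocol: Optional[str] = None    # e.g. "HTTPS/JSON", "Kafka"
    auth: Optional[str] = None        # e.g. "OIDC token", "mTLS"
    on_failure: Optional[str] = None  # e.g. "retry then fail closed"


def unlabeled(arrows: list[Arrow]) -> dict[str, list[str]]:
    """Map 'source -> target' to the labels that arrow is missing."""
    gaps = {}
    for arrow in arrows:
        missing = [name for name in REQUIRED_LABELS if not getattr(arrow, name)]
        if missing:
            gaps[f"{arrow.source} -> {arrow.target}"] = missing
    return gaps


diagram = [
    Arrow("checkout", "payments", "sync", "HTTPS/JSON", "OIDC token", "retry then fail closed"),
    Arrow("checkout", "ledger", mode="async", protocol="Kafka"),
]
print(unlabeled(diagram))  # {'checkout -> ledger': ['auth', 'on_failure']}
```

A check like this could run in review tooling so that incomplete diagrams are caught before the architecture review rather than during it.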
See references/integration_patterns.md.
3. Define NFRs and quality attributes
For each capability, specify targets (not vague "high availability"):
| Attribute | Example target | Verification |
|---|---|---|
| Availability | 99.9% monthly | SLO, error budget |
| Latency | p99 < 300ms read | Load test + prod SLO |
| Throughput | 5k RPS peak | Capacity model |
| Durability | RPO 1h, RTO 4h | DR drill |
| Security | mTLS east-west, OIDC | Threat model link |
| Cost | <$X / 1M requests | FinOps estimate |
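To make the availability row concrete, the error budget implied by a target can be computed directly. The sketch below assumes a 30-day window and counts only full unavailability; the function names are hypothetical.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of full unavailability the SLO tolerates over the window."""
    if not 0 < slo_percent < 100:
        raise ValueError("slo_percent must be between 0 and 100 (exclusive)")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_consumed(downtime_minutes: float, slo_percent: float,
                    window_days: int = 30) -> float:
    """Fraction of the error budget already spent (values above 1.0 mean the SLO is breached)."""
    return downtime_minutes / error_budget_minutes(slo_percent, window_days)


# 99.9% over an assumed 30-day month leaves 43.2 minutes of budget.
print(f"{error_budget_minutes(99.9):.1f} minutes")
# A single 10-minute outage spends roughly 23% of that budget.
print(f"{budget_consumed(10, 99.9):.0%} of budget used")
```

Stating the budget in minutes makes it easier to judge whether a target is realistic given deploy frequency and dependency SLOs.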
See references/nfr_quality_attributes.md.
4. Evaluate options
Present at least two viable options plus "do nothing / minimal change":
| Criterion | Weight | Option A | Option B |
|---|---|---|---|
| Time to value | | | |
| Operational burden | | | |
| Scalability headroom | | | |
| Team skill fit | | | |
| Vendor lock-in | | | |
| Security/compliance fit | | | |
Recommend one; document rejected options and why.
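The matrix above can be filled in and totaled with a simple weighted average. This is one possible scoring scheme, not a prescribed one: the 1 to 5 scale (higher is better, so a high "Operational burden" score means low burden), the weights, and the sample scores are all illustrative assumptions.

```python
def weighted_scores(weights: dict[str, float],
                    options: dict[str, dict[str, float]]) -> dict[str, float]:
    """Return the weight-normalized score per option; every option must score every criterion."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    results = {}
    for name, scores in options.items():
        missing = set(weights) - set(scores)
        if missing:
            raise ValueError(f"{name} is missing scores for: {sorted(missing)}")
        results[name] = sum(weights[c] * scores[c] for c in weights) / total_weight
    return results


# Hypothetical weights and 1-5 scores; include the "do nothing" baseline.
weights = {"Time to value": 3, "Operational burden": 2, "Vendor lock-in": 1}
options = {
    "Option A": {"Time to value": 5, "Operational burden": 2, "Vendor lock-in": 3},
    "Option B": {"Time to value": 3, "Operational burden": 4, "Vendor lock-in": 4},
    "Do nothing": {"Time to value": 1, "Operational burden": 5, "Vendor lock-in": 5},
}
for name, score in sorted(weighted_scores(weights, options).items(),
                          key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

A score like this supports the recommendation but does not replace it; close totals usually mean the weights deserve more debate than the arithmetic.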
See references/adr_template.md.
5. Architecture review
Before build or contract signature:
- Problem and constraints restated in one paragraph
- Diagrams current; contracts versioned (OpenAPI, event schema)
- Failure modes: partial outage, dependency down, poison messages
- Data: ownership, retention, PII, migration path
- Observability: golden signals per container
- Security: authz boundaries, secrets, blast radius
- Rollout and rollback linked to deployment-strategist if multi-phase
See references/architecture_review.md.
6. Evolution and migration
For legacy replacement:
- Identify capability slices that can move independently
- Prefer strangler over big-bang when risk is high
- Define parity criteria before cutover
- Plan dual-write / dual-read duration and reconciliation
- Deprecate old path with telemetry proving zero traffic
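Two of the steps above lend themselves to a short sketch: reconciling records during a dual-write period, and checking telemetry for zero traffic before retiring the old path. The record shape (dicts keyed by ID), the function names, and the 14-day quiet period are assumptions for illustration.

```python
def reconcile(legacy: dict[str, dict], new: dict[str, dict]) -> dict[str, list[str]]:
    """Compare two keyed snapshots taken during dual-write and report differences."""
    legacy_keys, new_keys = set(legacy), set(new)
    return {
        "missing_in_new": sorted(legacy_keys - new_keys),
        "unexpected_in_new": sorted(new_keys - legacy_keys),
        "mismatched": sorted(k for k in legacy_keys & new_keys if legacy[k] != new[k]),
    }


def safe_to_retire(daily_request_counts: list[int], quiet_days: int = 14) -> bool:
    """True only if the most recent quiet_days all show zero requests on the old path."""
    recent = daily_request_counts[-quiet_days:]
    return len(recent) == quiet_days and all(count == 0 for count in recent)


legacy = {"o-1": {"total": 100}, "o-2": {"total": 250}, "o-3": {"total": 75}}
new = {"o-1": {"total": 100}, "o-2": {"total": 205}, "o-4": {"total": 10}}
print(reconcile(legacy, new))
# {'missing_in_new': ['o-3'], 'unexpected_in_new': ['o-4'], 'mismatched': ['o-2']}

print(safe_to_retire([12, 3, 0, 0, 0], quiet_days=3))  # True
```

In practice the parity criteria defined before cutover would decide which mismatches are tolerable; this sketch treats any difference as a finding.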
See references/migration_evolution.md.
When to load references
- ADR format and bar → references/adr_template.md
- Review checklist → references/architecture_review.md
- Sync, events, sagas, APIs → references/integration_patterns.md
- SLOs, capacity, DR, cost → references/nfr_quality_attributes.md
- Strangler and cutover → references/migration_evolution.md