backend-principle-eng-python-pro-max

Backend Principle Eng Python Pro Max

Principal-level guidance for Python backend systems in product companies. Emphasizes correctness, reliability, and operational excellence.

When to Apply

  • Designing or refactoring Python services, APIs, and data pipelines
  • Reviewing PRs for correctness, reliability, performance, and security
  • Planning migrations, scalability, or cost optimizations
  • Incident follow-ups and systemic fixes

Priority Model (highest to lowest)

| Priority | Category | Goal | Signals |
|---|---|---|---|
| 1 | Correctness & Contracts | No wrong answers | Strong validation, invariants, idempotency |
| 2 | Reliability & Resilience | Survive failures | Timeouts, retries, graceful degradation |
| 3 | Security & Privacy | Zero trust by default | Authz, secrets, minimal data exposure |
| 4 | Performance & Efficiency | Predictable latency | Async I/O, bounded queues, caching |
| 5 | Observability & Operability | Fast triage | Tracing, metrics, runbooks |
| 6 | Data & Consistency | Integrity over time | Safe migrations, outbox, versioning |
| 7 | Scalability & Evolution | Safe growth | Statelessness, partitioning, backpressure |
| 8 | Developer Experience & Testing | Sustainable velocity | CI gates, deterministic tests, typing |

Quick Reference (Rules)

1. Correctness & Contracts (CRITICAL)

  • api-contracts
    - Versioned schemas and explicit validation
  • input-validation
    - Validate at boundaries, reject unknowns
  • idempotency
    - Safe retries with idempotency keys
  • invariants
    - Enforce domain rules in service and database
  • time-utc
    - Store UTC, use monotonic clocks for durations
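
The input-validation and idempotency rules above can be sketched together as follows. This is a minimal illustration, not a reference implementation: `PaymentService`, `charge`, and the in-memory dict are hypothetical names, and the dict stands in for a durable store that a real service would need.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    """Hypothetical service; the dict is a stand-in for a durable store."""

    _results: dict = field(default_factory=dict)
    charges_made: int = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # Validate at the boundary before doing any work.
        if not idempotency_key:
            raise ValueError("idempotency_key is required")
        if amount_cents <= 0:
            raise ValueError("amount_cents must be positive")
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": self.charges_made, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result
```

A client that retries `charge("key-1", 500)` after a timeout gets the original result back, so the retry is safe. In a multi-process deployment the lookup and the write would have to be atomic, for example through a unique constraint on the key.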

2. Reliability & Resilience (CRITICAL)

  • timeouts
    - Set per dependency; no unbounded waits
  • retries
    - Bounded with jitter; avoid retry storms
  • circuit-breakers
    - Fail fast for degraded dependencies
  • bulkheads
    - Isolate thread pools and task queues
  • load-shedding
    - Graceful degradation under load
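
One way to read the retries rule is exponential backoff with full jitter and a hard attempt limit. The sketch below assumes the wrapped call enforces its own timeout and signals transient failure with `TimeoutError` or `ConnectionError`; `retry_with_jitter` and its parameters are illustrative names, not an existing library API.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_jitter(
    call: Callable[[], T],
    *,
    max_attempts: int = 3,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Retry transient failures a bounded number of times with full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts:
                raise  # Budget exhausted: surface the failure to the caller.
            # Exponential cap, then a random delay in [0, cap] so that many
            # clients do not retry in lockstep.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng.uniform(0, cap))
    raise AssertionError("unreachable")
```

Injecting `sleep` and `rng` keeps the function testable without real waiting. Only idempotent operations should be wrapped this way.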

3. Security & Privacy (CRITICAL)

  • authz
    - Enforce at every service boundary
  • secrets
    - Use vault/KMS; never in code or logs
  • data-min
    - Redact PII by default
  • crypto
    - TLS everywhere; strong defaults
  • supply-chain
    - Pin deps; scan CVEs
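
As a rough illustration of the data-min rule, the helper below masks sensitive fields before a payload is logged. The key list and the `redact` name are assumptions made for this example; a real deployment would derive the list from its own data classification.

```python
# Assumed set of sensitive field names; adjust to the actual data model.
SENSITIVE_KEYS = frozenset({"email", "password", "ssn", "token"})


def redact(payload: dict) -> dict:
    """Return a copy with sensitive values masked, recursing into nested dicts.

    Assumes string keys. Lists of dicts are not traversed in this sketch.
    """
    clean = {}
    for key, value in payload.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = redact(value)
        else:
            clean[key] = value
    return clean
```

The function returns a new dict and leaves the input untouched, so business logic can keep using the original values while only the redacted copy reaches the logs.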

4. Performance & Efficiency (HIGH)

  • async-io
    - Use async for I/O bound paths; avoid blocking
  • pooling
    - Right-size DB/HTTP pools; avoid starvation
  • cache
    - TTL and stampede protection for hot reads
  • batching
    - Batch I/O and DB operations where safe
  • profiling
    - Measure before optimizing
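
The cache rule (TTL plus stampede protection) could look like the sketch below, which uses a per-key `asyncio.Lock` so that concurrent misses trigger a single load. `TTLCache` is a hypothetical class for illustration; it never evicts entries or locks, which a production cache would have to do.

```python
import asyncio
import time
from typing import Awaitable, Callable


class TTLCache:
    """In-process cache sketch: TTL expiry plus one loader per key at a time."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._values: dict = {}  # key -> (expires_at, value)
        self._locks: dict = {}   # key -> asyncio.Lock

    def _fresh(self, key: str):
        entry = self._values.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    async def get(self, key: str, load: Callable[[], Awaitable[object]]) -> object:
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            # Re-check: another task may have filled the entry while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[1]
            value = await load()
            self._values[key] = (time.monotonic() + self._ttl, value)
            return value
```

Expiry uses `time.monotonic()` so that wall-clock adjustments do not extend or shorten the TTL.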

5. Observability & Operability (HIGH)

  • structured-logs
    - JSON logs with trace ids
  • metrics
    - RED/USE metrics plus business KPIs
  • tracing
    - Propagate context end-to-end
  • alerts
    - SLO-based with runbooks
  • deploys
    - Safe rollouts and rapid rollback
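
For the structured-logs rule, a small formatter built on the standard `logging` module is often enough to start with. The field names in the payload and the `trace_id` attribute are assumptions for this sketch; a tracing library would normally supply the id.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Set via logger.info(..., extra={"trace_id": ...}); None if absent.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)
```

Attach it with `handler.setFormatter(JsonFormatter())` and log with `logger.info("order created", extra={"trace_id": "abc123"})` so every line can be joined to its trace.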

6. Data & Consistency (HIGH)

  • transactions
    - Clear boundaries; avoid cross-service transactions
  • schema-evolution
    - Backward compatible migrations
  • outbox
    - Reliable event publishing
  • id-generation
    - Globally unique IDs
  • read-models
    - Use CQRS when complexity is justified
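
The outbox rule is easiest to see with the business write and the event write in one local transaction. The sketch below uses `sqlite3` only to stay self-contained; the table layout, `create_schema`, and `create_order` are hypothetical, and a separate relay process (not shown) would read unpublished rows and send them to the broker.

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # The CHECK constraint enforces a domain invariant in the database too.
    conn.execute(
        "CREATE TABLE orders ("
        "id TEXT PRIMARY KEY, "
        "total_cents INTEGER NOT NULL CHECK (total_cents > 0))"
    )
    conn.execute(
        "CREATE TABLE outbox ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "event_type TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def create_order(conn: sqlite3.Connection, order_id: str, total_cents: int) -> None:
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the order row and its event are stored together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
            (order_id, total_cents),
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("order.created", json.dumps({"order_id": order_id, "total_cents": total_cents})),
        )
```

Because the event is stored in the same database as the order, no cross-service transaction is needed. Consumers should still tolerate duplicates, since a relay typically delivers at least once.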

7. Scalability & Evolution (MEDIUM)

  • stateless
    - Externalize state, scale horizontally
  • partitioning
    - Shard by stable keys
  • versioning
    - API and event versioning
  • backpressure
    - Bounded queues, explicit limits
  • config
    - Dynamic config with validation
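
The backpressure rule can be illustrated with a bounded `asyncio.Queue`: when the queue is full, the producer is told so immediately instead of buffering without limit. `try_enqueue` is a hypothetical helper, and the choice to reject rather than block is an assumption of this sketch.

```python
import asyncio


def try_enqueue(queue: asyncio.Queue, job: str) -> bool:
    """Add a job if there is room; report False so the caller can shed load."""
    try:
        queue.put_nowait(job)
        return True
    except asyncio.QueueFull:
        return False


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # Explicit limit.
    return [try_enqueue(queue, f"job-{i}") for i in range(3)]
```

An HTTP handler could map a `False` result to a 429 or 503 response, which pushes the limit back to the client rather than hiding it in memory growth.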

8. Developer Experience & Testing (MEDIUM)

  • typing
    - Type hints for public APIs and core logic
  • tests
    - Unit, integration, contract, load tests
  • determinism
    - Hermetic tests, fixed seeds, stable time
  • lint
    - Static analysis and formatting
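
The determinism rule usually comes down to injecting the sources of nondeterminism. In the sketch below, the random generator and the clock are passed in as parameters; `sample_id` and `expires_at` are illustrative names, and `sample_id` is not suitable for security tokens because it uses a non-cryptographic generator.

```python
import random
from datetime import datetime, timedelta, timezone
from typing import Callable


def sample_id(rng: random.Random) -> str:
    """Hex id drawn from an injected generator (not for security tokens)."""
    return f"{rng.getrandbits(64):016x}"


def expires_at(now: Callable[[], datetime], ttl: timedelta) -> datetime:
    """Compute an expiry from an injected clock rather than datetime.now()."""
    return now() + ttl


def utc_now() -> datetime:
    # Production clock: always timezone-aware UTC.
    return datetime.now(timezone.utc)
```

Tests pass `random.Random(42)` and a fixed clock such as `lambda: datetime(2024, 1, 1, tzinfo=timezone.utc)`, while production code passes `random.Random()` and `utc_now`.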

Execution Workflow

  1. Clarify product goals, SLOs, latency and cost budgets
  2. Map data flow, dependencies, and failure modes
  3. Choose storage and consistency model (document tradeoffs)
  4. Define contracts: API schemas, events, and idempotency
  5. Implement with safe defaults, observability, and resilience
  6. Validate with tests, load, and failure scenarios
  7. Review risks and publish runbooks

Language-Specific Guidance

See references/python-core.md for stack defaults, async patterns, and tooling.