backend-principle-eng-cpp-pro-max

Backend Principle Eng C++ Pro Max

Principal-level guidance for C++ backend systems, low-latency services, and infrastructure. Emphasizes correctness, memory safety, and predictable performance.

When to Apply

  • Designing or refactoring C++ backend services and infrastructure
  • Reviewing code for memory safety, concurrency, and latency regressions
  • Building high-throughput networking, storage, or compute systems
  • Responding to incidents and performance regressions

Priority Model (highest to lowest)

Priority | Category | Goal | Signals
1 | Correctness & UB Avoidance | No undefined behavior | RAII, invariants, validated inputs
2 | Reliability & Resilience | Fail safe under load | Timeouts, backpressure, graceful shutdown
3 | Security | Hard to exploit | Hardened builds, safe parsing, least privilege
4 | Performance & Latency | Predictable P99 | Stable allocs, bounded queues, zero-copy where safe
5 | Observability & Operability | Fast triage | Trace ids, structured logs, metrics
6 | Scalability & Evolution | Safe growth | Statelessness, sharding, protocol versioning
7 | Tooling & Testing | Sustainable velocity | Sanitizers, fuzzing, CI gates

Quick Reference (Rules)

1. Correctness & UB Avoidance (CRITICAL)

  • raii
    - Own resources with RAII and deterministic lifetimes
  • no-raw-ownership
    - Raw pointers only for non-owning references
  • bounds
    - Validate all indices and sizes at boundaries
  • invariants
    - Assert core invariants and state transitions
  • time
    - Use monotonic clocks for durations
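
A minimal C++17 sketch of three of the rules above: RAII ownership, bounds validation, and monotonic time. The names `UniqueFile`, `open_file`, `checked_at`, and `Stopwatch` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <optional>
#include <vector>

// raii: the deleter lets std::unique_ptr close the FILE* exactly once,
// on every exit path, including early returns and exceptions.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept {
        if (f != nullptr) std::fclose(f);
    }
};
using UniqueFile = std::unique_ptr<std::FILE, FileCloser>;

// Returns an owning handle; it is null if the file could not be opened.
inline UniqueFile open_file(const char* path, const char* mode) {
    assert(path != nullptr && mode != nullptr);  // invariant: valid arguments
    return UniqueFile(std::fopen(path, mode));
}

// bounds: validate the index at the boundary instead of indexing blindly.
inline std::optional<int> checked_at(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size()) return std::nullopt;
    return v[i];
}

// time: steady_clock is monotonic, so elapsed() is never negative, even if
// the wall clock is adjusted while the measurement is running.
class Stopwatch {
public:
    Stopwatch() : start_(std::chrono::steady_clock::now()) {}

    std::chrono::nanoseconds elapsed() const {
        return std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start_);
    }

private:
    std::chrono::steady_clock::time_point start_;
};
```

A caller would check the `UniqueFile` for null before use and pass `file.get()` to code that only borrows the handle; ownership never leaves the `unique_ptr` unless it is moved explicitly.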

2. Reliability & Resilience (CRITICAL)

  • timeouts
    - Explicit timeouts for every external call
  • backpressure
    - Bounded queues; apply load shedding
  • shutdown
    - Drain in-flight work with deadlines
  • bulkheads
    - Isolate thread pools by dependency
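
One way the backpressure, timeout, and shutdown rules above could fit together is a bounded queue that sheds load when full, waits with an explicit timeout, and drains after close. This is a sketch under assumptions: `BoundedQueue` and its method names are hypothetical, and a production queue would likely also export shed counts as metrics.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);  // invariant: zero capacity would shed everything
    }

    // backpressure: reject immediately when full (or closed) instead of
    // blocking the producer or growing without bound.
    bool try_push(T item) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            if (closed_ || items_.size() >= capacity_) return false;
            items_.push_back(std::move(item));
        }
        not_empty_.notify_one();
        return true;
    }

    // timeouts: wait at most `timeout`; std::nullopt means either nothing
    // arrived in time or the queue is closed and fully drained.
    std::optional<T> pop_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mu_);
        const bool ready = not_empty_.wait_for(
            lock, timeout, [this] { return !items_.empty() || closed_; });
        if (!ready || items_.empty()) return std::nullopt;
        T item = std::move(items_.front());
        items_.pop_front();
        return item;
    }

    // shutdown: stop accepting new work; consumers keep popping until empty.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

private:
    const std::size_t capacity_;
    std::mutex mu_;
    std::condition_variable not_empty_;
    std::deque<T> items_;
    bool closed_ = false;
};
```

For bulkheads, one such queue (with its own worker pool) per downstream dependency keeps a slow dependency from consuming capacity that other request paths need.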

3. Security (CRITICAL)

  • safe-parse
    - Validate untrusted input; avoid unsafe string ops
  • harden
    - Compile with stack protection, PIE, RELRO, FORTIFY
  • secrets
    - No secrets in logs or core dumps
  • least-priv
    - Drop privileges and sandbox when possible
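
As an illustration of the safe-parse rule, the sketch below parses a TCP port from untrusted text with `std::from_chars` (C++17), which does not skip whitespace, does not depend on the locale, and reports where parsing stopped. The function name `parse_port` and the 5-character limit are assumptions made for this example.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Returns the port, or std::nullopt for anything that is not a plain decimal
// number in [1, 65535]. Empty input, signs, whitespace, trailing bytes, and
// out-of-range values are all rejected.
inline std::optional<std::uint16_t> parse_port(std::string_view text) {
    if (text.empty() || text.size() > 5) return std::nullopt;  // explicit size limit

    const char* first = text.data();
    const char* last = first + text.size();
    unsigned int value = 0;
    const auto [ptr, ec] = std::from_chars(first, last, value);
    assert(first <= ptr && ptr <= last);  // from_chars stays inside the range

    if (ec != std::errc{}) return std::nullopt;  // not a number, or overflow
    if (ptr != last) return std::nullopt;        // trailing garbage such as "80x"
    if (value == 0 || value > 65535) return std::nullopt;
    return static_cast<std::uint16_t>(value);
}
```

Returning `std::optional` forces the caller to handle the rejection path, which avoids the sentinel-value ambiguity of `atoi`-style parsing.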

4. Performance & Latency (HIGH)

  • allocs
    - Minimize allocations in hot paths
  • copy
    - Prefer move or views; avoid unnecessary copies
  • cache
    - Improve locality; avoid false sharing
  • io
    - Use async I/O where appropriate
  • profiling
    - Measure before optimizing
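
The copy and cache rules above can be sketched as follows. The 64-byte cache-line size is an assumption (common on x86-64, not universal), and `PaddedCounter` and `header_name` are hypothetical names; whether either change helps in a given service should be confirmed by profiling first.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Assumed cache-line size for this sketch.
inline constexpr std::size_t kCacheLine = 64;

// cache: each counter occupies its own cache line, so threads that update
// different counters do not invalidate each other's lines (false sharing).
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};
static_assert(sizeof(PaddedCounter) == kCacheLine);
static_assert(alignof(PaddedCounter) == kCacheLine);

// copy: returns a view of the text before the first ':' without allocating
// or copying. The view is valid only while the underlying buffer is alive.
inline std::string_view header_name(std::string_view line) {
    const std::size_t colon = line.find(':');
    if (colon == std::string_view::npos) return std::string_view{};
    return line.substr(0, colon);
}
```

An array of `PaddedCounter`, one element per worker thread, trades memory for the absence of cross-core contention on the counters.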

5. Observability & Operability (HIGH)

  • logs
    - Structured logs with request and trace ids
  • metrics
    - RED/USE plus business KPIs
  • tracing
    - Propagate trace context across threads
  • crash
    - Symbolized crash reports and core dump policies
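
A possible shape for propagating trace context across threads: capture the submitting thread's context when a task is created, and re-install it wherever the task later runs. All names here (`TraceContext`, `current_trace`, `ScopedTrace`, `with_current_trace`, `log_line`) are hypothetical, and the log format is a simplified key=value line with no escaping.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical per-request context; real systems usually carry more fields.
struct TraceContext {
    std::string trace_id;
};

// One context per thread, default-initialized to an empty trace id.
inline TraceContext& current_trace() {
    thread_local TraceContext ctx;
    return ctx;
}

// RAII guard: installs a context on this thread and restores the previous
// one when the scope ends, even if the task throws.
class ScopedTrace {
public:
    explicit ScopedTrace(TraceContext ctx)
        : previous_(std::exchange(current_trace(), std::move(ctx))) {}
    ~ScopedTrace() { current_trace() = std::move(previous_); }
    ScopedTrace(const ScopedTrace&) = delete;
    ScopedTrace& operator=(const ScopedTrace&) = delete;

private:
    TraceContext previous_;
};

// Captures the caller's context now; the returned callable re-installs it
// on whichever thread eventually invokes it.
template <typename F>
auto with_current_trace(F task) {
    return [ctx = current_trace(), task = std::move(task)]() mutable {
        ScopedTrace scope(ctx);
        task();
    };
}

// Simplified structured log line (values are not escaped in this sketch).
inline std::string log_line(const std::string& event) {
    return "trace_id=" + current_trace().trace_id + " event=" + event;
}
```

Wrapping every task with `with_current_trace` at the point where it is handed to a thread pool keeps log lines from worker threads attributable to the originating request.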

6. Scalability & Evolution (MEDIUM)

  • stateless
    - Externalize state, enable horizontal scale
  • partitioning
    - Shard by stable keys
  • versioning
    - Protocol and schema versioning
  • limits
    - Explicit limits on payloads and queue sizes
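
To make the partitioning and limits rules concrete, here is a sketch that maps a key to a shard with FNV-1a. It uses FNV-1a rather than `std::hash` because `std::hash` output is not specified and can differ between standard libraries, which would break routing across heterogeneous builds. The names `shard_for`, `kMaxPayloadBytes`, and the 1 MiB limit are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// FNV-1a, 64-bit: a fixed, fully specified hash, so every process computes
// the same value for the same key.
constexpr std::uint64_t fnv1a64(std::string_view key) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : key) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// partitioning: route by a stable key (for example a user id), never by a
// value that changes over the lifetime of the entity.
inline std::size_t shard_for(std::string_view key, std::size_t shard_count) {
    assert(shard_count > 0);  // invariant: at least one shard exists
    return static_cast<std::size_t>(fnv1a64(key) % shard_count);
}

// limits: hypothetical payload cap, checked before any allocation or parsing.
inline constexpr std::size_t kMaxPayloadBytes = std::size_t{1} << 20;  // 1 MiB

inline bool payload_within_limit(std::size_t size_bytes) {
    return size_bytes <= kMaxPayloadBytes;
}
```

Plain modulo routing remaps most keys when `shard_count` changes; if shards are added often, consistent hashing or a fixed number of virtual shards is the usual alternative.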

7. Tooling & Testing (MEDIUM)

  • sanitizers
    - ASan, UBSan, TSan in CI
  • fuzzing
    - Fuzz parsers and protocol handlers
  • tests
    - Unit, integration, and load tests
  • lint
    - clang-tidy, clang-format, warnings as errors
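
The fuzzing rule could look like the following libFuzzer harness. `LLVMFuzzerTestOneInput` is libFuzzer's entry point; the `parse_frame` function and its wire format (a 1-byte length followed by exactly that many payload bytes) are hypothetical stand-ins for a real protocol handler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical parser under test: byte 0 is the payload length N, and the
// frame must contain exactly N more bytes.
inline std::optional<std::string_view> parse_frame(const std::uint8_t* data,
                                                   std::size_t size) {
    if (data == nullptr || size < 1) return std::nullopt;
    const std::size_t len = data[0];
    if (size - 1 != len) return std::nullopt;  // truncated or oversized frame
    return std::string_view(reinterpret_cast<const char*>(data + 1), len);
}

// libFuzzer calls this repeatedly with mutated inputs. The harness checks an
// invariant of accepted frames; sanitizers catch memory errors and UB.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    const auto frame = parse_frame(data, size);
    if (frame.has_value()) {
        assert(frame->size() + 1 == size);  // accepted frames are exact-length
    }
    return 0;
}
```

With Clang, a harness like this is typically built with `-fsanitize=fuzzer,address,undefined`, which combines the fuzzing engine with ASan and UBSan. TSan cannot be combined with ASan in the same binary, so it needs a separate CI build.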

Execution Workflow

  1. Clarify latency/SLOs, throughput, and cost budgets
  2. Map data flow, thread model, and failure modes
  3. Define interfaces and memory ownership contracts
  4. Implement with bounded queues and explicit timeouts
  5. Add observability and crash diagnostics
  6. Validate with sanitizers, fuzzing, load tests
  7. Review risks and publish runbooks
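
Step 3 of the workflow above (interfaces and memory ownership contracts) can often be expressed directly in function signatures. The sketch below shows one convention, with hypothetical `Request` and `RequestLog` types: `std::unique_ptr` by value transfers ownership, a const reference borrows for the duration of the call, and a raw pointer is a nullable, non-owning observer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Request {
    std::string body;
};

class RequestLog {
public:
    // Transfer in: taking unique_ptr by value means the caller gives up
    // ownership, and the compiler requires an explicit std::move.
    void submit(std::unique_ptr<Request> req) {
        assert(req != nullptr);  // contract: null requests are a caller bug
        pending_.push_back(std::move(req));
    }

    // Borrow: a const reference is valid for this call only and is not kept.
    static std::size_t body_size(const Request& req) { return req.body.size(); }

    // Observe: non-owning raw pointer, null when empty. It is invalidated by
    // take_oldest() or by destruction of the log.
    const Request* oldest() const {
        return pending_.empty() ? nullptr : pending_.front().get();
    }

    // Transfer out: ownership returns to the caller; null when empty.
    std::unique_ptr<Request> take_oldest() {
        if (pending_.empty()) return nullptr;
        std::unique_ptr<Request> req = std::move(pending_.front());
        pending_.erase(pending_.begin());
        return req;
    }

private:
    std::vector<std::unique_ptr<Request>> pending_;
};
```

Because the contract lives in the types, a reviewer can check ownership at each call site without reading the implementation.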

Language-Specific Guidance

See references/cpp-core.md for toolchain defaults, concurrency patterns, and hardening.