backend-principle-eng-java-pro-max

Backend Principle Eng Java Pro Max

Senior principal-level guidance for Java backend systems in product companies. Emphasizes durable architecture, production readiness, and measurable outcomes.

When to Apply

Designing or refactoring Java services, APIs, data pipelines, or distributed systems
Reviewing PRs for correctness, reliability, performance, and security
Planning migrations, scalability, or cost optimizations
Incident follow-ups and systemic fixes

Priority Model (highest to lowest)

Priority	Category	Goal	Signals
1	Correctness & Contracts	No wrong answers	Stable invariants, strong validation, idempotency
2	Reliability & Resilience	Survive failures	Timeouts, retries, circuit breakers, graceful degrade
3	Security & Privacy	Zero trust by default	Authz everywhere, secrets managed, minimal data exposure
4	Performance & Efficiency	Predictable latency	P95/P99 targets, bounded queues, efficient I/O
5	Observability & Operability	Fast detection and recovery	Tracing, actionable alerts, runbooks
6	Data & Consistency	Integrity over time	Safe migrations, transactional boundaries, outbox
7	Scalability & Evolution	Safe growth	Statelessness, partitioning, versioning
8	Developer Experience & Testing	Sustainable velocity	CI gates, deterministic tests, clear docs

Quick Reference (Rules)

1. Correctness & Contracts (CRITICAL)

`api-contracts` - Versioned APIs, explicit schemas, backward compatibility
`input-validation` - Validate at boundaries, normalize, reject unknowns
`idempotency` - Safe retries for mutating calls with idempotency keys
`invariants` - Enforce domain rules in service and database
`time-utc` - Store UTC, handle clock skew, use monotonic time for durations
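As an illustration of the `idempotency` rule, here is a minimal sketch of replaying a stored result when a mutating call is retried with the same key. The names `IdempotentExecutor`, `execute`, and `charge` are hypothetical, and the in-memory map is an assumption made to keep the example self-contained; a real service would persist keys durably, ideally in the same transaction as the mutation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory idempotency store (assumption: results are small strings).
final class IdempotentExecutor {
    private final Map<String, String> resultsByKey = new ConcurrentHashMap<>();

    /** Runs the mutation at most once per key and replays the stored result on retries. */
    String execute(String idempotencyKey, Supplier<String> mutation) {
        if (idempotencyKey == null || idempotencyKey.isBlank()) {
            throw new IllegalArgumentException("idempotency key is required for mutating calls");
        }
        // computeIfAbsent applies the mapping function at most once per key, atomically.
        return resultsByKey.computeIfAbsent(idempotencyKey, k -> mutation.get());
    }
}

final class IdempotencyDemo {
    static int chargesApplied = 0;

    // Stand-in for a side-effecting operation such as charging a card.
    static String charge() {
        chargesApplied++;
        return "charge-" + chargesApplied;
    }

    public static void main(String[] args) {
        IdempotentExecutor executor = new IdempotentExecutor();
        String first = executor.execute("key-123", IdempotencyDemo::charge);
        String retry = executor.execute("key-123", IdempotencyDemo::charge);
        // The retry sees the original result and the side effect ran once.
        System.out.println(first.equals(retry) + " " + chargesApplied);
    }
}
```

The mutation runs while the map holds the key's bin lock, so this sketch suits short operations only; longer work would need a separate "in progress" marker.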

2. Reliability & Resilience (CRITICAL)

`timeouts` - Set per dependency; no unbounded waits
`retries` - Bounded with jitter; never retry non-idempotent without keys
`circuit-breakers` - Fail fast when downstream degrades
`bulkheads` - Isolate thread pools and queues per dependency
`load-shedding` - Backpressure and graceful degradation under load
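The `retries` rule can be sketched as a bounded loop with capped exponential backoff and full jitter. `BoundedRetry`, `call`, and `backoffCapMillis` are hypothetical names, and the sketch assumes the wrapped operation is idempotent or carries an idempotency key. It is one possible shape, not a replacement for a resilience library.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

final class BoundedRetry {
    /** Upper bound of the sleep before the next try, after the given 1-based failed attempt. */
    static long backoffCapMillis(int attempt, long baseDelayMillis, long maxDelayMillis) {
        // Cap the shift so the exponential term cannot overflow for modest base delays.
        long exponential = baseDelayMillis << Math.min(attempt - 1, 20);
        return Math.min(maxDelayMillis, exponential);
    }

    /** Calls op up to maxAttempts times, sleeping a random time in [0, cap] between tries. */
    static <T> T call(Callable<T> op, int maxAttempts, long baseDelayMillis, long maxDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // attempts exhausted: surface the last failure
                }
                long cap = backoffCapMillis(attempt, baseDelayMillis, maxDelayMillis);
                // Full jitter: uniform in [0, cap] so retrying clients do not synchronize.
                Thread.sleep(ThreadLocalRandom.current().nextLong(cap + 1));
            }
        }
        throw last;
    }
}
```

A production version would usually retry only on errors known to be transient and would also respect an overall deadline, in line with the `timeouts` rule.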

3. Security & Privacy (CRITICAL)

`authz` - Enforce at every service boundary, deny by default
`secrets` - Managed via vault/KMS; never in code or logs
`data-min` - Log minimal PII, redact by default
`crypto` - TLS everywhere, rotate keys, strong defaults
`supply-chain` - Pin deps, scan CVEs, reproducible builds
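For `data-min`, one way to redact by default is to scrub messages before they reach the logger. The sketch below is illustrative only: `LogRedactor` is a hypothetical helper, and the two patterns (email addresses and 12 to 19 digit runs) are assumptions that are far from a complete PII catalogue.

```java
import java.util.regex.Pattern;

final class LogRedactor {
    // Assumed patterns for illustration; real systems need a reviewed, broader set.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    private static final Pattern LONG_DIGITS = Pattern.compile("\\b\\d{12,19}\\b");

    /** Replaces email addresses and long digit runs with fixed placeholders. */
    static String redact(String message) {
        String withoutEmails = EMAIL.matcher(message).replaceAll("[REDACTED_EMAIL]");
        return LONG_DIGITS.matcher(withoutEmails).replaceAll("[REDACTED_NUMBER]");
    }

    public static void main(String[] args) {
        System.out.println(redact("user alice@example.com paid with 4111111111111111"));
        // prints: user [REDACTED_EMAIL] paid with [REDACTED_NUMBER]
    }
}
```

Pattern-based scrubbing is a safety net; logging structured fields from an allow-list is generally the stronger default.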

4. Performance & Efficiency (HIGH)

`pooling` - Right-size DB/HTTP pools; avoid blocking shared pools
`serialization` - Avoid reflection in hot paths; prefer explicit schemas
`allocation` - Minimize hot-path allocations and boxing
`cache` - TTL and stampede protection for hot reads
`batching` - Batch I/O and DB operations where safe
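The `cache` rule combines two ideas: entries expire after a TTL, and concurrent misses for the same key share one load instead of stampeding the backend. A minimal single-flight sketch follows, assuming an injectable nanosecond clock. `SingleFlightCache` and its members are hypothetical names; mature caching libraries offer this behavior with eviction and metrics on top.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

final class SingleFlightCache<K, V> {
    private static final class Entry<V> {
        final CompletableFuture<V> value;
        final long expiresAtNanos;

        Entry(CompletableFuture<V> value, long expiresAtNanos) {
            this.value = value;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlNanos;
    private final LongSupplier nanoClock; // e.g. System::nanoTime; injectable for tests

    SingleFlightCache(long ttlNanos, LongSupplier nanoClock) {
        this.ttlNanos = ttlNanos;
        this.nanoClock = nanoClock;
    }

    /** Returns the cached value, loading it at most once per key per TTL window. */
    V get(K key, Function<K, V> loader) {
        long now = nanoClock.getAsLong();
        CompletableFuture<V> created = new CompletableFuture<>();
        // Atomically keep a live entry or install a placeholder future for this load.
        Entry<V> entry = entries.compute(key, (k, existing) ->
                (existing != null && existing.expiresAtNanos - now > 0)
                        ? existing
                        : new Entry<>(created, now + ttlNanos));
        if (entry.value == created) {
            // This caller won the race; load outside the map lock.
            try {
                created.complete(loader.apply(key));
            } catch (RuntimeException e) {
                entries.remove(key, entry); // do not cache failures
                created.completeExceptionally(e);
            }
        }
        // Other callers wait here; failures surface as CompletionException.
        return entry.value.join();
    }
}
```

Callers that lose the race block on the winner's future, so the backend sees one load per key per TTL window rather than one per request.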

5. Observability & Operability (HIGH)

`structured-logs` - JSON logs with trace/span ids and request ids
`metrics` - RED/USE metrics plus business KPIs
`tracing` - Propagate context end-to-end
`alerts` - SLO-based, actionable, with runbooks
`deploys` - Safe rollouts, health checks, rapid rollback
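To make `structured-logs` concrete, the sketch below builds one JSON log line carrying trace, span, and request ids. The field names (`trace_id`, `span_id`, `request_id`) and the `JsonLogLine` helper are assumptions for illustration. In practice a logging framework's JSON encoder plus its context map would normally do this work.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class JsonLogLine {
    /** Escapes the characters that must not appear raw inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Formats a single-line JSON object; all arguments are assumed non-null. */
    static String format(String level, String message,
                         String traceId, String spanId, String requestId) {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps field order stable
        fields.put("level", level);
        fields.put("message", message);
        fields.put("trace_id", traceId);
        fields.put("span_id", spanId);
        fields.put("request_id", requestId);
        return fields.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":\"" + escape(e.getValue()) + "\"")
                .collect(Collectors.joining(",", "{", "}"));
    }
}
```

Emitting the ids on every line is what lets a log search pivot from one failing request to its full trace.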

6. Data & Consistency (HIGH)

`transactions` - Clear boundaries, short duration, avoid cross-service tx
`schema-evolution` - Backward compatible migrations
`outbox` - Reliable event publishing with transactional outbox
`id-generation` - Globally unique IDs; avoid auto-increment for scale
`read-models` - Use CQRS only when complexity is justified
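The `outbox` rule is easiest to see in miniature: the business write and the event record commit together, and a separate relay publishes pending events, removing each one only after the publish succeeds. The sketch below simulates this in memory, with a `synchronized` method standing in for a database transaction. `OutboxSketch` and its methods are hypothetical, and a real implementation would use two tables in one database transaction.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

final class OutboxSketch {
    private final Map<String, Long> orders = new HashMap<>(); // stands in for a business table
    private final Deque<String> outbox = new ArrayDeque<>();  // stands in for an outbox table

    /** "Transaction": the order row and its event are recorded together or not at all. */
    synchronized void placeOrder(String orderId, long amountCents) {
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (orders.putIfAbsent(orderId, amountCents) != null) {
            throw new IllegalStateException("duplicate order " + orderId);
        }
        outbox.addLast("OrderPlaced:" + orderId);
    }

    /** Relay: publishes pending events in order; an event is removed only after publishing. */
    synchronized int relay(Consumer<String> publisher) {
        int published = 0;
        while (!outbox.isEmpty()) {
            publisher.accept(outbox.peekFirst()); // if this throws, the event stays queued
            outbox.removeFirst();
            published++;
        }
        return published;
    }

    synchronized int pending() {
        return outbox.size();
    }
}
```

Because an event can be published and then fail to be removed, delivery is at-least-once, which is one reason consumers should be idempotent.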

7. Scalability & Evolution (MEDIUM)

`stateless` - Externalize state, scale horizontally
`partitioning` - Shard by stable keys, avoid hotspots
`versioning` - API and event versioning with deprecation plans
`backpressure` - Bounded queues, explicit limits
`config` - Dynamic config with safe defaults and validation
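A small sketch of the `backpressure` rule: work is accepted only while a bounded queue has room, and the caller gets an explicit rejection otherwise, which it could map to a response such as HTTP 429 or 503. `BoundedIntake` is a hypothetical wrapper around the JDK's `ArrayBlockingQueue`, and the capacity in the usage example is an arbitrary assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BoundedIntake<T> {
    private final BlockingQueue<T> queue;

    BoundedIntake(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity); // hard upper bound on queued work
    }

    /** Returns false instead of blocking or growing when the queue is full. */
    boolean trySubmit(T item) {
        return queue.offer(item);
    }

    /** Returns the next item, or null if nothing is queued. */
    T poll() {
        return queue.poll();
    }

    int depth() {
        return queue.size();
    }

    public static void main(String[] args) {
        BoundedIntake<String> intake = new BoundedIntake<>(2);
        System.out.println(intake.trySubmit("a")); // true
        System.out.println(intake.trySubmit("b")); // true
        System.out.println(intake.trySubmit("c")); // false: shed load rather than queue unbounded
    }
}
```

Exporting `depth()` as a gauge gives an early signal that consumers are falling behind, before rejections start.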

8. Developer Experience & Testing (MEDIUM)

`tests` - Unit, integration, contract, and load tests
`determinism` - Hermetic tests, fixed seeds, stable time
`lint` - Static analysis, formatting, build reproducibility
`docs` - ADRs for major decisions, runbook ownership
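For the "stable time" part of `determinism`, a common approach is to inject `java.time.Clock` rather than calling `Instant.now()` directly, so tests can pin the current instant. `TokenExpiry` below is a hypothetical class used only to show the pattern.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

final class TokenExpiry {
    private final Clock clock;

    TokenExpiry(Clock clock) {
        this.clock = clock; // production code would pass Clock.systemUTC()
    }

    /** A token is expired once the clock reaches issuedAt + ttl (boundary inclusive). */
    boolean isExpired(Instant issuedAt, Duration ttl) {
        return !clock.instant().isBefore(issuedAt.plus(ttl));
    }

    public static void main(String[] args) {
        // A fixed clock makes the outcome independent of when the test runs.
        Clock fixed = Clock.fixed(Instant.parse("2024-01-01T00:10:00Z"), ZoneOffset.UTC);
        TokenExpiry expiry = new TokenExpiry(fixed);
        Instant issuedAt = Instant.parse("2024-01-01T00:00:00Z");
        System.out.println(expiry.isExpired(issuedAt, Duration.ofMinutes(5)));  // true
        System.out.println(expiry.isExpired(issuedAt, Duration.ofMinutes(15))); // false
    }
}
```

The same injection point serves the `time-utc` rule, since the clock handed to the class can be fixed to UTC everywhere.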

Execution Workflow

1. Clarify product goals, SLOs, latency and cost budgets
2. Map data flow, dependencies, and failure modes
3. Choose storage and consistency model (document tradeoffs)
4. Define contracts: API schemas, events, and idempotency
5. Implement with safe defaults, observability, and resilience
6. Validate with tests, load, and failure scenarios
7. Review risks and publish runbooks

Language-Specific Guidance

See `references/java-core.md` for stack defaults, JVM tuning, libraries, and Java-specific patterns.
