agentic-engineering

Agentic Engineering

Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.

Operating Principles

Define completion criteria before execution.
Decompose work into agent-sized units.
Route model tiers by task complexity.
Measure with evals and regression checks.

Eval-First Loop

Define capability eval and regression eval.
Run baseline and capture failure signatures.
Execute implementation.
Re-run evals and compare deltas.
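
The four steps above can be sketched as a small harness. This is a minimal illustration, not a prescribed implementation: `run_evals`, `implement`, and the pass/fail result shape are hypothetical, and "failure signatures" are simplified here to the names of failing cases.

```python
from typing import Callable, Dict, List

# Hypothetical result shape: eval case name -> True (pass) / False (fail).
EvalResults = Dict[str, bool]


def compare_deltas(baseline: EvalResults, after: EvalResults) -> Dict[str, List[str]]:
    """Bucket each eval case by how its status changed between two runs."""
    fixed = sorted(c for c in after if after[c] and not baseline.get(c, False))
    regressed = sorted(c for c in baseline if baseline[c] and not after.get(c, False))
    still_failing = sorted(
        c for c in after if not after[c] and not baseline.get(c, False)
    )
    return {"fixed": fixed, "regressed": regressed, "still_failing": still_failing}


def eval_first_loop(
    run_evals: Callable[[], EvalResults],
    implement: Callable[[], None],
) -> Dict[str, List[str]]:
    """Run baseline, record failures, implement, re-run, and compare."""
    baseline = run_evals()
    # Simplification: a "failure signature" is just the failing case name.
    baseline_failures = sorted(c for c, ok in baseline.items() if not ok)
    implement()
    after = run_evals()
    report = compare_deltas(baseline, after)
    report["baseline_failures"] = baseline_failures
    return report
```

A non-empty `regressed` list is the signal to stop and investigate before moving on to the next unit.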

Task Decomposition

Apply the 15-minute unit rule:

each unit should be independently verifiable
each unit should have a single dominant risk
each unit should expose a clear done condition
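
One way to make these three conditions checkable is to record each unit as a small structure and validate it before handing it to an agent. The sketch below is illustrative only: `WorkUnit`, its field names, and the reading of "15-minute" as a per-unit time budget are assumptions, not part of the skill itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkUnit:
    """Hypothetical description of one agent-sized unit of work."""
    name: str
    verify_command: str      # how the unit is checked on its own
    risks: List[str]         # should hold exactly one dominant risk
    done_condition: str      # observable condition that means "done"
    estimated_minutes: int = 15


def unit_problems(unit: WorkUnit, max_minutes: int = 15) -> List[str]:
    """Return the reasons a unit violates the rule (empty list means OK)."""
    problems = []
    if not unit.verify_command.strip():
        problems.append("no independent verification")
    if len(unit.risks) != 1:
        problems.append("expected exactly one dominant risk")
    if not unit.done_condition.strip():
        problems.append("no clear done condition")
    if unit.estimated_minutes > max_minutes:
        problems.append("larger than the time budget; split it")
    return problems
```

A unit that reports several problems at once is usually a sign that it should be split rather than patched.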

Model Routing

Haiku: classification, boilerplate transforms, narrow edits
Sonnet: implementation and refactors
Opus: architecture, root-cause analysis, multi-file invariants
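
The routing above amounts to a lookup from task kind to tier. A minimal sketch follows; the task-kind strings and the `route` helper are hypothetical names chosen for illustration, and unknown kinds raise an error rather than guessing a tier.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Task kinds mirror the list above; the key spellings are illustrative.
ROUTES = {
    "classification": Tier.HAIKU,
    "boilerplate_transform": Tier.HAIKU,
    "narrow_edit": Tier.HAIKU,
    "implementation": Tier.SONNET,
    "refactor": Tier.SONNET,
    "architecture": Tier.OPUS,
    "root_cause_analysis": Tier.OPUS,
    "multi_file_invariant": Tier.OPUS,
}


def route(task_kind: str) -> Tier:
    """Return the tier for a task kind, failing loudly on unknown kinds."""
    try:
        return ROUTES[task_kind]
    except KeyError:
        raise ValueError(f"no route defined for task kind: {task_kind!r}")
```

Keeping the table explicit makes routing decisions reviewable in the same way as any other configuration change.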

Session Strategy

Continue the session for closely coupled units.
Start a fresh session after major phase transitions.
Compact after milestone completion, not during active debugging.

Review Focus for AI-Generated Code

Prioritize:

invariants and edge cases
error boundaries
security and auth assumptions
hidden coupling and rollout risk

Do not spend review cycles on style-only disagreements when automated formatting and linting already enforce style.

Cost Discipline

Track per task:

model
token estimate
retries
wall-clock time
success/failure

Escalate the model tier only when the lower tier fails with a clear reasoning gap.
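
The tracked fields and the escalation rule can be captured in a small record type. This is a sketch under stated assumptions: `TaskRecord`, the `failure_reason` label, and the `"reasoning_gap"` value are hypothetical, and the tier order is taken from the Model Routing section.

```python
from dataclasses import dataclass
from typing import Optional

# Tier order, lowest to highest, as implied by the Model Routing section.
TIERS = ["haiku", "sonnet", "opus"]


@dataclass
class TaskRecord:
    """Hypothetical per-task cost record with the fields listed above."""
    task_id: str
    model: str
    token_estimate: int
    retries: int
    wall_clock_seconds: float
    success: bool
    failure_reason: Optional[str] = None  # illustrative label set by a reviewer


def next_tier(record: TaskRecord) -> str:
    """Pick the model for the next attempt.

    Escalates one tier only when the task failed with a reasoning gap;
    any other outcome (success, flaky tooling, bad input) keeps the tier.
    """
    if record.success or record.failure_reason != "reasoning_gap":
        return record.model
    idx = TIERS.index(record.model)
    return TIERS[min(idx + 1, len(TIERS) - 1)]
```

Failures caused by tooling or missing context stay on the same tier, so a retry with better inputs is tried before paying for a larger model.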
