agentic-engineering

Agentic Engineering

Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.

Operating Principles

  1. Define completion criteria before execution.
  2. Decompose work into agent-sized units.
  3. Route tasks to model tiers by complexity.
  4. Measure with evals and regression checks.

Eval-First Loop

  1. Define capability eval and regression eval.
  2. Run baseline and capture failure signatures.
  3. Execute implementation.
  4. Re-run evals and compare deltas.
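
The four steps above can be sketched as a small harness. This is a minimal illustration, not a prescribed implementation: `run_evals`, `compare_deltas`, the eval names, and the `state` dictionary standing in for the implementation step are all hypothetical, and each eval is assumed to be a zero-argument function that returns True on pass.

```python
from typing import Callable, Dict, List

# Assumption: an eval is any zero-argument callable returning True on pass.
Eval = Callable[[], bool]


def run_evals(evals: Dict[str, Eval]) -> Dict[str, bool]:
    """Run every eval and record pass/fail by name."""
    results: Dict[str, bool] = {}
    for name, check in evals.items():
        try:
            results[name] = bool(check())
        except Exception:
            # An eval that raises is recorded as a failure.
            results[name] = False
    return results


def compare_deltas(
    baseline: Dict[str, bool], current: Dict[str, bool]
) -> Dict[str, List[str]]:
    """Classify each baseline eval as fixed, regressed, or unchanged."""
    deltas: Dict[str, List[str]] = {"fixed": [], "regressed": [], "unchanged": []}
    for name in sorted(baseline):
        before = baseline[name]
        after = current.get(name, False)  # a missing eval counts as failing
        if not before and after:
            deltas["fixed"].append(name)
        elif before and not after:
            deltas["regressed"].append(name)
        else:
            deltas["unchanged"].append(name)
    return deltas


# Toy stand-in for the system under change (hypothetical).
state = {"feature_done": False}

# Step 1: define a capability eval and a regression eval.
evals = {
    "capability_new_feature": lambda: state["feature_done"],
    "regression_existing_behavior": lambda: True,
}

# Step 2: run the baseline and capture which evals fail.
baseline = run_evals(evals)

# Step 3: execute the implementation (simulated here by flipping a flag).
state["feature_done"] = True

# Step 4: re-run the evals and compare deltas against the baseline.
after = run_evals(evals)
deltas = compare_deltas(baseline, after)
```

A change is a candidate for acceptance when `deltas["regressed"]` is empty and the targeted capability eval appears under `deltas["fixed"]`.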

Task Decomposition

Apply the 15-minute unit rule:
  • each unit should be independently verifiable
  • each unit should have a single dominant risk
  • each unit should expose a clear done condition
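
One way to make the three conditions checkable is to record each unit as a small structure and lint it before handing it to an agent. The sketch below is illustrative only: `WorkUnit`, `unit_problems`, and the example values are hypothetical, and reading "15-minute" as a time budget per unit is an assumption rather than something the rule spells out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkUnit:
    name: str
    verify_command: str  # how the unit is checked on its own
    risks: List[str] = field(default_factory=list)
    done_condition: str = ""
    estimated_minutes: int = 15  # assumed interpretation of the rule


def unit_problems(unit: WorkUnit, max_minutes: int = 15) -> List[str]:
    """Return the reasons a unit violates the decomposition rule (empty if none)."""
    problems: List[str] = []
    if not unit.verify_command.strip():
        problems.append("not independently verifiable")
    if len(unit.risks) != 1:
        problems.append("expected exactly one dominant risk")
    if not unit.done_condition.strip():
        problems.append("no clear done condition")
    if unit.estimated_minutes > max_minutes:
        problems.append("exceeds the time budget")
    return problems


# Hypothetical examples: one well-formed unit and one that should be split.
good = WorkUnit(
    name="add retry to fetch_user",
    verify_command="pytest tests/test_fetch.py",
    risks=["retry storm on repeated 5xx responses"],
    done_condition="test_fetch passes with a simulated 503",
    estimated_minutes=15,
)
too_big = WorkUnit(
    name="rewrite auth",
    verify_command="",
    risks=["token leakage", "session loss"],
    done_condition="",
    estimated_minutes=120,
)

good_problems = unit_problems(good)
too_big_problems = unit_problems(too_big)
```

A unit with a non-empty problem list is a signal to split it further before assigning it.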

Model Routing

  • Haiku: classification, boilerplate transforms, narrow edits
  • Sonnet: implementation and refactors
  • Opus: architecture, root-cause analysis, multi-file invariants
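
The routing list above can be expressed as a lookup table. This is a sketch under stated assumptions: the task-kind strings, the `Tier` enum, and the `route` helper are hypothetical names, the tier values are labels rather than real model identifiers, and falling back to the middle tier for unknown task kinds is a choice made here, not something the list specifies.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Task kinds mirror the routing list; the key spellings are illustrative.
ROUTES = {
    "classification": Tier.HAIKU,
    "boilerplate_transform": Tier.HAIKU,
    "narrow_edit": Tier.HAIKU,
    "implementation": Tier.SONNET,
    "refactor": Tier.SONNET,
    "architecture": Tier.OPUS,
    "root_cause_analysis": Tier.OPUS,
    "multi_file_invariant": Tier.OPUS,
}


def route(task_kind: str, default: Tier = Tier.SONNET) -> Tier:
    """Pick a model tier for a task kind, using the default for unknown kinds."""
    return ROUTES.get(task_kind, default)
```

Keeping the table as data makes the routing policy easy to review and to adjust as eval results come in.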

Session Strategy

  • Continue the same session for closely coupled units.
  • Start a fresh session after major phase transitions.
  • Compact after milestone completion, not during active debugging.

Review Focus for AI-Generated Code

Prioritize:
  • invariants and edge cases
  • error boundaries
  • security and auth assumptions
  • hidden coupling and rollout risk
Do not spend review cycles on style-only disagreements when automated formatting and linting already enforce style.

Cost Discipline

Track per task:
  • model
  • token estimate
  • retries
  • wall-clock time
  • success/failure
Escalate the model tier only when the lower tier fails with a clear reasoning gap.
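
The tracked fields and the escalation rule can be captured in one record type. The following is a minimal sketch: `TaskRecord`, `next_tier`, and `TIER_ORDER` are hypothetical names, and the `reasoning_gap` flag is assumed to be set by whoever triages the failure, since the text does not say how a reasoning gap is detected.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ordering from cheapest to most capable tier.
TIER_ORDER = ["haiku", "sonnet", "opus"]


@dataclass
class TaskRecord:
    task_id: str
    model: str
    token_estimate: int
    retries: int
    wall_clock_seconds: float
    success: bool
    reasoning_gap: bool = False  # hypothetical flag set during failure triage


def next_tier(record: TaskRecord) -> Optional[str]:
    """Return the tier to escalate to, or None when escalation is not warranted."""
    if record.success or not record.reasoning_gap:
        # Successes and failures without a reasoning gap stay on the same tier.
        return None
    if record.model not in TIER_ORDER:
        return None
    index = TIER_ORDER.index(record.model)
    if index + 1 >= len(TIER_ORDER):
        return None  # already on the highest tier
    return TIER_ORDER[index + 1]


# Hypothetical records for illustration.
flaky = TaskRecord("t1", "haiku", 1200, 2, 40.0, success=False)
gap = TaskRecord("t2", "haiku", 1500, 1, 55.0, success=False, reasoning_gap=True)
top = TaskRecord("t3", "opus", 9000, 0, 300.0, success=False, reasoning_gap=True)
```

Here `next_tier(flaky)` stays put because the failure shows no reasoning gap, while `next_tier(gap)` moves one tier up.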