Agentic Engineering
Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.
Operating Principles
- Define completion criteria before execution.
- Decompose work into agent-sized units.
- Route model tiers by task complexity.
- Measure with evals and regression checks.
Eval-First Loop
- Define capability eval and regression eval.
- Run baseline and capture failure signatures.
- Execute implementation.
- Re-run evals and compare deltas.
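The compare step of the loop above can be sketched as follows. This is a minimal illustration, not a prescribed harness: `EvalResult`, `compare_runs`, and the case names are hypothetical, and a real setup would load results from whatever eval runner is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Outcome of one eval case (hypothetical structure)."""
    case_id: str
    passed: bool
    failure_signature: str = ""  # e.g. an exception class or short message


def compare_runs(baseline: list[EvalResult], after: list[EvalResult]) -> dict[str, list[str]]:
    """Classify each case by how its outcome changed between two runs."""
    before = {r.case_id: r.passed for r in baseline}
    delta: dict[str, list[str]] = {"fixed": [], "regressed": [], "still_failing": []}
    for result in after:
        was_passing = before.get(result.case_id)
        if was_passing is None:
            continue  # case is not in the baseline, so there is no delta to report
        if result.passed and not was_passing:
            delta["fixed"].append(result.case_id)
        elif not result.passed and was_passing:
            delta["regressed"].append(result.case_id)
        elif not result.passed:
            delta["still_failing"].append(result.case_id)
    return delta


# Baseline run, captured before the implementation step (made-up cases).
baseline = [
    EvalResult("parse_empty_input", passed=False, failure_signature="IndexError"),
    EvalResult("parse_unicode", passed=True),
    EvalResult("parse_nested", passed=False, failure_signature="RecursionError"),
]
# Re-run after the implementation step.
after = [
    EvalResult("parse_empty_input", passed=True),
    EvalResult("parse_unicode", passed=False, failure_signature="UnicodeDecodeError"),
    EvalResult("parse_nested", passed=False, failure_signature="RecursionError"),
]
delta = compare_runs(baseline, after)
```

Any entry under `regressed` means the regression eval caught something the change broke, while `fixed` tracks progress on the capability eval.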
Task Decomposition
Apply the 15-minute unit rule:
- each unit should be independently verifiable
- each unit should have a single dominant risk
- each unit should expose a clear done condition
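One way to make the three conditions above checkable is to record them as fields on each unit and reject units that leave any blank. The `WorkUnit` type, its field names, and the example commands are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkUnit:
    """One agent-sized unit of work (hypothetical structure)."""
    name: str
    verify_command: str  # how to check this unit in isolation
    dominant_risk: str   # the single risk a reviewer should focus on
    done_condition: str  # observable condition that marks the unit as done


def unit_problems(unit: WorkUnit) -> list[str]:
    """Return the rule violations for a unit; an empty list means it is well-formed."""
    problems = []
    if not unit.verify_command.strip():
        problems.append("no independent verification step")
    if not unit.dominant_risk.strip():
        problems.append("no dominant risk named")
    if not unit.done_condition.strip():
        problems.append("no done condition")
    return problems


well_formed = WorkUnit(
    name="add retry to fetch_user",
    verify_command="pytest tests/test_fetch_user.py",
    dominant_risk="retries could duplicate non-idempotent requests",
    done_condition="retry tests pass and the existing suite stays green",
)
too_vague = WorkUnit(
    name="clean up the API layer",
    verify_command="",
    dominant_risk="",
    done_condition="",
)
```

A unit like `too_vague` is usually a sign that the work needs to be split further before handing it to an agent.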
Model Routing
- Haiku: classification, boilerplate transforms, narrow edits
- Sonnet: implementation and refactors
- Opus: architecture, root-cause analysis, multi-file invariants
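The routing list above maps naturally onto a lookup table. The sketch below is illustrative: the task-kind keys and the `route` helper are hypothetical names, and raising on an unknown kind (rather than silently picking a default tier) is an assumption, not something the list specifies.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Task kinds are hypothetical labels for the categories listed above.
ROUTES = {
    "classification": Tier.HAIKU,
    "boilerplate_transform": Tier.HAIKU,
    "narrow_edit": Tier.HAIKU,
    "implementation": Tier.SONNET,
    "refactor": Tier.SONNET,
    "architecture": Tier.OPUS,
    "root_cause_analysis": Tier.OPUS,
    "multi_file_invariant": Tier.OPUS,
}


def route(task_kind: str) -> Tier:
    """Return the tier for a task kind, failing loudly on kinds with no route."""
    try:
        return ROUTES[task_kind]
    except KeyError:
        raise ValueError(f"no route defined for task kind: {task_kind!r}") from None
```

Failing on unknown kinds forces each new category of work to get an explicit routing decision.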
Session Strategy
- Continue session for closely-coupled units.
- Start fresh session after major phase transitions.
- Compact after milestone completion, not during active debugging.
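The three rules above can be read as a small decision function. This is a sketch under assumptions: the flag names are hypothetical, and the precedence (phase change first, then compaction, otherwise continue) is one reasonable reading rather than something the rules spell out.

```python
from enum import Enum


class SessionAction(Enum):
    CONTINUE = "continue"
    START_FRESH = "start_fresh"
    COMPACT = "compact"


def next_session_action(
    *, phase_changed: bool, milestone_completed: bool, actively_debugging: bool
) -> SessionAction:
    """Pick what to do with the current session before starting the next unit."""
    if phase_changed:
        # Major phase transition: begin with a clean context.
        return SessionAction.START_FRESH
    if milestone_completed and not actively_debugging:
        # Compaction is allowed only once a milestone is done and no debugging is in flight.
        return SessionAction.COMPACT
    # Closely coupled follow-up work, or debugging still in progress: keep the context.
    return SessionAction.CONTINUE
```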
Review Focus for AI-Generated Code
Prioritize:
- invariants and edge cases
- error boundaries
- security and auth assumptions
- hidden coupling and rollout risk
Do not waste review cycles on style-only disagreements when automated formatting and linting already enforce style.
Cost Discipline
Track per task:
- model
- token estimate
- retries
- wall-clock time
- success/failure
Escalate the model tier only when the lower tier fails with a clear reasoning gap.
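The tracked fields and the escalation rule above could be captured in a record like the one below. `TaskRecord`, `next_tier`, and the `reasoning_gap` flag are hypothetical; in particular, how a failure gets classified as a reasoning gap (human judgment, an eval, or something else) is left open here.

```python
from dataclasses import dataclass

# Tiers ordered from cheapest to most capable, following the routing list.
TIERS = ["haiku", "sonnet", "opus"]


@dataclass
class TaskRecord:
    """Per-task cost record (hypothetical structure)."""
    task_id: str
    model: str
    token_estimate: int
    retries: int
    wall_clock_seconds: float
    succeeded: bool
    reasoning_gap: bool = False  # assumed to be set by whoever triages the failure


def next_tier(record: TaskRecord) -> str:
    """Return the tier for the next attempt; escalate only on a reasoning-gap failure."""
    if record.succeeded or not record.reasoning_gap:
        return record.model  # success, or a failure that a bigger model would not fix
    index = TIERS.index(record.model)
    return TIERS[min(index + 1, len(TIERS) - 1)]  # the top tier has nowhere to go


flaky = TaskRecord("t1", "haiku", 1200, retries=2, wall_clock_seconds=40.0, succeeded=False)
stuck = TaskRecord("t2", "haiku", 3000, retries=1, wall_clock_seconds=95.0,
                   succeeded=False, reasoning_gap=True)
```

Here `flaky` stays on its current tier because nothing indicates a reasoning gap, while `stuck` moves up one tier.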