define-goal

Define Goal

Overview

Shape the user's intent into an objective an agent can pursue honestly. Prefer measurable outcomes, explicit evidence, and bounded scope over activity descriptions.

This skill covers goal definition and goal-tool creation only. Do not create intermediate planning artifacts, durable snapshots, ledgers, decision logs, or resume files from this skill.

Workflow

Confirm that goal definition is actually needed.
- Use this skill when the user asks for `$define-goal`, asks to create or set a goal, asks for the goal tool, or wants help turning an intention into a clear objective.
- If the user only asks for ordinary implementation work, do the work directly instead of forcing goal creation.
Restate the likely goal in concrete terms. A usable goal names:
- the specific outcome that will be true
- the main artifact, system, repo, environment, or user-facing behavior involved
- how completion will be verified
- what is in scope
- what is out of scope when ambiguity would matter
- the stop condition for asking the user instead of grinding
Make it quantitative when the domain supports it. Prefer numbers that represent real success, not decorative precision:
- pass/fail validators: exact tests, checks, CI jobs, evals, commands, or acceptance criteria
- quality thresholds: latency, error rate, cost, accuracy, recall, precision, coverage, flake rate, bundle size, memory, uptime, completion rate, or manual review criteria
- artifact constraints: file paths, affected modules, allowed commands, output formats, target environments, deadlines, or maximum blast radius
- evidence counts: number of reproduced failures, successful reruns, reviewed examples, migrated records, addressed comments, or verified cases
Repair weak goals before setting them.
- Rewrite vague goals into measurable objectives when local context makes the rewrite safe.
- Ask one concise clarification question when the missing detail changes the intended outcome or validation.
- Reject pure activity goals such as "make progress," "keep investigating," "improve things," or "work on X" unless they are sharpened into a verifiable outcome.
Check active goal state before creating a goal.
- Call `get_goal`.
- If there is no active goal and the objective meets the quality bar, call `create_goal`.
- If there is an active goal that still matches the user's intent, continue using it instead of creating a duplicate.
- If there is an active goal that conflicts with the new request, ask whether to finish the current goal, mark it complete if done, or start a separate goal-backed thread.
Create the goal only after it passes the quality bar.
- Use a single concise objective string.
- Include the verification evidence in the objective itself.
- Include scope bounds when they constrain the work.
- Include a token budget only when the user explicitly requested one.
- Do not call `create_goal` for an ordinary multi-step task unless the user explicitly asked for goal-backed work.
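
The goal-state branches in the workflow above can be sketched as a small decision function. This is a minimal illustration, not the goal tool's actual interface: `GoalState`, `next_action`, and the returned labels are hypothetical names, and whether an active goal "matches intent" is assumed to be a judgment the agent makes rather than a value returned by `get_goal`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GoalState:
    """Hypothetical stand-in for whatever get_goal reports about the active goal."""
    objective: str
    matches_intent: bool  # assumed to be the agent's own judgment


def next_action(active: Optional[GoalState],
                meets_quality_bar: bool,
                user_asked_for_goal: bool) -> str:
    """Return a label for the workflow branch that applies (labels are illustrative)."""
    if not user_asked_for_goal:
        # Ordinary implementation work: do it directly, no goal creation.
        return "do_work_directly"
    if active is None:
        # No active goal: create one only after the objective passes the quality bar.
        return "create_goal" if meets_quality_bar else "repair_or_clarify"
    if active.matches_intent:
        # Existing goal still fits: keep using it instead of creating a duplicate.
        return "continue_existing"
    # Existing goal conflicts with the new request: let the user decide.
    return "ask_user"
```

Keeping the "user asked for a goal" check first mirrors the rule that ordinary multi-step tasks should not trigger `create_goal`.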

Goal Quality Bar

Before calling `create_goal`, the objective should answer:

What concrete thing will be true when this is done?
What evidence will prove it?
What quantitative or binary threshold defines success?
What scope boundaries matter?
What should cause the agent to stop and ask?

Good:

Reduce checkout API p95 latency below 250 ms for the documented slow path by making the smallest safe server-side change, then verify with `npm run test:checkout` and the existing local latency benchmark showing p95 under 250 ms across 3 consecutive runs.

Good:

Resolve the open review comments on PR 123 that request code changes, update only the affected auth files and tests, and verify with the targeted auth test command plus `gh pr view 123` showing no unresolved change-request threads.

Weak:

Make checkout faster.

Weak:

Keep investigating the PR comments.
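
As a rough illustration of the quality bar, the sketch below flags the most obvious gaps in an objective string. It is a keyword heuristic under stated assumptions, not part of the goal tool: `quality_bar_gaps`, the phrase list, and the regular expressions are hypothetical, and a real check would still need human or agent judgment about scope and stop conditions.

```python
import re
from typing import List

# Activity phrasings the skill says to reject unless sharpened into an outcome.
ACTIVITY_PHRASES = ("make progress", "keep investigating", "improve things", "work on")


def quality_bar_gaps(objective: str) -> List[str]:
    """Return a list of quality-bar gaps found by simple keyword checks."""
    text = objective.lower()
    gaps = []
    if any(phrase in text for phrase in ACTIVITY_PHRASES):
        gaps.append("activity phrasing instead of an outcome")
    # Assumption: a verification clause uses one of these words.
    if not re.search(r"\b(verify|verified|showing|passes|passing)\b", text):
        gaps.append("no verification evidence named")
    # Assumption: a threshold shows up as a number or a binary word.
    if not re.search(r"\d", text) and not re.search(r"\b(no|all|every|zero)\b", text):
        gaps.append("no quantitative or binary threshold")
    return gaps


print(quality_bar_gaps("Make checkout faster."))
```

An empty list only means no obvious gap was detected; it does not prove the objective is well scoped.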

Quantification Heuristics

For bugs, define success as reproduction first, fix second, and a failing-then-passing validator when possible.
For tests, name the exact command and required pass condition.
For performance, name the metric, target threshold, measurement method, and number of runs.
For quality work, define an observable acceptance bar such as reviewed examples, lint/typecheck/test pass, or user-approved artifact.
For research, define the decision the research must enable, the sources or systems in scope, and the evidence standard.
For operations, define healthy state, monitoring window, failure threshold, and rollback or escalation trigger.
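
For the performance heuristic, a validator might look like the sketch below: it names the metric (p95), the threshold, the measurement method (nearest-rank percentile), and the number of runs. The function names, the nearest-rank choice, and the rule of checking the most recent runs are assumptions made for illustration; an existing benchmark may compute p95 differently.

```python
from typing import Sequence


def p95(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of latency samples in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) using integer arithmetic to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_goal(runs: Sequence[Sequence[float]],
                       threshold_ms: float = 250.0,
                       required_runs: int = 3) -> bool:
    """True only if each of the last `required_runs` runs has p95 under the threshold."""
    if len(runs) < required_runs:
        return False  # not enough evidence yet
    recent = runs[-required_runs:]
    return all(p95(run) < threshold_ms for run in recent)
```

Returning `False` when there are too few runs keeps the validator from reporting success on incomplete evidence.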

Clarifying Questions

Ask only when a reasonable rewrite would risk pursuing the wrong outcome. Keep the question short and oriented around the missing validator or scope boundary.

Useful question shapes:

"What metric should define success here: latency, cost, accuracy, or user-visible behavior?"
"Which environment should I verify against: local, staging, or production?"
"What is the minimum evidence you want before I mark this goal complete?"

If the user cannot provide a metric, propose the most honest binary validator available and ask for confirmation.
