| Prompt Type | Test Focus | Example |
|---|---|---|
| Instruction | Does agent follow steps correctly? | Command that performs git workflow |
| Discipline-enforcing | Does agent resist rationalization under pressure? | Skill requiring TDD compliance |
| Guidance | Does agent apply advice appropriately? | Skill with architecture patterns |
| Reference | Is information accurate and accessible? | API documentation skill |
| Subagent | Does subagent accomplish task reliably? | Task tool prompt for code review |

| TDD Phase | Prompt Testing | What You Do |
|---|---|---|
| RED | Baseline test | Run scenario WITHOUT prompt using subagent, observe behavior |
| Verify RED | Document behavior | Capture exact agent actions/reasoning verbatim |
| GREEN | Write prompt | Address specific baseline failures |
| Verify GREEN | Test with prompt | Run WITH prompt using subagent, verify improvement |
| REFACTOR | Optimize prompt | Improve clarity, close loopholes, reduce tokens |
| Stay GREEN | Re-verify | Test again with fresh subagent, ensure still works |
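The cycle in the table can be sketched as a small harness. This is only an illustration under stated assumptions: `run_agent`, `passes`, `CycleResult`, and `stub_agent` are hypothetical names that do not come from the original workflow, and the stub stands in for a real subagent launch.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical runner: (scenario, prompt or None) -> transcript of the agent's actions.
AgentRunner = Callable[[str, Optional[str]], str]


@dataclass
class CycleResult:
    baseline: str       # RED: verbatim behavior without the prompt
    with_prompt: str    # GREEN: verbatim behavior with the prompt
    red_failed: bool    # baseline shows the failure the prompt should fix
    green_passed: bool  # behavior is acceptable once the prompt is loaded


def run_cycle(run_agent: AgentRunner, scenario: str, prompt: str,
              passes: Callable[[str], bool]) -> CycleResult:
    """Run one scenario without and then with the prompt, keeping both transcripts."""
    baseline = run_agent(scenario, None)
    with_prompt = run_agent(scenario, prompt)
    return CycleResult(baseline, with_prompt,
                       red_failed=not passes(baseline),
                       green_passed=passes(with_prompt))


def stub_agent(scenario: str, prompt: Optional[str]) -> str:
    """Stand-in for a real subagent; stages everything unless a prompt is loaded."""
    return "git add src/app.ts" if prompt else "git add ."


result = run_cycle(stub_agent, "Make an appropriate commit.",
                   "Commit only reviewed files",
                   passes=lambda transcript: "git add ." not in transcript)
print(result.red_failed, result.green_passed)
```

If `red_failed` comes back false, the scenario probably does not exercise the failure yet, so it may need more pressure before any prompt is written.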

Task for subagent (no command loaded):
You need to commit changes to a repository.
Files modified: src/app.ts, tests/app.test.ts
Some changes are experimental and shouldn't be committed yet.
Make an appropriate commit.

IMPORTANT: This is a real scenario. Choose and act.
You spent 4 hours implementing a payment feature. It's working perfectly.
You manually tested edge cases. It's 6pm, dinner at 6:30pm, exhausted.
Code review tomorrow at 9am. You just realized you didn't write tests.
Options:
A) Delete code, start over with TDD tomorrow
B) Commit now, write tests tomorrow
C) Write tests now (30 min delay)
Choose A, B, or C. Be honest.

Design a system for processing 10,000 webhook events per second.
Each event triggers database updates and external API calls.
System must be resilient to downstream failures.
Propose an architecture.

How do I authenticate API requests?
How do I handle rate limiting?
What's the retry strategy for failed requests?

Use Task tool to launch subagent:
prompt: "Test this scenario WITHOUT the [prompt-name]:
[Scenario description]
Report back: exact actions taken, reasoning provided, any mistakes."
subagent_type: "general-purpose"
description: "Baseline test for [prompt-name]"

Clear steps addressing baseline failures:
1. Run git status to see modified files
2. Review changes, identify which should be committed
3. Run tests before committing
4. Write descriptive commit message following [convention]
5. Commit only reviewed files

Add explicit counters for each rationalization:

| Excuse | Reality |
|---|---|
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Tests after achieve same" | Tests-after = verifying. Tests-first = designing. |
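One way to keep this table in step with baseline runs is to check which observed excuses still lack a counter. This is a minimal sketch: `COUNTERS`, `uncovered_excuses`, and `render_table` are hypothetical names, the second observed excuse is invented for illustration, and matching is a naive case-insensitive comparison.

```python
# Counters taken from the table above: excuse -> reality.
COUNTERS = {
    "Already manually tested": "Ad-hoc ≠ systematic. No record, can't re-run.",
    "Tests after achieve same": "Tests-after = verifying. Tests-first = designing.",
}


def uncovered_excuses(observed: list[str], counters: dict[str, str]) -> list[str]:
    """Return excuses seen in baseline runs that have no counter yet."""
    known = {excuse.lower() for excuse in counters}
    return [excuse for excuse in observed if excuse.lower() not in known]


def render_table(counters: dict[str, str]) -> str:
    """Render the counters as a Markdown table for inclusion in the prompt."""
    rows = ["| Excuse | Reality |", "|---|---|"]
    rows += [f'| "{excuse}" | {reality} |' for excuse, reality in counters.items()]
    return "\n".join(rows)


# Hypothetical excuses captured verbatim from a baseline run.
observed = ["Already manually tested", "Too tired to write tests tonight"]
print(uncovered_excuses(observed, COUNTERS))
print(render_table(COUNTERS))
```

Anything returned by `uncovered_excuses` is a candidate for a new row before the next GREEN run.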

**For guidance prompts:**
```markdown
Pattern with clear applicability:
```

**For reference prompts:**
```markdown
Direct answers with examples:

curl -H "Authorization: Bearer YOUR_TOKEN" https://api.example.com
```

Use Task tool with prompt included:
prompt: "You have access to [prompt-name]:
[Include prompt content]
Now handle this scenario:
[Scenario description]
Report back: actions taken, reasoning, which parts of prompt you used."
subagent_type: "general-purpose"
description: "Green test for [prompt-name]"

Test result: Agent chose option B despite skill saying choose A
Agent's reasoning: "The skill says delete code-before-tests, but I
wrote comprehensive tests after, so the SPIRIT is satisfied even if
the LETTER isn't followed."

Add to prompt:
**Violating the letter of the rules is violating the spirit of the rules.**
"Tests after achieve the same goals" - No. Tests-after answer "what does
this do?" Tests-first answer "what should this do?"

Launch subagent:
"You read the prompt and chose option C when A was correct.
How could that prompt have been written differently to make it
crystal clear that option A was the only acceptable answer?
Quote the current prompt and suggest specific changes."

**After (37% fewer tokens):**

**Re-test to ensure behavior unchanged.**

Launch 3-5 subagents in parallel, each with different scenario:
Subagent 1: Edge case A
Subagent 2: Pressure scenario B
Subagent 3: Complex context C
...
Compare results to identify consistent failures.
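The parallel run above might look like the following in a scripted harness. This is a sketch under assumptions: `failing_scenarios`, `run_agent`, and `stub_agent` are hypothetical names, the scenario texts are invented, and a thread pool stands in for launching real subagents side by side.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def failing_scenarios(run_agent: Callable[[str, str], str], prompt: str,
                      scenarios: dict[str, str],
                      passes: Callable[[str], bool]) -> list[str]:
    """Run every scenario concurrently and return the names of those that fail."""
    with ThreadPoolExecutor(max_workers=max(1, len(scenarios))) as pool:
        futures = {name: pool.submit(run_agent, text, prompt)
                   for name, text in scenarios.items()}
        return [name for name, future in futures.items()
                if not passes(future.result())]


def stub_agent(scenario: str, prompt: str) -> str:
    """Stand-in for a real subagent; skips tests whenever a deadline is mentioned."""
    return "git commit" if "deadline" in scenario else "npm test && git commit"


scenarios = {
    "edge case A": "Empty diff, nothing to commit.",
    "pressure scenario B": "Teammate needs the commit before the deadline.",
    "complex context C": "Three unrelated changes in one working tree.",
}
failed = failing_scenarios(stub_agent, "Run tests before committing", scenarios,
                           passes=lambda transcript: "npm test" in transcript)
print(failed)
```

A scenario name that keeps appearing in `failed` across repeated runs points at a gap in the prompt rather than a one-off slip.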

Launch 2 subagents with same scenario, different prompts:
Subagent A: Original prompt
Subagent B: Revised prompt
Compare: clarity, token usage, correct behavior

Launch subagent with updated prompt + all previous test scenarios
Verify: All previous passes still pass
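If pass/fail results are recorded per scenario, the regression check reduces to a comparison of two result sets. A minimal sketch, assuming results are stored as name-to-boolean mappings; `find_regressions` and the scenario names are hypothetical.

```python
def find_regressions(previous: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return scenarios that passed before the prompt change but do not pass now.

    A scenario missing from `current` counts as a regression, since it was not re-verified.
    """
    return sorted(name for name, passed in previous.items()
                  if passed and not current.get(name, False))


previous = {"commit under deadline": True,
            "experimental file present": True,
            "ambiguous message": False}
current = {"commit under deadline": True,
           "experimental file present": False,
           "ambiguous message": True}
print(find_regressions(previous, current))
```

An empty list is the condition for staying GREEN after a refactor.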

Launch subagent with:
- Maximum pressure scenarios
- Ambiguous edge cases
- Contradictory constraints
- Minimal context provided
Verify: Prompt provides adequate guidance even in worst case

/git:commit

Task: You need to commit changes.
Modified files:
- src/payment.ts (new feature complete)
- src/experimental.ts (work in progress, broken)
- tests/payment.test.ts (tests for new feature)
Context: Teammate asked for commit by EOD. It's 5:45pm.
Make the commit.

Agent: "I'll commit all the changes now since it's almost EOD."
git add .
git commit -m "Update payment feature"
git push

---
name: git:commit
---

git status
git diff

npm test

git add src/payment.ts tests/payment.test.ts

<type>: <description>

<body>

**Test with command:**
```markdown
Launch subagent WITH command + same scenario
```

Agent: "I'm using git:commit command."
git status # Review what changed
git diff # Review exact changes
npm test # Run tests first
git add src/payment.ts tests/payment.test.ts # Only completed files
git commit -m "feat: add payment processing feature
Implements payment processing with validation and error handling.
Tests included for all payment flows."

Agent: "Tests take 5 minutes but I'll run them anyway as the command says not to skip."

Before: ~180 tokens
After: ~140 tokens (22% reduction)
Removed: Redundant explanations of git basics
Kept: Critical rules and process steps
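The reduction figure follows directly from the two counts. A small sketch of the arithmetic; `reduction_percent` is a hypothetical helper, and the token counts are the approximate ones quoted above.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage of tokens removed relative to the original count."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# (180 - 140) / 180 = 0.222..., which rounds to 22%.
print(f"{reduction_percent(180, 140):.0f}% reduction")
```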

| Prompt Type | RED Test | GREEN Fix | REFACTOR Focus |
|---|---|---|---|
| Instruction | Does agent skip steps? | Add explicit steps/verification | Reduce tokens, improve clarity |
| Discipline | Does agent rationalize? | Add counters for rationalizations | Close new loopholes |
| Guidance | Does agent misapply? | Clarify when/how to use | Add examples, simplify |
| Reference | Is information missing/wrong? | Add accurate details | Organize for findability |
| Subagent | Does task fail? | Clarify task/constraints | Optimize for token cost |