resilient-execution

Overview

The resilient-execution skill prevents premature failure by enforcing a minimum of 3 genuinely different approaches before escalating to the user. It provides a structured error classification system, an approach cascade methodology, and transparent logging of each attempt. Without this skill, agents give up too early — with it, they systematically exhaust alternatives and only escalate with full evidence.

Announce at start: "I'm using the resilient-execution skill — I will try multiple approaches before escalating."

Phase 1: Error Classification

When an approach fails, immediately classify the error before retrying:

Error Type	Definition	Indicators	Correct Response
Transient	Temporary infrastructure failure	Network timeout, rate limit, 503 error, lock contention	Wait briefly, retry the same approach
Environmental	Missing or misconfigured dependency	Module not found, wrong version, missing env var, permission denied	Fix the environment, then retry same approach
Logical	Wrong approach or incorrect assumption	Wrong output, unexpected behavior, type mismatch, wrong API usage	Rethink the approach entirely
Fundamental	Genuinely impossible with available tools	API does not exist, hardware limitation, missing capability	Escalate to user with evidence

<HARD-GATE> You MUST try at least 3 different approaches before telling the user something cannot be done. "I tried and it didn't work" is not acceptable without evidence of 3 genuine attempts with meaningfully different strategies. </HARD-GATE>

STOP: Classify the error before choosing your next approach. Wrong classification leads to wasted retries.
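
As a rough illustration of the classification table, the sketch below maps an error message to one of the four types and looks up the matching response. The keyword lists and the names `ErrorType`, `INDICATORS`, `RESPONSES`, and `classify_error` are assumptions made for this example; substring matching is only a stand-in for the judgment the skill actually asks for.

```python
from enum import Enum


class ErrorType(Enum):
    TRANSIENT = "Transient"
    ENVIRONMENTAL = "Environmental"
    LOGICAL = "Logical"
    FUNDAMENTAL = "Fundamental"


# Hypothetical keyword lists, loosely derived from the Indicators column.
# A real agent would need richer signals than substring matching.
INDICATORS = {
    ErrorType.TRANSIENT: ("timeout", "timed out", "rate limit", "503", "lock contention"),
    ErrorType.ENVIRONMENTAL: ("module not found", "no module named", "wrong version",
                              "missing env", "permission denied"),
    ErrorType.FUNDAMENTAL: ("api does not exist", "hardware limitation", "missing capability"),
}

# The Correct Response column, keyed by error type.
RESPONSES = {
    ErrorType.TRANSIENT: "Wait briefly, retry the same approach",
    ErrorType.ENVIRONMENTAL: "Fix the environment, then retry same approach",
    ErrorType.LOGICAL: "Rethink the approach entirely",
    ErrorType.FUNDAMENTAL: "Escalate to user with evidence",
}


def classify_error(message: str) -> ErrorType:
    """Return the first error type whose keywords appear in the message."""
    text = message.lower()
    for error_type, keywords in INDICATORS.items():
        if any(keyword in text for keyword in keywords):
            return error_type
    # Assumption: anything unrecognized is treated as Logical (rethink the approach).
    return ErrorType.LOGICAL
```

Defaulting to Logical is a design choice of this sketch, not something the skill prescribes; it errs toward rethinking the approach rather than blindly retrying.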

Phase 2: Approach Cascade

Execute the cascade systematically. Each attempt must be a genuinely different strategy.

Attempt 1: Primary approach (most direct solution)
    | fails
    v
Classify error -> Can same approach work with a fix?
    | YES -> Fix and retry (does NOT count as a new attempt)
    | NO  -> Proceed to Attempt 2
    v
Attempt 2: Alternative approach 1 (different technique)
    | fails
    v
Classify error -> Is this fundamentally blocked?
    | YES -> Proceed directly to escalation
    | NO  -> Proceed to Attempt 3
    v
Attempt 3: Alternative approach 2 (different path entirely)
    | fails
    v
Circuit breaker -> Present findings to user with full evidence
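
The cascade above could be sketched roughly as follows. `Approach`, `run_cascade`, and the string labels returned by `classify` are hypothetical names chosen for this example; the point is only that a fix followed by a retry stays inside the same attempt number, while moving to the next approach increments it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Approach:
    name: str
    run: Callable[[], str]  # returns a result or raises on failure
    fix: Optional[Callable[[], None]] = None  # optional repair before one retry


def run_cascade(
    approaches: List[Approach],
    classify: Callable[[Exception], str],
) -> Tuple[Optional[str], List[Tuple[int, str, str, str]]]:
    """Run approaches in order; return (result, failure log)."""
    log = []
    for number, approach in enumerate(approaches, start=1):
        try:
            return approach.run(), log
        except Exception as first_error:
            failure, kind = first_error, classify(first_error)
        # Fix and retry: same approach, so the attempt number does not change.
        if kind in ("Transient", "Environmental") and approach.fix is not None:
            approach.fix()
            try:
                return approach.run(), log
            except Exception as retry_error:
                failure, kind = retry_error, classify(retry_error)
        log.append((number, approach.name, kind, str(failure)))
        # From Attempt 2 onward, a fundamental block goes straight to escalation.
        if kind == "Fundamental" and number >= 2:
            break
    # Circuit breaker: nothing worked, so hand the evidence back to the caller.
    return None, log
```

A `None` result means the caller should present the log to the user rather than keep retrying.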

For Each Attempt, Log:

Attempt N: [Approach Name]

Strategy: [what makes this different from previous attempts]
What I tried: [specific description with commands/code]
What happened: [exact error or unexpected result]
Why it failed: [root cause analysis]
Classification: [Transient / Environmental / Logical / Fundamental]
What to try next: [reasoning for next approach]


> **STOP: Log every attempt before moving to the next. Do NOT skip logging — it is evidence for the escalation report.**
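
One way to keep these log entries consistent is to hold them in a small record type and render the template from it. The `AttemptLog` class and its field names below are assumptions made for illustration; only the rendered labels come from the template above.

```python
from dataclasses import dataclass


@dataclass
class AttemptLog:
    number: int
    approach: str
    strategy: str
    tried: str
    happened: str
    why_failed: str
    classification: str  # Transient / Environmental / Logical / Fundamental
    next_step: str

    def render(self) -> str:
        """Render the entry using the field labels from the log template."""
        return "\n".join([
            f"Attempt {self.number}: {self.approach}",
            "",
            f"Strategy: {self.strategy}",
            f"What I tried: {self.tried}",
            f"What happened: {self.happened}",
            f"Why it failed: {self.why_failed}",
            f"Classification: {self.classification}",
            f"What to try next: {self.next_step}",
        ])
```

Because every field is required, an entry cannot be created with a missing classification or root cause, which is the discipline the STOP rule asks for.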

---


Phase 3: Alternative Approach Selection

When the primary approach fails, select the next approach using this decision table:

Failure Type	Strategy 1	Strategy 2	Strategy 3
Library/API does not work	Different library	Direct implementation (no library)	Shell command / external tool
Algorithm produces wrong result	Different algorithm	Decompose into smaller steps	Simplify constraints, solve easier version
Permission/access denied	Different access method	Escalate with manual steps	Work around via alternative path
Tool limitation	Different tool	Combine multiple tools	Provide manual instructions
Integration failure	Mock the dependency	Use alternative interface	Isolate and test components separately
Performance issue	Different data structure	Batch/stream processing	Approximate solution
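
The decision table lends itself to a simple lookup. The sketch below copies the rows into a dictionary and returns the first strategy that has not been tried yet; `DECISION_TABLE` and `next_strategy` are hypothetical names, and a real agent would still need to judge which row a failure belongs to.

```python
from typing import Iterable, Optional

# Rows copied from the decision table: failure type -> strategies in order.
DECISION_TABLE = {
    "Library/API does not work": (
        "Different library",
        "Direct implementation (no library)",
        "Shell command / external tool",
    ),
    "Algorithm produces wrong result": (
        "Different algorithm",
        "Decompose into smaller steps",
        "Simplify constraints, solve easier version",
    ),
    "Permission/access denied": (
        "Different access method",
        "Escalate with manual steps",
        "Work around via alternative path",
    ),
    "Tool limitation": (
        "Different tool",
        "Combine multiple tools",
        "Provide manual instructions",
    ),
    "Integration failure": (
        "Mock the dependency",
        "Use alternative interface",
        "Isolate and test components separately",
    ),
    "Performance issue": (
        "Different data structure",
        "Batch/stream processing",
        "Approximate solution",
    ),
}


def next_strategy(failure_type: str, tried: Iterable[str]) -> Optional[str]:
    """Return the first untried strategy for this failure type, or None."""
    tried_set = set(tried)
    for strategy in DECISION_TABLE[failure_type]:
        if strategy not in tried_set:
            return strategy
    return None  # every listed strategy has been used
```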

Alternative Strategy Hierarchy

Try these in order of preference:

Different tool — use a different library, API, or command
Different algorithm — solve the same problem a different way
Decompose — break the problem into smaller, solvable parts
Simplify — remove constraints and solve a simpler version first
Work around — achieve the goal through a different path entirely
Manual steps — provide clear instructions the user can follow themselves

Phase 4: Escalation Report

After 3 genuine attempts with different approaches, produce this report:

Execution Report

I tried 3 different approaches to [goal]:

Attempt 1: [Approach Name]

Strategy: [description]
Result: Failed because [specific reason]
Error: [exact error message or unexpected output]

Attempt 2: [Approach Name]

Strategy: [description]
Result: Failed because [specific reason]
Error: [exact error message or unexpected output]

Attempt 3: [Approach Name]

Strategy: [description]
Result: Failed because [specific reason]
Error: [exact error message or unexpected output]

Root Cause Analysis

[Why all three approaches failed; identify the common blocker]

Recommended Next Steps

Option A: [what the user could try]
Option B: [alternative path]
Option C: [if applicable]

What I Need From You to Proceed

[Specific ask: access, information, permission, or decision]


> **STOP: Do NOT escalate without this report. The user needs evidence that 3 genuine attempts were made.**
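
A report builder can enforce this rule mechanically by refusing to produce output for fewer than three attempts. The sketch below is one possible shape, with `Attempt` and `build_report` as hypothetical names; it cannot check that the attempts were genuinely different, only that three were recorded.

```python
from typing import List, NamedTuple


class Attempt(NamedTuple):
    name: str
    strategy: str
    reason: str
    error: str


def build_report(goal: str, attempts: List[Attempt], root_cause: str,
                 options: List[str], ask: str) -> str:
    """Render the escalation report; refuse if fewer than 3 attempts were logged."""
    if len(attempts) < 3:
        raise ValueError("an escalation report needs at least 3 attempts")
    lines = ["Execution Report", "",
             f"I tried {len(attempts)} different approaches to {goal}:", ""]
    for number, attempt in enumerate(attempts, start=1):
        lines += [
            f"Attempt {number}: {attempt.name}",
            "",
            f"Strategy: {attempt.strategy}",
            f"Result: Failed because {attempt.reason}",
            f"Error: {attempt.error}",
            "",
        ]
    lines += ["Root Cause Analysis", "", root_cause, "",
              "Recommended Next Steps", ""]
    # Label options A, B, C, ... in the order given.
    lines += [f"Option {chr(ord('A') + i)}: {option}" for i, option in enumerate(options)]
    lines += ["", "What I Need From You to Proceed", "", ask]
    return "\n".join(lines)
```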

---

Decision Table: When Retries Count as "Genuine"

Counts as Genuine Attempt	Does NOT Count
Different library or tool	Same library with different import
Different algorithm or data structure	Same algorithm with tweaked parameters
Different architectural approach	Same approach with minor code changes
Manual workaround vs automated	Same automation with retry loop
Breaking problem into sub-problems	Same monolithic approach with logging added
Using an entirely different API	Same API with different authentication method (unless auth was the error)
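
The distinction in this table can be approximated by separating what an attempt is from how it was tuned. In the sketch below, which assumes a hypothetical `Attempt` record with an `approach` identity and a free-form `detail`, only distinct approaches count; the exception for authentication errors is not modeled.

```python
from typing import List, NamedTuple


class Attempt(NamedTuple):
    approach: str  # identity of the library, algorithm, or path used
    detail: str    # parameters, imports, logging, and other minor variations


def count_genuine(attempts: List[Attempt]) -> int:
    """Count distinct approaches; variations in detail do not add to the count."""
    return len({attempt.approach for attempt in attempts})


def may_escalate(attempts: List[Attempt], minimum: int = 3) -> bool:
    """True only once the minimum number of genuine attempts has been made."""
    return count_genuine(attempts) >= minimum
```

For example, three runs of the same regex-based approach with different patterns count as one genuine attempt, so `may_escalate` stays false until two more distinct approaches are logged.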

Anti-Patterns / Common Mistakes

What NOT to Do	Why It Fails	What to Do Instead
Retry the same approach 3 times and call it "3 attempts"	Same approach = same failure. Not genuine alternatives.	Each attempt must use a meaningfully different strategy
Give up after 1 failure	Misses 2+ viable approaches	Always try at least 3 genuinely different approaches
Skip error classification	Without classification, you retry wrong things	Classify BEFORE choosing next approach
Hide failed attempts from the user	User cannot help without context	Log and report every attempt transparently
Escalate without trying manual workaround	Many things that fail in automation work manually	Always consider manual steps as Approach 3
Blame the platform without investigation	"Platform limitation" is often wrong	Search for workarounds before declaring impossible
Fix environment issues and count as new attempt	Fixing env + retrying same approach is 1 attempt	Only count genuinely different strategies
Skip logging intermediate attempts	Loses evidence trail, cannot produce escalation report	Log every attempt immediately

Anti-Rationalization Guards

Thought	Reality
"This genuinely cannot be done"	Have you tried 3 different approaches? Probably not.
"The error is clear, I know what is wrong"	Clear errors can have hidden root causes. Investigate.
"I have already tried everything"	List what you tried. There are always more options.
"The user should fix this themselves"	Provide a manual path, but try 3 approaches first.
"This is a platform limitation"	Limitations often have workarounds. Search for them.
"The same error keeps happening"	The same error across different approaches usually points to a shared root cause you have not identified yet. Reclassify.
"This is taking too long"	Giving up takes longer when the user has to start over.
"A simpler version would not be useful"	A working simple version beats a broken complex one.

Do NOT escalate without 3 genuine attempts. Period.

Integration Points

Skill	Relationship
`circuit-breaker`	Activated after resilient-execution exhausts retries at the loop level
`task-management`	Invokes resilient-execution when a task step fails
`self-learning`	Records failure patterns to avoid repeating them in future sessions
`planning`	Uses failure history to choose more robust approaches
`auto-improvement`	Tracks retry success rates and approach effectiveness
`verification-before-completion`	Invokes resilient-execution if verification fails

Concrete Examples

Example: File Parsing Failure

Attempt 1: JSON.parse() on the file
  Result: SyntaxError — file contains comments (JSONC format)
  Classification: Logical — wrong parser for this format

Attempt 2: Strip comments with regex, then JSON.parse()
  Result: Failed — nested block comments not handled
  Classification: Logical — regex too simple for comment stripping

Attempt 3: Use `jsonc-parser` library (handles JSONC natively)
  Result: Success — file parsed correctly

Example: API Integration Failure

Attempt 1: Direct HTTP request to API endpoint
  Result: 403 Forbidden — authentication required
  Classification: Environmental — missing auth config

  Fix: Add API key from .env
  Result: 429 Too Many Requests — rate limited
  Classification: Transient — wait and retry
  Result: 200 OK but response format changed from docs
  Classification: Logical — API version mismatch

Attempt 2: Use official SDK instead of raw HTTP
  Result: SDK throws "unsupported region" error
  Classification: Environmental — region config needed

Attempt 3: Use GraphQL endpoint instead of REST
  Result: Success — GraphQL endpoint supports all regions
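
The 429 step inside Attempt 1 is the Transient case: wait, then retry the same request. A minimal sketch of that wait-and-retry loop is shown below, assuming a hypothetical `TransientError` exception that the caller raises for failures it has classified as Transient. The delays and retry count are illustrative defaults, not values taken from the skill.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures classified as Transient (e.g. HTTP 429)."""


def retry_transient(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry the same operation with exponential backoff on TransientError."""
    for retry in range(max_retries):
        try:
            return operation()
        except TransientError:
            sleep(base_delay * 2 ** retry)  # 0.5s, 1s, 2s with the defaults
    # Final try: let any exception propagate so the caller can reclassify it.
    return operation()
```

These retries all belong to the same attempt. Only errors other than `TransientError`, or a transient error that survives every retry, reach the caller for reclassification.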

Key Principles

Never give up silently — always show what was tried
Genuine alternatives — each attempt must be a meaningfully different approach, not the same thing with minor tweaks
Root cause analysis — understand WHY before trying the next approach
Learn from failure — update memory with what did not work and why
Transparent — show the user your reasoning at each step
Classify first — error type determines whether to retry same approach or try a new one

Skill Type

RIGID — The 3-attempt minimum is a HARD-GATE. Error classification is mandatory before each retry. The escalation report format must be followed exactly. Do not relax these requirements regardless of perceived simplicity.
