cascade-workflow
Cascade Workflow with Graceful Degradation Skill
Purpose
Implement graceful degradation through cascading fallback strategies. When optimal approaches fail or timeout, the system automatically falls back to simpler, more reliable alternatives while maintaining acceptable functionality.
When to Use This Skill
USE FOR:
- External service dependencies (APIs, databases)
- Time-sensitive operations with acceptable degraded modes
- Operations where partial results are better than no results
- High-availability requirements (system must always respond)
- Scenarios where waiting for a perfect solution is worse than delivering a good-enough one
AVOID FOR:
- Operations requiring exact correctness (no acceptable degradation)
- Security-critical operations (authentication, authorization)
- Financial transactions (no room for "approximate")
- Operations where failures must surface to the user (diagnostic operations)
- Simple operations with no meaningful fallback
Configuration
Core Parameters
Timeout Strategy:
- aggressive: Fast failures, quick degradation (5s / 2s / 1s)
- balanced: Reasonable attempts (30s / 10s / 5s) (DEFAULT)
- patient: Thorough attempts before fallback (120s / 30s / 10s)
- custom: Define your own timeouts
Fallback Types:
- service: External API → Cached data → Static defaults
- quality: Comprehensive → Standard → Minimal analysis
- freshness: Real-time → Recent → Historical data
- completeness: Full dataset → Sample → Summary
- accuracy: Precise → Approximate → Estimate
Degradation Notification:
- silent: Log only, no user notification
- warning: Inform user of degradation
- explicit: Detailed explanation of what degraded and why
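The parameters above could be captured in a small configuration object. The following is a minimal sketch, not the skill's actual implementation: the names `TimeoutStrategy`, `Notification`, `TIMEOUT_PRESETS`, and `CascadeConfig` are hypothetical, and the default notification level is an assumption because the source states a default only for the timeout strategy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class TimeoutStrategy(Enum):
    AGGRESSIVE = "aggressive"
    BALANCED = "balanced"
    PATIENT = "patient"
    CUSTOM = "custom"


class Notification(Enum):
    SILENT = "silent"
    WARNING = "warning"
    EXPLICIT = "explicit"


# (primary, secondary, tertiary) timeouts in seconds, copied from the list above.
TIMEOUT_PRESETS = {
    TimeoutStrategy.AGGRESSIVE: (5.0, 2.0, 1.0),
    TimeoutStrategy.BALANCED: (30.0, 10.0, 5.0),
    TimeoutStrategy.PATIENT: (120.0, 30.0, 10.0),
}


@dataclass(frozen=True)
class CascadeConfig:
    strategy: TimeoutStrategy = TimeoutStrategy.BALANCED  # documented default
    fallback_type: str = "service"                        # one of the five fallback types
    notification: Notification = Notification.WARNING     # assumed default, not from the source
    custom_timeouts: Optional[Tuple[float, float, float]] = None

    def timeouts(self) -> Tuple[float, float, float]:
        """Return the (primary, secondary, tertiary) timeouts for this config."""
        if self.strategy is TimeoutStrategy.CUSTOM:
            if self.custom_timeouts is None:
                raise ValueError("custom strategy requires custom_timeouts")
            return self.custom_timeouts
        return TIMEOUT_PRESETS[self.strategy]
```

Keeping the presets in one table makes it harder for the three timeouts of a strategy to drift apart when they are tuned later.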
Cascade Level Requirements
PRIMARY (Optimal):
- Best possible outcome
- May depend on external services
- May be slow or resource-intensive
- Can fail or timeout
SECONDARY (Acceptable):
- Reduced quality but functional
- More reliable than primary
- Faster or fewer dependencies
- Acceptable for users
TERTIARY (Guaranteed):
- Always succeeds, never fails
- No external dependencies
- Fast and reliable
- Minimal but functional
- CRITICAL: Must be designed to never fail
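One way to approach the "never fail" requirement for the tertiary level is to wrap it so that any exception is replaced by a static default. This is only a sketch under that assumption: `guaranteed` is a hypothetical helper, and it protects against exceptions but not against a tertiary function that hangs.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("cascade")


def guaranteed(compute: Callable[[], T], static_default: T) -> Callable[[], T]:
    """Wrap a tertiary function so that any exception yields static_default."""

    def run() -> T:
        try:
            return compute()
        except Exception:
            # A failing tertiary is a bug, so record it loudly, but still answer.
            log.exception("tertiary raised; returning static default")
            return static_default

    return run
```

The wrapped function should itself avoid network calls and disk access; the wrapper is a last line of defence, not a substitute for a dependency-free tertiary.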
Execution Process
Step 1: Define Cascade Levels
- Use architect agent to identify cascade levels
- Define PRIMARY approach (optimal solution)
- Define SECONDARY approach (acceptable degradation)
- Define TERTIARY approach (guaranteed completion)
- Set timeout for each level
- Document what degrades at each level
- CRITICAL: Ensure tertiary ALWAYS succeeds
Example Cascade Definitions:
Code Analysis with AI:
- PRIMARY: GPT-4 comprehensive analysis (timeout: 30s)
- SECONDARY: GPT-3.5 standard analysis (timeout: 10s)
- TERTIARY: Static analysis with regex (timeout: 5s)
External API Data Fetch:
- PRIMARY: Live API call (timeout: 10s)
- SECONDARY: Cached data (timeout: 2s)
- TERTIARY: Default values (timeout: 0s)
Test Execution:
- PRIMARY: Full test suite (timeout: 120s)
- SECONDARY: Critical tests only (timeout: 30s)
- TERTIARY: Smoke tests (timeout: 10s)
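Cascade definitions like the ones above could be expressed as data and handed to a small generic runner. The sketch below is illustrative only: `Level`, `Attempt`, and `run_cascade` are hypothetical names, and timeouts are enforced with a thread pool, which stops waiting for a slow level but cannot cancel work that is already running.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Level:
    name: str                  # "PRIMARY", "SECONDARY", or "TERTIARY"
    run: Callable[[], Any]     # zero-argument callable that produces the result
    timeout: float             # seconds to wait before falling back


@dataclass
class Attempt:
    level: str
    outcome: str               # "success", "timeout", or "error"
    seconds: float             # wall-clock time spent waiting on this level


def run_cascade(levels: List[Level]) -> Tuple[Any, List[Attempt]]:
    """Try each level in order; return the first result and the path taken."""
    attempts: List[Attempt] = []
    pool = ThreadPoolExecutor(max_workers=len(levels))
    try:
        for level in levels:
            started = time.monotonic()
            future = pool.submit(level.run)
            try:
                result = future.result(timeout=level.timeout)
                outcome = "success"
            except FutureTimeout:
                outcome = "timeout"
            except Exception:
                outcome = "error"
            attempts.append(Attempt(level.name, outcome, time.monotonic() - started))
            if outcome == "success":
                return result, attempts
        raise RuntimeError("Cascade safety violation: no level succeeded")
    finally:
        # Do not block on a timed-out worker; it finishes in the background.
        pool.shutdown(wait=False)
```

A timeout of 0s, as listed for the default-values level, would need special handling with this runner (for instance calling the function inline), because waiting zero seconds on a freshly submitted task is likely to time out.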
Step 2: Attempt Primary Approach
- Execute optimal solution
- Set timeout based on strategy configuration
- Monitor execution progress
- If it completes successfully: DONE (best outcome)
- If fails or times out: Continue to Step 3
- Log attempt and reason for failure
```python
# Pseudocode for primary attempt (fragment of the cascade function body;
# execute_primary_approach and the log_* helpers are placeholders)
try:
    result = execute_primary_approach(timeout=PRIMARY_TIMEOUT)
    log_success(level="PRIMARY", result=result)
    return result  # DONE - best outcome achieved
except TimeoutError:
    log_failure(level="PRIMARY", reason="timeout")
    # Continue to Step 3
except ExternalServiceError as e:
    log_failure(level="PRIMARY", reason=f"service_error: {e}")
    # Continue to Step 3
```
Step 3: Attempt Secondary Approach
- Log degradation to secondary level
- Execute acceptable fallback solution
- Set shorter timeout (typically 1/3 of primary)
- Monitor execution progress
- If it completes successfully: DONE (acceptable outcome)
- If fails or times out: Continue to Step 4
- Log attempt and reason for failure
```python
# Pseudocode for secondary attempt (fragment of the cascade function body)
log_degradation(from_level="PRIMARY", to_level="SECONDARY")
try:
    result = execute_secondary_approach(timeout=SECONDARY_TIMEOUT)
    log_success(level="SECONDARY", result=result, degraded=True)
    return result  # DONE - acceptable outcome
except TimeoutError:
    log_failure(level="SECONDARY", reason="timeout")
    # Continue to Step 4
except ExternalServiceError as e:
    # Failures other than timeouts also fall through, as in Step 2
    log_failure(level="SECONDARY", reason=f"service_error: {e}")
    # Continue to Step 4
```
Step 4: Attempt Tertiary Approach
- Log degradation to tertiary level
- Execute guaranteed completion approach
- Set minimal timeout (typically 1s)
- MUST succeed - no failures allowed
- Return minimal but functional result
- Log success (degraded but functional)
- DONE (guaranteed completion)
```python
# Pseudocode for tertiary attempt (fragment of the cascade function body)
log_degradation(from_level="SECONDARY", to_level="TERTIARY")
try:
    result = execute_tertiary_approach(timeout=TERTIARY_TIMEOUT)
    log_success(level="TERTIARY", result=result, heavily_degraded=True)
    return result  # DONE - minimal but functional
except Exception as e:
    # THIS SHOULD NEVER HAPPEN
    log_critical_failure("TERTIARY approach failed - this is a bug!")
    # RuntimeError rather than SystemError, which Python reserves for
    # interpreter-internal errors
    raise RuntimeError("Cascade safety violation: tertiary failed") from e
```
Step 5: Report Degradation
- Determine notification level from configuration
- Silent: Log only, no user message
- Warning: Brief notification to user
- Explicit: Detailed degradation explanation
- Document which level succeeded
- Explain impact of degradation
- Log cascade path taken for analysis
Degradation Reporting Templates:
Silent Mode:
[LOG] CASCADE: PRIMARY timeout (30s) → SECONDARY success (6s)
Result: standard_analysis (degraded from comprehensive)

Warning Mode:
⚠️ Using cached data (less than 1 hour old)
Current real-time data unavailable.

Explicit Mode:
ℹ️ Analysis Quality Notice
We attempted to provide comprehensive code analysis using GPT-4,
but encountered slow response times (>30s timeout).
Fallback Applied:
- Used: GPT-3.5 standard analysis (completed in 6s)
- Quality: Standard (vs. Comprehensive)
- Impact: Advanced semantic insights not included
What You're Getting:
✓ Basic pattern detection
✓ Standard recommendations
✓ Code quality assessment
What's Missing:
✗ Complex architectural insights
✗ Deep semantic analysis
✗ Advanced refactoring suggestions
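The three notification modes could be driven by one formatting function that returns nothing for silent mode and progressively more detail for the others. The sketch below is a simplified illustration: `format_notice` and its parameters are hypothetical, and the wording is a shortened form of the templates above rather than an exact reproduction.

```python
from typing import List, Optional


def format_notice(mode: str, used: str, intended: str,
                  reason: str, missing: List[str]) -> Optional[str]:
    """Return the user-facing message for a degradation, or None for silent mode."""
    if mode == "silent":
        return None  # the caller only writes a log entry
    if mode == "warning":
        return f"⚠️ Using {used}. {intended} unavailable."
    if mode == "explicit":
        lines = [
            "ℹ️ Quality Notice",
            f"We attempted {intended}, but {reason}.",
            f"Fallback Applied: {used}",
            "What's Missing:",
        ]
        lines += [f"✗ {item}" for item in missing]
        return "\n".join(lines)
    raise ValueError(f"unknown notification mode: {mode!r}")
```

Returning `None` for silent mode lets the caller use a single code path: log the cascade in every case, and show a message only when one is returned.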
Step 6: Log Cascade Metrics
- Record cascade path taken
- Document level reached (primary/secondary/tertiary)
- Log timing for each level attempted
- Track degradation frequency
- Identify patterns in failures
- Update cascade strategy if needed
Metrics to Track:
- Success rate by level
- Average response times
- Degradation frequency
- User impact assessment
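The metrics listed above could be collected with a small in-memory recorder. This is a sketch under assumptions: `CascadeMetrics` is a hypothetical class, and it tracks the share of requests served by each level, which approximates but is not identical to a per-level success rate (that would also require counting failed attempts).

```python
from collections import Counter, defaultdict
from statistics import mean
from typing import Dict, List


class CascadeMetrics:
    """Tracks which level served each request and how long it took."""

    def __init__(self) -> None:
        self.served_by: Counter = Counter()
        self.timings: Dict[str, List[float]] = defaultdict(list)

    def record(self, level: str, seconds: float) -> None:
        """Record that `level` produced the final result in `seconds`."""
        self.served_by[level] += 1
        self.timings[level].append(seconds)

    def share(self, level: str) -> float:
        """Fraction of all requests that were served by `level`."""
        total = sum(self.served_by.values())
        return self.served_by[level] / total if total else 0.0

    def degradation_frequency(self) -> float:
        """Fraction of requests served by anything other than PRIMARY."""
        total = sum(self.served_by.values())
        return 1.0 - self.share("PRIMARY") if total else 0.0

    def average_seconds(self, level: str) -> float:
        """Mean response time for `level`, or 0.0 if it never served a request."""
        samples = self.timings.get(level)
        return mean(samples) if samples else 0.0
```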
Step 7: Continuous Optimization
- Use analyzer agent to review cascade metrics
- Identify optimization opportunities
- Adjust timeouts based on success rates
- Improve secondary approaches if frequently used
- Update tertiary if inadequate
- Store learnings in memory using store_discovery() from amplihack.memory.discoveries
Optimization Criteria:
- If PRIMARY succeeds < 50%: Timeout too aggressive → Increase timeout
- If SECONDARY used > 40%: Secondary is really the "normal" case → Swap primary and secondary
- If TERTIARY used > 10%: Secondary not reliable enough → Improve secondary
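The three thresholds above translate directly into a rule check. The function below is a minimal sketch: the name `recommend` and the wording of the returned advice are hypothetical, and the inputs are assumed to be fractions between 0 and 1 taken from the collected metrics.

```python
from typing import List


def recommend(primary_success: float, secondary_used: float,
              tertiary_used: float) -> List[str]:
    """Apply the optimization criteria; all arguments are fractions in [0, 1]."""
    advice: List[str] = []
    if primary_success < 0.50:
        advice.append("increase PRIMARY timeout")
    if secondary_used > 0.40:
        advice.append("swap PRIMARY and SECONDARY")
    if tertiary_used > 0.10:
        advice.append("improve SECONDARY reliability")
    return advice
```

More than one rule can fire at once. For example, a primary success rate of 45% with secondary usage of 45% triggers both the timeout and the swap recommendation, so the output is best read as a list of candidates for review rather than actions to apply automatically.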
Trade-Offs
Benefit: System always completes, never fully fails
Cost: Users may receive degraded responses
Best For: User-facing features where responsiveness matters
Examples
Example 1: Weather API Integration
Configuration:
- Strategy: Balanced (30s / 10s / 5s)
- Type: Service fallback
- Notification: Warning
Implementation:
```python
import logging

log = logging.getLogger(__name__)

# fetch_weather_api, get_cached_weather, get_default_weather, notify_user,
# WeatherData, APIError and CacheError are assumed to be defined elsewhere.


async def get_weather(location: str) -> WeatherData:
    """Get weather data with cascade fallback"""
    # PRIMARY: Live weather API
    try:
        return await fetch_weather_api(location, timeout=30)
    except (TimeoutError, APIError):
        log.warning("PRIMARY weather API failed, trying cache")

    # SECONDARY: Cached weather data (a cache miss also falls through)
    try:
        cached = await get_cached_weather(location, max_age=3600)
        if cached:
            notify_user("Using weather data from cache (< 1 hour old)")
            return cached
    except CacheError:
        log.warning("SECONDARY cache failed, using defaults")

    # TERTIARY: Default weather data
    return get_default_weather(location)  # Never fails
```
Outcome: System always returns weather data; quality degrades gracefully
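The weather example passes `timeout=30` to its fetch helper without showing how the limit is enforced. One possible approach for coroutine-based code is `asyncio.wait_for`, sketched below; `with_timeout`, `slow_api`, and `demo` are hypothetical names used only for illustration.

```python
import asyncio


async def with_timeout(coro, seconds: float):
    """Await `coro`, raising the builtin TimeoutError if it exceeds `seconds`."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError as exc:
        # On Python 3.11+ asyncio.TimeoutError is the builtin TimeoutError;
        # re-raising keeps `except TimeoutError` working on older versions too.
        raise TimeoutError(f"timed out after {seconds}s") from exc


async def slow_api(location: str) -> str:
    """Hypothetical stand-in for a slow live API call."""
    await asyncio.sleep(0.2)
    return f"live weather for {location}"


async def demo() -> str:
    """Fall back when the stand-in API exceeds a deliberately short timeout."""
    try:
        return await with_timeout(slow_api("Oslo"), seconds=0.05)
    except TimeoutError:
        return "fell back to cache"
```

Unlike a thread-based timeout, `asyncio.wait_for` cancels the pending coroutine when the limit is reached, so the abandoned call does not keep running in the background.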
Example 2: Code Review with AI
Configuration:
- Strategy: Patient (120s / 30s / 10s)
- Type: Quality fallback
- Notification: Explicit
Cascade Path:
- PRIMARY: GPT-4 comprehensive review - TIMEOUT after 120s
- SECONDARY: GPT-3.5 standard review - SUCCESS in 18s
- TERTIARY: Not attempted
Example 3: Search Results Ranking
Configuration:
- Strategy: Aggressive (5s / 2s / 1s)
- Type: Accuracy fallback
- Notification: Silent
Implementation:
```python
from typing import List

# fetch_results, ml_rank, heuristic_rank, simple_rank and Result are
# assumed to be defined elsewhere.


def search_and_rank(query: str) -> List[Result]:
    """Search with ML ranking, fallback to simple ranking"""
    results = fetch_results(query)

    # PRIMARY: ML-based ranking (sophisticated)
    try:
        return ml_rank(results, timeout=5)
    except TimeoutError:
        pass  # Silent fallback

    # SECONDARY: Heuristic ranking (good enough)
    try:
        return heuristic_rank(results, timeout=2)
    except TimeoutError:
        pass

    # TERTIARY: Simple text match ranking (basic)
    return simple_rank(results)  # Always fast
```
Philosophy Alignment
This workflow enforces:
- Resilience: System always completes, never completely fails
- User Experience: Better degraded service than error message
- Transparency: Users understand what they're getting (in explicit mode)
- Progressive Enhancement: Optimal by default, degrade when necessary
- Measurable Quality: Clear definition of what degrades at each level
- Continuous Improvement: Metrics drive timeout optimization
- Guaranteed Completion: Tertiary level must never fail
Key Principle
Better to deliver degraded service than no service