aws-lambda-durable-functions

AWS Lambda durable functions

Build resilient multi-step applications and AI workflows that can execute for up to 1 year while maintaining reliable progress despite interruptions.

Onboarding

Step 1: Validate Prerequisites

Before using AWS Lambda durable functions, verify:
  1. AWS CLI is installed (2.33.22 or higher) and configured:
    bash
    aws --version
    aws sts get-caller-identity
  2. Runtime environment is ready:
    • For TypeScript/JavaScript: Node.js 22+ (node --version)
    • For Python: Python 3.11+ (python --version). Currently, only the Lambda Python runtimes 3.13+ come with the Durable Execution SDK pre-installed. Python 3.11 is the minimum version supported by the Durable SDK itself; for versions without the pre-installed SDK, you could bring your own OCI container image with your own Python runtime and the Durable SDK.
  3. Deployment capability exists (one of):
    • AWS SAM CLI 1.153.1 or higher (sam --version)
    • AWS CDK v2.237.1 or higher (cdk --version)
    • Direct Lambda deployment access
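The version checks above can be automated. The following is a minimal sketch, assuming you have already captured the text printed by each `--version` command; the helper names (`parseVersion`, `meetsMinimum`) and the sample output strings are hypothetical, and only the minimum versions come from the list above.

```typescript
// Minimum versions copied from the prerequisites list; everything else is illustrative.
const MINIMUMS = {
  "aws-cli": "2.33.22",
  "sam-cli": "1.153.1",
  cdk: "2.237.1",
  node: "22.0.0",
  python: "3.11.0",
} as const;

type Tool = keyof typeof MINIMUMS;

// Pulls the first x.y.z triple out of arbitrary `--version` output.
function parseVersion(output: string): [number, number, number] | null {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(output);
  if (match === null) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when `output` reports a version at or above the minimum for `tool`.
function meetsMinimum(tool: Tool, output: string): boolean {
  const found = parseVersion(output);
  const required = parseVersion(MINIMUMS[tool]);
  if (found === null || required === null) return false;
  const [foundMajor, foundMinor, foundPatch] = found;
  const [reqMajor, reqMinor, reqPatch] = required;
  // Compare numerically, component by component (so 2.9.0 < 2.33.22).
  if (foundMajor !== reqMajor) return foundMajor > reqMajor;
  if (foundMinor !== reqMinor) return foundMinor > reqMinor;
  return foundPatch >= reqPatch;
}

// Example inputs; these strings are made up to resemble typical CLI output.
const cliOk = meetsMinimum("aws-cli", "aws-cli/2.33.22 Python/3.13.9");
const cliTooOld = meetsMinimum("aws-cli", "aws-cli/2.9.0");
const samTooOld = meetsMinimum("sam-cli", "SAM CLI, version 1.152.0");
const nodeOk = meetsMinimum("node", "v22.3.0");
```

The comparison is numeric per component rather than lexical, which matters for versions such as 2.9.0 versus 2.33.22.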

Step 2: Select Language and IaC Framework

Language Selection

Default: TypeScript
Override syntax:
  • "use Python" → Generate Python code
  • "use JavaScript" → Generate JavaScript code
When not specified, ALWAYS use TypeScript

IaC Framework Selection

Default: CDK
Override syntax:
  • "use CloudFormation" → Generate YAML templates
  • "use SAM" → Generate YAML templates
When not specified, ALWAYS use CDK
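The default-plus-override rules for language and IaC framework can be sketched as a small resolver. This is only an illustration: the function names are hypothetical, and the case-insensitive substring matching is an assumption (a naive match like this would also fire on a phrase such as "use sample").

```typescript
type Language = "typescript" | "python" | "javascript";
type IacFramework = "cdk" | "cloudformation" | "sam";

interface Selection {
  language: Language;
  iac: IacFramework;
}

// Override phrases from the lists above, lowercased for matching.
const LANGUAGE_OVERRIDES: ReadonlyArray<[string, Language]> = [
  ["use python", "python"],
  ["use javascript", "javascript"],
];
const IAC_OVERRIDES: ReadonlyArray<[string, IacFramework]> = [
  ["use cloudformation", "cloudformation"],
  ["use sam", "sam"],
];

// Returns the value of the first override phrase found in the request, else the fallback.
function firstMatch<T>(
  request: string,
  overrides: ReadonlyArray<[string, T]>,
  fallback: T,
): T {
  const text = request.toLowerCase();
  for (const [phrase, value] of overrides) {
    if (text.includes(phrase)) return value;
  }
  return fallback;
}

// Applies the defaults: TypeScript and CDK unless an override phrase is present.
function resolveSelection(request: string): Selection {
  return {
    language: firstMatch(request, LANGUAGE_OVERRIDES, "typescript"),
    iac: firstMatch(request, IAC_OVERRIDES, "cdk"),
  };
}

const defaults = resolveSelection("Build an order-processing workflow");
const overridden = resolveSelection("Use Python and use SAM for deployment");
```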

Error Scenarios

Unsupported Language

  • List detected language
  • State: "Durable Execution SDK is not yet available for [language]"
  • Suggest supported languages as alternatives

Unsupported IaC Framework

  • List detected framework
  • State: "[framework] might not support Lambda durable functions yet"
  • Suggest supported frameworks as alternatives

Serverless MCP Server Unavailable

  • Inform user: "AWS Serverless MCP not responding"
  • Ask: "Proceed without MCP support?"
  • DO NOT continue without user confirmation

Step 3: Install SDK

For TypeScript/JavaScript:
bash
npm install @aws/durable-execution-sdk-js
npm install --save-dev @aws/durable-execution-sdk-js-testing
For Python:
bash
pip install aws-durable-execution-sdk-python
pip install aws-durable-execution-sdk-python-testing

When to Load Reference Files

Load the appropriate reference file based on what the user is working on:
  • Getting started, basic setup, example, ESLint, or Jest setup -> see getting-started.md
  • Understanding replay model, determinism, or non-deterministic errors -> see replay-model-rules.md
  • Creating steps, atomic operations, or retry logic -> see step-operations.md
  • Waiting, delays, callbacks, external systems, or polling -> see wait-operations.md
  • Parallel execution, map operations, batch processing, or concurrency -> see concurrent-operations.md
  • Error handling, retry strategies, saga pattern, or compensating transactions -> see error-handling.md
  • Advanced error handling, timeout handling, circuit breakers, or conditional retries -> see advanced-error-handling.md
  • Testing, local testing, cloud testing, test runner, or flaky tests -> see testing-patterns.md
  • Deployment, CloudFormation, CDK, SAM, log groups, deploy, or infrastructure -> see deployment-iac.md
  • Advanced patterns, GenAI agents, completion policies, step semantics, or custom serialization -> see advanced-patterns.md
  • Troubleshooting, stuck execution, failed execution, debug execution ID, or execution history -> see troubleshooting-executions.md
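One way to picture the routing above is a keyword table that maps a request to reference files. This is a rough sketch, not part of the skill itself: the file names come from the list, while the keyword fragments, the substring matching, and the function name `referenceFilesFor` are assumptions chosen for illustration.

```typescript
interface Route {
  file: string;
  keywords: readonly string[];
}

// File names are from the list above; the keyword fragments are an illustrative subset.
const ROUTES: readonly Route[] = [
  { file: "getting-started.md", keywords: ["getting started", "basic setup", "eslint", "jest"] },
  { file: "replay-model-rules.md", keywords: ["replay", "determinis"] },
  { file: "step-operations.md", keywords: ["creating steps", "atomic", "retry logic"] },
  { file: "wait-operations.md", keywords: ["wait", "delay", "callback", "polling"] },
  { file: "concurrent-operations.md", keywords: ["parallel", "batch", "concurrency"] },
  { file: "error-handling.md", keywords: ["error handling", "saga", "compensating"] },
  { file: "advanced-error-handling.md", keywords: ["timeout", "circuit breaker"] },
  { file: "testing-patterns.md", keywords: ["test"] },
  { file: "deployment-iac.md", keywords: ["deploy", "cloudformation", "cdk", "log group"] },
  { file: "advanced-patterns.md", keywords: ["genai", "completion polic", "serialization"] },
  { file: "troubleshooting-executions.md", keywords: ["troubleshoot", "stuck", "execution history"] },
];

// Returns every reference file whose keywords appear in the request (case-insensitive).
function referenceFilesFor(request: string): string[] {
  const text = request.toLowerCase();
  return ROUTES
    .filter((route) => route.keywords.some((keyword) => text.includes(keyword)))
    .map((route) => route.file);
}

const replayFiles = referenceFilesFor("How do I fix a non-deterministic replay error?");
const stuckFiles = referenceFilesFor("My execution is stuck, show the execution history");
```

A request can match more than one file, which fits topics that span several references.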

Quick Reference

Basic Handler Pattern

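TypeScript:
typescript
import { withDurableExecution, DurableContext } from '@aws/durable-execution-sdk-js';

// processData is a placeholder for your own business logic.
export const handler = withDurableExecution(async (event, context: DurableContext) => {
  // Run the work inside a named step so its result is checkpointed and reused on replay.
  const result = await context.step('process', async () => processData(event));
  return result;
});
Python:
python
from aws_durable_execution_sdk_python import durable_execution, DurableContext

# process_data is a placeholder for your own business logic.
@durable_execution
def handler(event: dict, context: DurableContext) -> dict:
    # Run the work inside a named step so its result is checkpointed and reused on replay.
    result = context.step(lambda _: process_data(event), name='process')
    return result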

Critical Rules

  1. All non-deterministic code MUST be in steps (Date.now, Math.random, API calls)
  2. Cannot nest durable operations - use runInChildContext to group operations
  3. Closure mutations are lost on replay - return values from steps
  4. Side effects outside steps repeat - use context.logger (replay-aware)
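To see why rule 1 matters, here is a toy model of checkpoint-and-replay. It is NOT the Durable Execution SDK: `ToyContext`, its synchronous `step` method, and the counter standing in for `Math.random()` are all hypothetical and only imitate the idea that a step's result is recorded once and returned from the record on replay.

```typescript
// Toy checkpoint store: step name -> recorded result.
type Checkpoints = Map<string, unknown>;

// Hypothetical stand-in for a durable context; not the real SDK API.
class ToyContext {
  private readonly checkpoints: Checkpoints;

  constructor(checkpoints: Checkpoints) {
    this.checkpoints = checkpoints;
  }

  // Runs fn the first time a name is seen; afterwards returns the recorded result.
  step<T>(name: string, fn: () => T): T {
    if (this.checkpoints.has(name)) {
      return this.checkpoints.get(name) as T;
    }
    const result = fn();
    this.checkpoints.set(name, result);
    return result;
  }
}

// Stands in for a non-deterministic call: every invocation yields a new value.
let counter = 0;
const nextId = (): number => ++counter;

function workflow(ctx: ToyContext): { inside: number; outside: number } {
  const outside = nextId(); // outside a step: re-evaluated on every replay
  const inside = ctx.step("make-id", nextId); // inside a step: recorded once
  return { inside, outside };
}

const checkpoints: Checkpoints = new Map();
const firstRun = workflow(new ToyContext(checkpoints)); // { inside: 2, outside: 1 }
const replay = workflow(new ToyContext(checkpoints)); // { inside: 2, outside: 3 }
```

The value produced inside the step is identical across both runs, while the value produced outside the step changes on replay, so any logic that depends on it can diverge.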

Python API Differences

The Python SDK differs from TypeScript in several key areas:
  • Steps: Use the @durable_step decorator + context.step(my_step(args)), or inline context.step(lambda _: ..., name='...'). Prefer the decorator for automatic step naming.
  • Wait: context.wait(duration=Duration.from_seconds(n), name='...')
  • Exceptions: ExecutionError (permanent), InvocationError (transient), CallbackError (callback failures)
  • Testing: Use the DurableFunctionTestRunner class directly - instantiate with handler, use context manager, call run(input=...)

Invocation Requirements

Durable functions require qualified ARNs (version, alias, or $LATEST):
bash
# Valid
aws lambda invoke --function-name my-function:1 output.json
aws lambda invoke --function-name my-function:prod output.json

# Invalid - will fail
aws lambda invoke --function-name my-function output.json
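The qualifier requirement can be checked before invoking. The sketch below uses a hypothetical helper, `hasQualifier`, and assumes the identifier is either a bare function name (`name[:qualifier]`) or a full Lambda function ARN; partial ARNs are not handled.

```typescript
// Returns true when a function name or full ARN carries a version, alias, or $LATEST.
function hasQualifier(functionName: string): boolean {
  const parts = functionName.split(":");
  if (parts[0] === "arn") {
    // Full ARN: arn:aws:lambda:<region>:<account>:function:<name>[:<qualifier>]
    return parts.length === 8 && parts[7] !== "";
  }
  // Bare name: <name>[:<qualifier>]
  return parts.length === 2 && parts[1] !== "";
}

const byVersion = hasQualifier("my-function:1");
const byAlias = hasQualifier("my-function:prod");
const byLatest = hasQualifier("my-function:$LATEST");
const unqualified = hasQualifier("my-function");
const qualifiedArn = hasQualifier(
  "arn:aws:lambda:us-east-1:123456789012:function:my-function:prod",
);
```

The account ID and region in the ARN are placeholder values.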

IAM Permissions

Your Lambda execution role MUST have the AWSLambdaBasicDurableExecutionRolePolicy managed policy attached. This includes:
  • lambda:CheckpointDurableExecution - Persist execution state
  • lambda:GetDurableExecutionState - Retrieve execution state
  • CloudWatch Logs permissions
Additional permissions needed for:
  • Durable invokes: lambda:InvokeFunction on target function ARNs
  • External callbacks: Systems need lambda:SendDurableExecutionCallbackSuccess and lambda:SendDurableExecutionCallbackFailure
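The additional permissions can be expressed as IAM policy documents. The following sketch builds them as plain objects; the action names are taken from the list above, but the builder functions are hypothetical and scoping `Resource` to a function ARN for the callback actions is an assumption that should be checked against the IAM reference for Lambda.

```typescript
interface PolicyStatement {
  Effect: "Allow";
  Action: string[];
  Resource: string[];
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

// Policy for an external system that completes callbacks.
// Assumption: these actions can be scoped to the durable function's ARN.
function callbackPolicy(functionArn: string): PolicyDocument {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: [
          "lambda:SendDurableExecutionCallbackSuccess",
          "lambda:SendDurableExecutionCallbackFailure",
        ],
        Resource: [functionArn],
      },
    ],
  };
}

// Policy for a durable function that invokes other functions.
function durableInvokePolicy(targetArns: string[]): PolicyDocument {
  return {
    Version: "2012-10-17",
    Statement: [
      { Effect: "Allow", Action: ["lambda:InvokeFunction"], Resource: [...targetArns] },
    ],
  };
}

// Placeholder ARN for illustration only.
const exampleArn = "arn:aws:lambda:us-east-1:123456789012:function:my-function:prod";
const callbackDoc = callbackPolicy(exampleArn);
const invokeDoc = durableInvokePolicy([exampleArn]);
const callbackJson = JSON.stringify(callbackDoc, null, 2);
```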

Validation Guidelines

When writing or reviewing durable function code, ALWAYS check for these replay model violations:
  1. Non-deterministic code outside steps: Date.now(), Math.random(), UUID generation, API calls, and database queries must all be inside steps
  2. Nested durable operations in step functions: Cannot call context.step(), context.wait(), or context.invoke() inside a step function - use context.runInChildContext() instead
  3. Closure mutations that won't persist: Variables mutated inside steps are NOT preserved across replays - return values from steps instead
  4. Side effects outside steps that repeat on replay: Use context.logger for logging (it is replay-aware and deduplicates automatically)
When implementing or modifying tests for durable functions, ALWAYS verify:
  1. All operations have descriptive names
  2. Tests get operations by NAME, never by index
  3. Replay behavior is tested with multiple invocations
  4. LocalDurableTestRunner is used for local testing
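Violation 3 in the first checklist (closure mutations) is easy to miss in review, so here is a toy illustration. It does not use the Durable Execution SDK: `ToyContext` and its synchronous `step` are hypothetical and only mimic the behavior that a completed step's body is skipped on replay and its recorded result is returned instead.

```typescript
// Toy checkpoint store: step name -> recorded result.
type Checkpoints = Map<string, unknown>;

// Hypothetical stand-in for a durable context; not the real SDK API.
class ToyContext {
  private readonly checkpoints: Checkpoints;

  constructor(checkpoints: Checkpoints) {
    this.checkpoints = checkpoints;
  }

  // Skips fn when a result for this name was already recorded.
  step<T>(name: string, fn: () => T): T {
    if (this.checkpoints.has(name)) {
      return this.checkpoints.get(name) as T;
    }
    const result = fn();
    this.checkpoints.set(name, result);
    return result;
  }
}

// Violation: the step mutates a captured variable instead of returning a value.
function buggy(ctx: ToyContext): number {
  let total = 0;
  ctx.step("add", () => {
    total += 5; // only happens when the step body actually runs
  });
  return total;
}

// Fix: the step returns the value, so the replayed result carries it.
function fixed(ctx: ToyContext): number {
  const total = ctx.step("add", () => 5);
  return total;
}

const buggyLog: Checkpoints = new Map();
const buggyFirst = buggy(new ToyContext(buggyLog)); // 5
const buggyReplay = buggy(new ToyContext(buggyLog)); // 0: the mutation was skipped

const fixedLog: Checkpoints = new Map();
const fixedFirst = fixed(new ToyContext(fixedLog)); // 5
const fixedReplay = fixed(new ToyContext(fixedLog)); // 5
```

Running each function twice against the same checkpoint store is also a small-scale version of testing replay behavior with multiple invocations.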

MCP Server Configuration

Write access is enabled by default. The plugin ships with --allow-write in .mcp.json, so the MCP server can create projects, generate IaC, and deploy on behalf of the user.
Access to sensitive data (like Lambda and API Gateway logs) is not enabled by default. To grant it, add --allow-sensitive-data-access to .mcp.json.
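Adding the flag amounts to appending it to the server's argument list. The sketch below assumes .mcp.json follows the common `mcpServers` → server entry → `args` layout; that shape, the server name `example-serverless`, the command `example-mcp-server`, and the helper `withFlag` are all hypothetical, so compare against the actual file before relying on it.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Returns a copy of the config with `flag` appended to the named server's args.
// Leaves the config untouched if the server is missing or already has the flag.
function withFlag(config: McpConfig, serverName: string, flag: string): McpConfig {
  const server = config.mcpServers[serverName];
  if (server === undefined || server.args.includes(flag)) return config;
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers,
      [serverName]: { ...server, args: [...server.args, flag] },
    },
  };
}

// Hypothetical starting point: write access already enabled, as described above.
const shipped: McpConfig = {
  mcpServers: {
    "example-serverless": { command: "example-mcp-server", args: ["--allow-write"] },
  },
};

const updated = withFlag(shipped, "example-serverless", "--allow-sensitive-data-access");
const updatedArgs = updated.mcpServers["example-serverless"]?.args ?? [];
```

Returning a new object instead of mutating the input keeps the shipped defaults available for comparison.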

Resources
