Serverless Development

Overview

This skill covers building applications on serverless compute platforms where infrastructure management is abstracted away. It addresses AWS Lambda (Node.js, Python, Go runtimes), Vercel Edge Functions and Serverless Functions, Cloudflare Workers, cold start optimization, serverless design patterns (fan-out, step functions, event-driven), database connection management, deployment with SAM/SST/Serverless Framework, monitoring, and cost optimization.
Use this skill when building APIs, webhooks, background processing, scheduled tasks, or any workload that benefits from auto-scaling, pay-per-use pricing, and zero infrastructure management.

Core Principles

  1. Design for statelessness - Every function invocation is independent. Never store state in global variables between invocations (though you can reuse connections). Use external state stores (databases, caches, queues) for persistence.
  2. Minimize cold starts - Cold starts add 100ms-10s of latency on first invocation. Keep bundles small, use lightweight runtimes, and consider provisioned concurrency for latency-sensitive paths.
  3. Connection management is critical - Serverless functions can spawn thousands of concurrent instances; if each opens its own database connection, the database's connection limit is quickly exhausted. Use connection pooling (RDS Proxy, Prisma Data Proxy, PgBouncer).
  4. Pay-per-invocation thinking - You pay for execution time and memory. Optimize hot paths, choose appropriate memory sizes (more memory = more CPU = potentially faster = sometimes cheaper), and batch operations where possible.
  5. Embrace the constraints - Function timeouts, payload limits, and concurrency limits are features, not bugs. Design around them: use queues for long tasks, S3 for large payloads, and step functions for complex orchestration.
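
Principle 4 can be made concrete with a quick back-of-envelope sketch. The rates below are illustrative assumptions, not current AWS pricing; the point is that more memory (hence more CPU) can lower total cost when it shortens duration enough:

```typescript
// Hypothetical Lambda cost model: GB-seconds plus a per-request fee.
// Rates are assumptions for illustration only.
function estimateMonthlyCost(opts: {
  invocationsPerMonth: number;
  avgDurationMs: number;
  memoryMb: number;
}): number {
  const gbSecondRate = 0.0000166667; // assumed $/GB-second
  const perRequestRate = 0.2 / 1_000_000; // assumed $/request
  const gbSeconds =
    (opts.memoryMb / 1024) *
    (opts.avgDurationMs / 1000) *
    opts.invocationsPerMonth;
  return gbSeconds * gbSecondRate + opts.invocationsPerMonth * perRequestRate;
}

// 512 MB at 400 ms vs 1024 MB at 150 ms (extra CPU shortens the run):
const smaller = estimateMonthlyCost({
  invocationsPerMonth: 1_000_000,
  avgDurationMs: 400,
  memoryMb: 512,
});
const larger = estimateMonthlyCost({
  invocationsPerMonth: 1_000_000,
  avgDurationMs: 150,
  memoryMb: 1024,
});
// Here the larger memory size ends up cheaper despite the higher per-second rate.
```

Benchmark real workloads (for example with AWS Lambda Power Tuning) before committing to a memory size.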

Key Patterns

Pattern 1: AWS Lambda with TypeScript

When to use: Backend APIs, event processing, scheduled tasks on AWS.
Implementation:
```typescript
// handler.ts - Lambda handler with proper typing
import {
  APIGatewayProxyEventV2,
  APIGatewayProxyResultV2,
  SQSEvent,
  ScheduledEvent,
  Context,
} from "aws-lambda";
import { PrismaClient } from "@prisma/client";

// Reusable across invocations (connection reuse)
let dbConnection: PrismaClient | null = null;

function getDb(): PrismaClient {
  if (!dbConnection) {
    dbConnection = new PrismaClient({
      datasources: {
        db: { url: process.env.DATABASE_URL },
      },
    });
  }
  return dbConnection;
}

// API Gateway handler
export async function apiHandler(
  event: APIGatewayProxyEventV2,
  context: Context
): Promise<APIGatewayProxyResultV2> {
  // Don't wait for event loop to drain (for connection reuse)
  context.callbackWaitsForEmptyEventLoop = false;

  try {
    const method = event.requestContext.http.method;
    const path = event.rawPath;

    if (method === "GET" && path === "/api/users") {
      const db = getDb();
      const users = await db.user.findMany({ take: 50 });
      return response(200, { users });
    }

    if (method === "POST" && path === "/api/users") {
      const body = JSON.parse(event.body ?? "{}");
      const db = getDb();
      const user = await db.user.create({ data: body });
      return response(201, { user });
    }

    return response(404, { error: "Not found" });
  } catch (error) {
    console.error("Handler error:", error);
    return response(500, { error: "Internal server error" });
  }
}

// SQS event handler (batch processing)
export async function sqsHandler(
  event: SQSEvent
): Promise<{ batchItemFailures: Array<{ itemIdentifier: string }> }> {
  const failures: string[] = [];

  for (const record of event.Records) {
    try {
      const body = JSON.parse(record.body);
      await processMessage(body); // domain-specific processing, defined elsewhere
    } catch (error) {
      console.error(`Failed to process message ${record.messageId}:`, error);
      failures.push(record.messageId);
    }
  }

  // Partial batch failure reporting (only retry failed messages)
  return {
    batchItemFailures: failures.map((id) => ({ itemIdentifier: id })),
  };
}

// Scheduled event handler (cron)
export async function cronHandler(event: ScheduledEvent): Promise<void> {
  console.log("Running scheduled task at:", event.time);
  const db = getDb();

  // Clean up expired sessions
  const deleted = await db.session.deleteMany({
    where: { expiresAt: { lt: new Date() } },
  });

  console.log(`Cleaned up ${deleted.count} expired sessions`);
}

function response(
  statusCode: number,
  body: Record<string, unknown>
): APIGatewayProxyResultV2 {
  return {
    statusCode,
    headers: {
      "Content-Type": "application/json",
      "Access-Control-Allow-Origin": process.env.ALLOWED_ORIGIN ?? "*",
    },
    body: JSON.stringify(body),
  };
}
```

template.yaml - AWS SAM template:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: nodejs20.x
    Timeout: 30
    MemorySize: 256
    Environment:
      Variables:
        DATABASE_URL: !Ref DatabaseUrl
        NODE_OPTIONS: "--enable-source-maps"

Resources:
  ApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: dist/handler.apiHandler
      Events:
        Api:
          Type: HttpApi
          Properties:
            Path: /api/{proxy+}
            Method: ANY
    Metadata:
      BuildMethod: esbuild
      BuildProperties:
        Minify: true
        Target: es2022
        Sourcemap: true
        EntryPoints:
          - src/handler.ts

  QueueProcessor:
    Type: AWS::Serverless::Function
    Properties:
      Handler: dist/handler.sqsHandler
      Events:
        Queue:
          Type: SQS
          Properties:
            Queue: !GetAtt ProcessingQueue.Arn
            BatchSize: 10
            FunctionResponseTypes:
              - ReportBatchItemFailures

  ScheduledCleanup:
    Type: AWS::Serverless::Function
    Properties:
      Handler: dist/handler.cronHandler
      Events:
        Schedule:
          Type: Schedule
          Properties:
            Schedule: rate(1 hour)

  ProcessingQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 180
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt DLQ.Arn
        maxReceiveCount: 3

  DLQ:
    Type: AWS::SQS::Queue
```

**Why:** AWS Lambda with SAM provides infrastructure-as-code, local testing (`sam local invoke`), and automated deployment. Connection reuse via module-level variables avoids the cold start penalty of reconnecting on every invocation. Partial batch failure reporting for SQS ensures only failed messages are retried.

---

Pattern 2: Vercel Edge Functions

When to use: Low-latency API routes and middleware that need to run close to users globally.
Implementation:
```typescript
// app/api/geo/route.ts - Edge Function
import { kv } from "@vercel/kv"; // edge-compatible KV client

export const runtime = "edge"; // Run on Vercel Edge Network

export async function GET(request: Request) {
  // Access geo information (available at the edge)
  const country = request.headers.get("x-vercel-ip-country") ?? "US";
  const city = request.headers.get("x-vercel-ip-city") ?? "Unknown";

  // Edge-compatible KV store
  const cachedData = await kv.get(`content:${country}`);
  if (cachedData) {
    return Response.json(cachedData);
  }

  // Lightweight processing at the edge
  const content = await getLocalizedContent(country); // defined elsewhere
  await kv.set(`content:${country}`, content, { ex: 3600 });

  return Response.json(content, {
    headers: {
      "Cache-Control": "public, s-maxage=3600, stale-while-revalidate=86400",
    },
  });
}
```
```typescript
// middleware.ts - Edge Middleware for auth, redirects, A/B testing
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";

export const config = {
  matcher: ["/dashboard/:path*", "/api/:path*"],
};

export function middleware(request: NextRequest) {
  // Auth check at the edge (fast, runs before serverless function)
  const token = request.cookies.get("session_token");

  if (!token && request.nextUrl.pathname.startsWith("/dashboard")) {
    return NextResponse.redirect(new URL("/login", request.url));
  }

  // A/B test assignment at the edge
  const abCookie = request.cookies.get("ab-pricing");
  if (!abCookie && request.nextUrl.pathname === "/pricing") {
    const variant = Math.random() < 0.5 ? "control" : "new-layout";
    const response = NextResponse.next();
    response.cookies.set("ab-pricing", variant, { maxAge: 60 * 60 * 24 * 30 });

    // Rewrite to variant-specific page
    if (variant === "new-layout") {
      return NextResponse.rewrite(new URL("/pricing-v2", request.url));
    }
    return response;
  }

  return NextResponse.next();
}
```
Why: Edge Functions run in 30+ global locations with sub-millisecond cold starts (V8 isolates, not containers). They're ideal for auth checks, geo-routing, A/B testing, and API responses that benefit from proximity to users. The tradeoff: limited runtime (no Node.js APIs, no native modules, smaller memory).

Pattern 3: Cold Start Optimization

When to use: When latency-sensitive endpoints experience unacceptable cold start times.
Implementation:
```typescript
// 1. Minimize bundle size with esbuild tree-shaking
// esbuild.config.mjs
import { build } from "esbuild";

await build({
  entryPoints: ["src/handler.ts"],
  bundle: true,
  minify: true,
  platform: "node",
  target: "node20",
  outdir: "dist",
  external: [
    // Don't bundle the AWS SDK (already available in Lambda)
    "@aws-sdk/*",
  ],
  // Tree-shake unused code
  treeShaking: true,
  // Source maps for debugging
  sourcemap: true,
});
```
```typescript
// 2. Lazy-load heavy dependencies
import type { APIGatewayProxyEventV2 } from "aws-lambda";

let _sharp: typeof import("sharp") | null = null;

async function getSharp() {
  if (!_sharp) {
    // Note: under native ESM you may need (await import("sharp")).default
    _sharp = await import("sharp");
  }
  return _sharp;
}

// Only load sharp when actually processing images
export async function imageHandler(event: APIGatewayProxyEventV2) {
  if (event.rawPath.includes("/resize")) {
    const sharp = await getSharp();
    // Use sharp...
  }
}
```
```typescript
// 3. Provisioned concurrency for critical paths
// In SAM template:
// Properties:
//   ProvisionedConcurrencyConfig:
//     ProvisionedConcurrentExecutions: 5

// 4. Keep-alive with scheduled warming (cheaper than provisioned concurrency)
// CloudWatch rule: rate(5 minutes) -> Lambda with warming event
export async function handler(event: unknown) {
  // Check if this is a warming invocation
  if ((event as Record<string, unknown>).source === "serverless-warming") {
    console.log("Warming invocation, returning early");
    return { statusCode: 200, body: "warm" };
  }

  // Normal handler logic...
}
```
Why: Cold starts are the primary latency concern in serverless. Smaller bundles initialize faster. Lazy loading defers heavy import costs to when they're actually needed. Provisioned concurrency eliminates cold starts entirely for critical paths (at a cost). Warming invocations keep a function instance hot without paying for provisioned concurrency.
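
A cheap way to actually measure cold starts is to flag them from inside the function itself: module scope survives warm invocations within one instance and resets on every cold start. A minimal sketch (the log shape is an assumption, not a fixed convention):

```typescript
// Module scope: initialized once per cold start, reused while the instance is warm.
let isColdStart = true;

function markInvocation(): boolean {
  const wasCold = isColdStart;
  isColdStart = false; // later invocations in this instance report warm
  return wasCold;
}

export async function handler(_event: unknown) {
  const coldStart = markInvocation();
  // Emit the flag as structured JSON so it can be aggregated in CloudWatch.
  console.log(JSON.stringify({ metric: "invocation", coldStart }));
  return { statusCode: 200, body: "ok" };
}
```

Aggregating this flag over time tells you whether cold starts are frequent enough to justify warming or provisioned concurrency.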

Pattern 4: Database Connection Pooling

When to use: Any serverless function that connects to a relational database.
Implementation:
```prisma
// prisma/schema.prisma - Use Prisma Accelerate or Data Proxy for serverless
datasource db {
  provider = "postgresql"
  // Pooled connection used by the client at runtime
  url      = env("DATABASE_URL")
  // Direct connection used for migrations
  directUrl = env("DIRECT_DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}
```
```typescript
// For AWS Lambda: Use RDS Proxy
// DATABASE_URL = "postgresql://user:pass@my-rds-proxy.proxy-xxx.region.rds.amazonaws.com:5432/mydb"

// For Vercel/Cloudflare: Use Neon serverless driver or Prisma Accelerate
import { Pool, neonConfig } from "@neondatabase/serverless";
import { PrismaNeon } from "@prisma/adapter-neon";
import { PrismaClient } from "@prisma/client";
import ws from "ws";

// Required for Node.js environments
neonConfig.webSocketConstructor = ws;

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const adapter = new PrismaNeon(pool);
const prisma = new PrismaClient({ adapter });

// Connection reuse pattern
let client: PrismaClient | null = null;

export function getClient(): PrismaClient {
  if (!client) {
    client = new PrismaClient({
      log: process.env.NODE_ENV === "development" ? ["query"] : [],
    });
  }
  return client;
}
```
Why: Each Lambda invocation can create a new database connection. With thousands of concurrent invocations, this exhausts the database connection limit (typically 100-500 for managed databases). Connection pooling (RDS Proxy, PgBouncer, Neon serverless, Prisma Accelerate) multiplexes many serverless connections through fewer database connections.
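
The module-level reuse above assumes the module is evaluated exactly once, which bundlers and hot reload can violate by duplicating modules within one instance. A common defensive variant caches the client on `globalThis`; below is a generic sketch (the `getOrCreate` helper and `"db"` key are hypothetical, with a plain object standing in for a real client such as `PrismaClient`):

```typescript
// Memoize an expensive, connection-holding client on globalThis so every
// copy of the module within one function instance shares a single client.
function getOrCreate<T>(key: string, create: () => T): T {
  const g = globalThis as unknown as Record<string, T | undefined>;
  if (g[key] === undefined) {
    g[key] = create();
  }
  return g[key] as T;
}

let created = 0;
// Both lookups return the same object; the factory runs exactly once.
const clientA = getOrCreate("db", () => ({ id: ++created }));
const clientB = getOrCreate("db", () => ({ id: ++created }));
```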

Cost Optimization Reference

| Strategy | Savings | Tradeoff |
| --- | --- | --- |
| Right-size memory | 20-40% | Requires benchmarking |
| ARM64 (Graviton) | 20% | Minor compatibility concerns |
| Provisioned concurrency (vs over-provisioned servers) | 50-70% vs always-on | Higher latency on scale-up |
| Reserved concurrency | Prevents runaway costs | Limits throughput |
| Batch processing (SQS batch size) | 50-80% | Higher latency |
| Tiered storage (S3 lifecycle) | 30-60% | Access latency for cold tiers |

Anti-Patterns

| Anti-Pattern | Why It's Bad | Better Approach |
| --- | --- | --- |
| Opening a DB connection per invocation | Exhausts connection pool | Module-level connection reuse + RDS Proxy |
| Bundling entire `node_modules` | Massive cold start (5-10s) | Tree-shake with esbuild, keep the AWS SDK external |
| Storing state in global variables | Lost between invocations (unreliable) | Use DynamoDB, Redis, or S3 for state |
| 15-minute timeout for API handlers | User waiting 15 minutes? | 30s for APIs, Step Functions for long tasks |
| No concurrency limits | Runaway costs, downstream overload | Set reserved concurrency per function |
| Synchronous fan-out | Slow, timeout risk | Use SQS/SNS for fan-out, process async |
| `console.log` without structure | Unsearchable in CloudWatch | Structured JSON logging with correlation IDs |
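
The structured-logging fix in the last row can be sketched as a tiny helper. The field names below are assumptions rather than a standard, but the one-JSON-object-per-line shape is what CloudWatch Logs Insights parses:

```typescript
// Hypothetical helper: one JSON object per log line, always carrying a
// correlation ID so all lines for one request can be filtered together.
function formatLog(
  level: "info" | "warn" | "error",
  message: string,
  correlationId: string,
  fields: Record<string, unknown> = {}
): string {
  return JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    message,
    correlationId,
    ...fields,
  });
}

// In a Lambda handler, context.awsRequestId is a natural correlation ID.
const line = formatLog("info", "user created", "req-123", { userId: 42 });
console.log(line);
```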

Checklist

  • Bundle size minimized (esbuild tree-shaking, external AWS SDK)
  • Database connections pooled (RDS Proxy, Neon, Prisma Accelerate)
  • Cold start measured and acceptable for the use case
  • Provisioned concurrency configured for latency-sensitive paths
  • Memory size benchmarked and right-sized
  • SQS partial batch failure reporting enabled
  • Dead letter queues configured for async event sources
  • Reserved concurrency set to prevent runaway costs
  • Structured logging with request/correlation IDs
  • Timeout set appropriately (30s for APIs, longer for batch)
  • Environment variables for all configuration (no hardcoded values)
  • ARM64/Graviton runtime selected for cost savings

Related Resources

  • Skills: event-driven-architecture (SQS/SNS patterns), monitoring-observability (Lambda tracing), performance-engineering (cold start impact on latency)
  • Rules: docs/reference/stacks/python.md (Python Lambda patterns)