Deployment Patterns

Production deployment workflows and CI/CD best practices.

When to Activate

Setting up CI/CD pipelines
Dockerizing an application
Planning deployment strategy (blue-green, canary, rolling)
Implementing health checks and readiness probes
Preparing for a production release
Configuring environment-specific settings

Deployment Strategies

Rolling Deployment (Default)

Replace instances gradually — old and new versions run simultaneously during rollout.

Instance 1: v1 → v2  (update first)
Instance 2: v1        (still running v1)
Instance 3: v1        (still running v1)

Instance 1: v2
Instance 2: v1 → v2  (update second)
Instance 3: v1

Instance 1: v2
Instance 2: v2
Instance 3: v1 → v2  (update last)

**Pros:** Zero downtime, gradual rollout
**Cons:** Two versions run simultaneously, so changes must be backward-compatible
**Use when:** Standard deployments, backward-compatible changes
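
The rollout sequence above can be sketched as a loop that replaces one instance at a time and halts when an updated instance fails its health check. This is a minimal illustration, not a real orchestrator API: `Instance`, `rollingUpdate`, and the `isHealthy` callback are hypothetical names introduced here.

```typescript
// Hypothetical model of a rolling update; not tied to any real orchestrator.
interface Instance {
  id: number;
  version: string;
}

// Updates instances one at a time. If an updated instance is unhealthy,
// it is reverted and the rollout halts, so the remaining instances keep
// serving the old version.
function rollingUpdate(
  instances: Instance[],
  newVersion: string,
  isHealthy: (instance: Instance) => boolean,
): { completed: boolean; instances: Instance[] } {
  // Work on copies so the caller's array is left untouched.
  const result = instances.map((i) => ({ ...i }));
  for (const instance of result) {
    const previous = instance.version;
    instance.version = newVersion;
    if (!isHealthy(instance)) {
      instance.version = previous; // revert the failed instance
      return { completed: false, instances: result };
    }
  }
  return { completed: true, instances: result };
}
```

Because the loop stops partway on failure, both versions can be live at once, which is why the changes need to be backward-compatible.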

Blue-Green Deployment

Run two identical environments. Switch traffic atomically.

Blue  (v1) ← traffic
Green (v2)   idle, running new version

After verification:

Blue  (v1)   idle (becomes standby)
Green (v2) ← traffic


**Pros:** Instant rollback (switch back to blue), clean cutover
**Cons:** Requires 2x infrastructure during deployment
**Use when:** Critical services, zero-tolerance for issues
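
The cutover can be pictured as a single pointer that names the live environment. The sketch below is only a model of that idea, assuming a hypothetical `BlueGreenRouter` class and a caller-supplied `verify` function; a real setup would flip a load balancer target or DNS record instead.

```typescript
type Color = "blue" | "green";

interface Environment {
  color: Color;
  version: string;
}

// Hypothetical router that holds a single pointer to the live environment.
class BlueGreenRouter {
  private live: Color;
  private readonly environments: Record<Color, Environment>;

  constructor(environments: Record<Color, Environment>, initial: Color) {
    this.environments = environments;
    this.live = initial;
  }

  // The environment currently receiving traffic.
  get active(): Environment {
    return this.environments[this.live];
  }

  // The idle environment, kept as a standby for rollback.
  get standby(): Environment {
    return this.environments[this.live === "blue" ? "green" : "blue"];
  }

  // Cut over only if the idle environment passes verification.
  // Calling this again switches back, which is the rollback path.
  switchTraffic(verify: (env: Environment) => boolean): boolean {
    const target = this.standby;
    if (!verify(target)) return false;
    this.live = target.color;
    return true;
  }
}
```

Rollback is the same operation as the cutover, which is what makes it effectively instant.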

Canary Deployment

Route a small percentage of traffic to the new version first.

v1: 95% of traffic
v2:  5% of traffic  (canary)

If metrics look good:

v1: 50% of traffic
v2: 50% of traffic

Final:

v2: 100% of traffic


**Pros:** Catches issues with real traffic before full rollout
**Cons:** Requires traffic splitting infrastructure, monitoring
**Use when:** High-traffic services, risky changes, feature flags
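
One way to split traffic is to hash a stable request attribute into a bucket, so a given user sees the same version throughout the rollout. The sketch below assumes user IDs are available and uses hypothetical helpers (`hash32`, `pickVersion`, `nextCanaryPercent`). The 5% → 50% → 100% steps come from the stages above; the error-rate gate is an illustrative assumption.

```typescript
// FNV-1a 32-bit hash; any stable hash would do for bucketing.
function hash32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Routes a user to "v2" when their bucket (0-99) falls below the canary
// percentage; the same user always gets the same answer.
function pickVersion(userId: string, canaryPercent: number): "v1" | "v2" {
  return hash32(userId) % 100 < canaryPercent ? "v2" : "v1";
}

// Rollout steps mirroring the stages in the text.
const STEPS = [5, 50, 100];

// Advances to the next step if the canary's error rate is acceptable,
// otherwise drops to 0% (all traffic back on v1).
function nextCanaryPercent(
  current: number,
  canaryErrorRate: number,
  maxErrorRate: number,
): number {
  if (canaryErrorRate > maxErrorRate) return 0;
  const next = STEPS.find((step) => step > current);
  return next ?? 100;
}
```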

Docker

Multi-Stage Dockerfile (Node.js)

dockerfile

# Stage 1: Install dependencies
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --production=false

# Stage 2: Build
FROM node:22-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
RUN npm prune --production

# Stage 3: Production image
FROM node:22-alpine AS runner
WORKDIR /app

RUN addgroup -g 1001 -S appgroup && adduser -S appuser -u 1001
USER appuser

COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=builder --chown=appuser:appgroup /app/package.json ./

ENV NODE_ENV=production
EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

CMD ["node", "dist/server.js"]

Multi-Stage Dockerfile (Go)

dockerfile

FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /server ./cmd/server

FROM alpine:3.19 AS runner
RUN apk --no-cache add ca-certificates
RUN adduser -D -u 1001 appuser
USER appuser

COPY --from=builder /server /server

EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s CMD wget -qO- http://localhost:8080/health || exit 1
CMD ["/server"]

Multi-Stage Dockerfile (Python/Django)

dockerfile

FROM python:3.12-slim AS builder
WORKDIR /app
RUN pip install --no-cache-dir uv
COPY requirements.txt .
RUN uv pip install --system --no-cache -r requirements.txt

FROM python:3.12-slim AS runner
WORKDIR /app

RUN useradd -r -u 1001 appuser
USER appuser

COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY . .

ENV PYTHONUNBUFFERED=1
EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=3s CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health/')" || exit 1
CMD ["gunicorn", "config.wsgi:application", "--bind", "0.0.0.0:8000", "--workers", "4"]

Docker Best Practices

GOOD practices

BAD practices

Running as root
Using :latest tags
Copying entire repo in one COPY layer
Installing dev dependencies in production image
Storing secrets in image (use env vars or secrets manager)

CI/CD Pipeline

GitHub Actions (Standard Pipeline)

yaml

name: CI/CD

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - run: npm run lint
      - run: npm run typecheck
      - run: npm test -- --coverage
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: coverage
          path: coverage/

  build:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
      - name: Deploy to production
        run: |
          # Platform-specific deployment command
          # Railway: railway up
          # Vercel: vercel --prod
          # K8s: kubectl set image deployment/app app=ghcr.io/${{ github.repository }}:${{ github.sha }}
          echo "Deploying ${{ github.sha }}"

Pipeline Stages

PR opened:
  lint → typecheck → unit tests → integration tests → preview deploy

Merged to main:
  lint → typecheck → unit tests → integration tests → build image → deploy staging → smoke tests → deploy production
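
The key property of both sequences is that they are ordered and fail fast: a stage runs only if every stage before it passed. A small model of that behavior, with a hypothetical `runPipeline` function and a caller-supplied `runStage` callback standing in for the real jobs:

```typescript
type Trigger = "pull_request" | "main";

// Stage names taken from the two sequences above.
const STAGES: Record<Trigger, string[]> = {
  pull_request: ["lint", "typecheck", "unit tests", "integration tests", "preview deploy"],
  main: [
    "lint",
    "typecheck",
    "unit tests",
    "integration tests",
    "build image",
    "deploy staging",
    "smoke tests",
    "deploy production",
  ],
};

// Runs stages in order and stops at the first failure, so a failed
// smoke test never reaches "deploy production".
function runPipeline(
  trigger: Trigger,
  runStage: (name: string) => boolean,
): { passed: string[]; failedAt: string | null } {
  const passed: string[] = [];
  for (const stage of STAGES[trigger]) {
    if (!runStage(stage)) return { passed, failedAt: stage };
    passed.push(stage);
  }
  return { passed, failedAt: null };
}
```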

Health Checks

Health Check Endpoint

typescript

// Simple health check
app.get("/health", (req, res) => {
  res.status(200).json({ status: "ok" });
});

// Detailed health check (for internal monitoring)
app.get("/health/detailed", async (req, res) => {
  const checks = {
    database: await checkDatabase(),
    redis: await checkRedis(),
    externalApi: await checkExternalApi(),
  };

  const allHealthy = Object.values(checks).every(c => c.status === "ok");

  res.status(allHealthy ? 200 : 503).json({
    status: allHealthy ? "ok" : "degraded",
    timestamp: new Date().toISOString(),
    version: process.env.APP_VERSION || "unknown",
    uptime: process.uptime(),
    checks,
  });
});

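async function checkDatabase(): Promise<HealthCheck> {
  const start = Date.now();
  try {
    await db.query("SELECT 1");
    // Report the measured round-trip time rather than a fixed value
    return { status: "ok", latency_ms: Date.now() - start };
  } catch (err) {
    return { status: "error", message: "Database unreachable" };
  }
}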
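
A dependency check that hangs can stall the whole detailed endpoint, so it can help to bound each check with a timeout. The sketch below is one possible approach: `withTimeout` is a hypothetical helper, and the `HealthCheck` interface is an assumed shape inferred from the fields used above, since the source does not define it.

```typescript
// Assumed shape of the HealthCheck type referenced by the endpoint code.
interface HealthCheck {
  status: "ok" | "error";
  latency_ms?: number;
  message?: string;
}

// Resolves with the check's result, or with an error result if the
// check does not settle within timeoutMs. Never rejects.
function withTimeout(
  check: () => Promise<HealthCheck>,
  timeoutMs: number,
): Promise<HealthCheck> {
  return new Promise((resolve) => {
    const timer = setTimeout(
      () => resolve({ status: "error", message: `Timed out after ${timeoutMs}ms` }),
      timeoutMs,
    );
    check()
      .then(resolve)
      .catch(() => resolve({ status: "error", message: "Check failed" }))
      .finally(() => clearTimeout(timer));
  });
}
```

With this helper, a call such as `await withTimeout(checkDatabase, 1000)` would report an error instead of blocking the response; the 1000 ms budget is only an example value.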

Kubernetes Probes

yaml

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 30
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 2

startupProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 0
  periodSeconds: 5
  failureThreshold: 30    # 30 * 5s = 150s max startup time
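
The comment on the startup probe generalizes: a probe trips after `failureThreshold` consecutive failures spaced `periodSeconds` apart. The helper below, a hypothetical `failureWindowSeconds`, computes that rough window for the three probes above. It is an estimate only and ignores `timeoutSeconds`, `initialDelaySeconds`, and scheduling jitter.

```typescript
interface ProbeTiming {
  periodSeconds: number;
  failureThreshold: number;
}

// Approximate seconds of consecutive failures before the probe trips.
function failureWindowSeconds(probe: ProbeTiming): number {
  return probe.failureThreshold * probe.periodSeconds;
}

// Values copied from the probe configuration above.
const startup = failureWindowSeconds({ periodSeconds: 5, failureThreshold: 30 });    // 150
const liveness = failureWindowSeconds({ periodSeconds: 30, failureThreshold: 3 });   // 90
const readiness = failureWindowSeconds({ periodSeconds: 10, failureThreshold: 2 });  // 20
```

Under these settings a pod gets up to about 150 s to start, is restarted after roughly 90 s of failed liveness checks, and is pulled from rotation after roughly 20 s of failed readiness checks.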

Environment Configuration

Twelve-Factor App Pattern

bash

# All config via environment variables, never in code
DATABASE_URL=postgres://user:pass@host:5432/db
REDIS_URL=redis://host:6379/0
API_KEY=${API_KEY}                    # injected by secrets manager
LOG_LEVEL=info
PORT=3000

# Environment-specific behavior
NODE_ENV=production     # or staging, development
APP_ENV=production      # explicit app environment

Configuration Validation

typescript

import { z } from "zod";

const envSchema = z.object({
  NODE_ENV: z.enum(["development", "staging", "production"]),
  PORT: z.coerce.number().default(3000),
  DATABASE_URL: z.string().url(),
  REDIS_URL: z.string().url(),
  JWT_SECRET: z.string().min(32),
  LOG_LEVEL: z.enum(["debug", "info", "warn", "error"]).default("info"),
});

// Validate at startup — fail fast if config is wrong
export const env = envSchema.parse(process.env);

Rollback Strategy

Instant Rollback

bash

# Docker/Kubernetes: point to previous image
kubectl rollout undo deployment/app

# Vercel: promote previous deployment
vercel rollback

# Railway: redeploy previous commit
railway up --commit <previous-sha>

# Database: rollback migration (if reversible)
npx prisma migrate resolve --rolled-back <migration-name>

Rollback Checklist

Previous image/artifact is available and tagged
Database migrations are backward-compatible (no destructive changes)
Feature flags can disable new features without deploy
Monitoring alerts configured for error rate spikes
Rollback tested in staging before production release
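
The error-rate alert in the checklist can feed an explicit rollback rule. The sketch below shows one possible rule, not a recommended policy: `RollbackPolicy`, `shouldRollback`, and the thresholds used with it are all hypothetical.

```typescript
interface RollbackPolicy {
  maxErrorRate: number;      // absolute ceiling, e.g. 0.05 = 5%
  maxIncreaseFactor: number; // allowed growth over the pre-deploy baseline
  noiseFloor: number;        // error rates at or below this never trigger the relative rule
}

// Rolls back when the post-deploy error rate exceeds the absolute ceiling,
// or when it has grown past the allowed multiple of the baseline (ignoring
// changes at or below the noise floor).
function shouldRollback(
  baselineErrorRate: number,
  currentErrorRate: number,
  policy: RollbackPolicy,
): boolean {
  if (currentErrorRate > policy.maxErrorRate) return true;
  const relativeLimit = Math.max(
    baselineErrorRate * policy.maxIncreaseFactor,
    policy.noiseFloor,
  );
  return currentErrorRate > relativeLimit;
}
```

The noise floor keeps a near-zero baseline from turning a single stray error into a rollback.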

Production Readiness Checklist

Before any production deployment:

Application

All tests pass (unit, integration, E2E)
No hardcoded secrets in code or config files
Error handling covers all edge cases
Logging is structured (JSON) and does not contain PII
Health check endpoint returns meaningful status
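
For the logging item above, a structured log line with PII fields masked might be produced like this. It is a minimal sketch: `formatLog`, `redact`, and the `REDACTED_KEYS` list are hypothetical, and redaction here is shallow (top-level keys only).

```typescript
type LogFields = Record<string, unknown>;

// Field names treated as PII in this sketch; a real list depends on your data.
const REDACTED_KEYS = new Set(["email", "password", "ssn", "phone"]);

// Replaces the values of known PII keys with a placeholder.
function redact(fields: LogFields): LogFields {
  const clean: LogFields = {};
  for (const [key, value] of Object.entries(fields)) {
    clean[key] = REDACTED_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : value;
  }
  return clean;
}

// Builds one JSON log line; the caller decides where to write it.
// Core fields are spread last so custom fields cannot overwrite them.
function formatLog(
  level: "debug" | "info" | "warn" | "error",
  message: string,
  fields: LogFields = {},
): string {
  return JSON.stringify({
    ...redact(fields),
    timestamp: new Date().toISOString(),
    level,
    message,
  });
}
```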

Infrastructure

Docker image builds reproducibly (pinned versions)
Environment variables documented and validated at startup
Resource limits set (CPU, memory)
Horizontal scaling configured (min/max instances)
SSL/TLS enabled on all endpoints

Monitoring

Application metrics exported (request rate, latency, errors)
Alerts configured for error rate > threshold
Log aggregation set up (structured logs, searchable)
Uptime monitoring on health endpoint

Security

Dependencies scanned for CVEs
CORS configured for allowed origins only
Rate limiting enabled on public endpoints
Authentication and authorization verified
Security headers set (CSP, HSTS, X-Frame-Options)

Operations

Rollback plan documented and tested
Database migration tested against production-sized data
Runbook for common failure scenarios
On-call rotation and escalation path defined
