rollback-workflow-builder

Rollback Workflow Builder

Build safe, fast rollback mechanisms for production deployments.

Manual Rollback Workflow

```yaml
# .github/workflows/rollback.yml
name: Rollback

on:
  workflow_dispatch:
    inputs:
      version:
        description: "Version to rollback to (e.g., v1.2.3 or previous)"
        required: true
        type: string
      environment:
        description: "Environment to rollback"
        required: true
        type: choice
        options:
          - staging
          - production
      reason:
        description: "Reason for rollback"
        required: true
        type: string

permissions:
  contents: read
  issues: write # needed by the "Create incident issue" step

jobs:
  rollback:
    runs-on: ubuntu-latest
    environment: ${{ github.event.inputs.environment }}
    # Pass inputs through env so free-form text is never interpolated into shell scripts
    env:
      VERSION: ${{ github.event.inputs.version }}
      TARGET_ENV: ${{ github.event.inputs.environment }}
      REASON: ${{ github.event.inputs.reason }}
      DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.version }}
          fetch-depth: 0 # full history and tags, required by git describe below

      - name: Verify version exists
        run: |
          if ! git rev-parse "$VERSION" >/dev/null 2>&1; then
            echo "❌ Version $VERSION not found"
            exit 1
          fi
          echo "✅ Version $VERSION exists"

      - name: Get current version
        id: current
        run: |
          # Assumption: the deployed version is the latest tag on the default branch.
          # Describing HEAD would only return the rollback target that was just checked out.
          CURRENT=$(git describe --tags --abbrev=0 "origin/$DEFAULT_BRANCH")
          echo "version=$CURRENT" >> "$GITHUB_OUTPUT"
          echo "Current version: $CURRENT"

      - name: Confirm rollback
        run: |
          echo "🔄 Rolling back from ${{ steps.current.outputs.version }} to $VERSION"
          echo "Environment: $TARGET_ENV"
          echo "Reason: $REASON"

      - uses: actions/setup-node@v4
        with:
          node-version: "20"

      - run: npm ci
      - run: npm run build

      - name: Deploy rollback
        run: |
          ./scripts/deploy.sh "$TARGET_ENV"
        env:
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}

      - name: Verify deployment
        run: |
          ./scripts/health-check.sh "$TARGET_ENV"

      - name: Create incident issue
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `Rollback: ${context.payload.inputs.environment} to ${context.payload.inputs.version}`,
              body: `## Rollback Details

              **Environment:** ${context.payload.inputs.environment}
              **From:** ${{ steps.current.outputs.version }}
              **To:** ${context.payload.inputs.version}
              **Reason:** ${context.payload.inputs.reason}
              **Triggered by:** @${context.actor}
              **Time:** ${new Date().toISOString()}

              ## Next Steps
              - [ ] Verify rollback successful
              - [ ] Investigate root cause
              - [ ] Create fix
              - [ ] Update postmortem
              `,
              labels: ['incident', 'rollback']
            })
```
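Because the `version` input is free-form text, it can help to reject malformed values before anything is checked out or deployed. The following is a minimal sketch, assuming tags follow the `vMAJOR.MINOR.PATCH` shape suggested by the `v1.2.3` example; `is_valid_version` is a hypothetical helper name, and the literal `previous` mentioned in the input description would need separate handling.

```bash
# Hypothetical helper: accept only tags shaped like v1.2.3.
# Assumption: releases are tagged vMAJOR.MINOR.PATCH with no suffixes.
is_valid_version() {
  local version="$1"
  [[ "$version" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]
}
```

In a workflow step this could be called as `is_valid_version "$VERSION" || exit 1` before the deploy step runs.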

Automated Rollback on Failure

```yaml
deploy:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0 # tags and history are needed to find the previous version

    - name: Deploy
      id: deploy
      run: ./scripts/deploy.sh production
      continue-on-error: true

    - name: Verify deployment
      id: verify
      if: steps.deploy.outcome == 'success'
      run: ./scripts/health-check.sh production
      continue-on-error: true

    - name: Auto-rollback on failure
      if: steps.deploy.outcome == 'failure' || steps.verify.outcome == 'failure'
      run: |
        echo "⚠️ Deployment failed, initiating automatic rollback"
        PREVIOUS_VERSION=$(git describe --tags --abbrev=0 HEAD^)
        ./scripts/deploy.sh production "$PREVIOUS_VERSION"

        # Verify rollback
        if ./scripts/health-check.sh production; then
          echo "✅ Rollback successful"
        else
          echo "❌ Rollback failed - manual intervention required"
          exit 1
        fi

        # continue-on-error hides the original failure, so fail the job explicitly
        echo "Deployment failed and was rolled back to $PREVIOUS_VERSION"
        exit 1
```
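A health check that runs immediately after a deploy can fail for transient reasons (containers still starting, caches warming), which would trigger an unnecessary rollback. One common mitigation is to retry the check a few times before treating it as a failure. The sketch below assumes nothing beyond bash itself; `retry` is a hypothetical helper, not part of the scripts described here.

```bash
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds between failed attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    echo "Attempt $i/$attempts failed: $*" >&2
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, `retry 5 10 ./scripts/health-check.sh production` would give the service roughly 40 seconds to become healthy before the rollback branch is taken.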

Kubernetes Rollback

```yaml
rollback-k8s:
  runs-on: ubuntu-latest
  steps:
    - name: Setup kubectl
      uses: azure/setup-kubectl@v3

    - name: Configure kubectl
      run: |
        echo "${{ secrets.KUBECONFIG }}" > "$RUNNER_TEMP/kubeconfig"
        chmod 600 "$RUNNER_TEMP/kubeconfig"
        # A plain `export` does not persist across steps; GITHUB_ENV does
        echo "KUBECONFIG=$RUNNER_TEMP/kubeconfig" >> "$GITHUB_ENV"

    - name: Rollback deployment
      run: |
        kubectl rollout undo deployment/myapp -n production
        kubectl rollout status deployment/myapp -n production --timeout=5m

    - name: Get rollback revision
      run: |
        kubectl rollout history deployment/myapp -n production
```
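`kubectl rollout undo` without arguments returns to the immediately preceding revision, which may itself be bad if several broken releases shipped in a row. The `--to-revision` flag selects a specific entry from `kubectl rollout history`. The sketch below wraps both cases; `rollback_deployment` and the overridable `KUBECTL` variable are hypothetical conveniences added for illustration.

```bash
# KUBECTL can be overridden (for example KUBECTL=echo) to preview
# the command without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: undo a rollout, optionally to a specific revision.
# Usage: rollback_deployment <deployment> <namespace> [revision]
rollback_deployment() {
  local deployment="$1" namespace="$2" revision="${3:-}"
  local args=(rollout undo "deployment/$deployment" -n "$namespace")
  if [ -n "$revision" ]; then
    args+=("--to-revision=$revision")
  fi
  "$KUBECTL" "${args[@]}"
}
```

Calling `rollback_deployment myapp production 3` would target revision 3, while omitting the last argument keeps the default behavior of the step above.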

Docker Image Rollback

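```yaml
- name: Rollback to previous image
  run: |
    # Get previous image tag
    # (assumes the build pipeline sets a previous_tag label on every image)
    docker pull myapp:latest
    PREVIOUS_TAG=$(docker inspect myapp:latest | jq -r '.[0].Config.Labels.previous_tag // empty')
    if [ -z "$PREVIOUS_TAG" ]; then
      echo "❌ No previous_tag label found on myapp:latest"
      exit 1
    fi

    # Retag previous as latest
    docker pull "myapp:$PREVIOUS_TAG"
    docker tag "myapp:$PREVIOUS_TAG" myapp:latest
    docker push myapp:latest

    # Restart containers
    docker-compose pull
    docker-compose up -d
```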
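The label lookup only works if every image was built with a `previous_tag` label. Where that is not the case, the previous version can instead be derived from the list of existing tags. This is a rough sketch under the assumption that tags are version-sortable (for example `v1.2.3`) and that GNU `sort -V` is available; `previous_tag` is a hypothetical helper name.

```bash
# Hypothetical helper: read tags on stdin (one per line) and print the
# tag that sorts immediately below the given current tag.
# Fails if the current tag is unknown or has no predecessor.
previous_tag() {
  local current="$1"
  sort -V | awk -v cur="$current" '
    $0 == cur { found = 1; exit }
    { prev = $0 }
    END { if (found && prev != "") print prev; else exit 1 }
  '
}
```

With Git tags as the source, this could be used as `git tag --list 'v*' | previous_tag v1.3.0`.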

Database Migration Rollback

```yaml
- name: Rollback database migrations
  run: |
    # Get migration to rollback to
    # (--silent keeps npm's own banner lines out of the captured output)
    CURRENT=$(npm run --silent migrate:current)

    echo "Rolling back from $CURRENT to $TARGET"
    npm run migrate:down -- --to="$TARGET"

    # Verify rollback
    AFTER=$(npm run --silent migrate:current)
    if [ "$AFTER" != "$TARGET" ]; then
      echo "❌ Migration rollback failed"
      exit 1
    fi
  env:
    DATABASE_URL: ${{ secrets.DATABASE_URL }}
    # Passed through env so the input is never interpolated into the script
    TARGET: ${{ github.event.inputs.migration }}
```

Rollback Runbook

Production Rollback Runbook

When to Rollback

Rollback if:
  • Critical bugs affecting >10% of users
  • Data integrity issues
  • Security vulnerabilities
  • Performance degradation >50%
  • Error rate >5%
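
The numeric criteria above (affected users, performance degradation, error rate) can be evaluated mechanically, for instance from an alerting script. The following is a minimal sketch, assuming the three values are already available as percentages; `should_rollback` is a hypothetical helper, and data integrity or security issues still require human judgment.

```bash
# Hypothetical helper: succeed (exit 0) when any numeric rollback
# threshold is exceeded. All arguments are percentages.
# Usage: should_rollback <affected_users> <perf_degradation> <error_rate>
should_rollback() {
  local affected_users="$1" perf_degradation="$2" error_rate="$3"
  awk -v u="$affected_users" -v p="$perf_degradation" -v e="$error_rate" \
    'BEGIN { exit !((u > 10) || (p > 50) || (e > 5)) }'
}
```

For example, `should_rollback 3 12 6.5 && echo "rollback recommended"` would print the message because the error rate exceeds 5%.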

Before Rollback

  1. Assess impact: Check monitoring dashboards
  2. Identify version: Determine last known good version
  3. Notify team: Post in #incidents Slack channel
  4. Enable maintenance mode (if possible)

Rollback Steps

Automated Rollback (Preferred)

  1. Go to Actions → Rollback workflow
  2. Select environment (staging/production)
  3. Enter target version (e.g., v1.2.3 or "previous")
  4. Enter reason for rollback
  5. Click "Run workflow"
  6. Monitor progress in Actions tab

Manual Rollback (Emergency)

```bash
# 1. SSH to production server
ssh production

# 2. Check current version
docker ps | grep myapp

# 3. Pull previous version
docker pull myapp:v1.2.3

# 4. Update docker-compose
vim docker-compose.yml
# Change image: myapp:latest to myapp:v1.2.3

# 5. Deploy
docker-compose up -d

# 6. Verify (run the checks listed under "Rollback Verification")

# 7. Check logs
docker logs myapp -f
```

After Rollback

  1. Verify: Run smoke tests
  2. Monitor: Watch error rates for 15 minutes
  3. Notify: Update #incidents with status
  4. Disable maintenance mode
  5. Create incident ticket
  6. Schedule postmortem
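
The 15-minute monitoring window in step 2 can be scripted so that a regression is caught without someone watching a dashboard. The sketch below assumes some command exists that prints the current error rate as a percentage; that command, like the `watch_error_rate` helper itself, is hypothetical and would need to be backed by your monitoring system.

```bash
# Hypothetical helper: sample an error rate repeatedly and fail fast
# when it exceeds the threshold.
# Usage: watch_error_rate <checks> <interval_seconds> <threshold_percent> <command...>
# Returns 0 if every sample stayed at or below the threshold,
# 1 if a sample exceeded it, 2 if the sampling command itself failed.
watch_error_rate() {
  local checks="$1" interval="$2" threshold="$3"
  shift 3
  local i rate
  for ((i = 1; i <= checks; i++)); do
    rate="$("$@")" || return 2
    if awk -v r="$rate" -v t="$threshold" 'BEGIN { exit !(r > t) }'; then
      echo "Error rate ${rate}% exceeds ${threshold}%" >&2
      return 1
    fi
    if ((i < checks)); then
      sleep "$interval"
    fi
  done
  return 0
}
```

With a sampling script in place, `watch_error_rate 15 60 1 ./scripts/error-rate.sh production` would take one sample per minute for 15 minutes against a 1% threshold (the script path is an assumed placeholder).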

Rollback Verification

  • Health check returns 200
  • Error rate <1%
  • Response time p95 <500ms
  • Key features working (login, checkout, etc.)
  • Database connectivity OK
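
The p95 response-time criterion above is easy to misread as an average, so a small worked computation may help. This sketch uses the nearest-rank method on a list of latencies in milliseconds, one per line; `p95` is a hypothetical helper, and how the latencies are collected is left open.

```bash
# Hypothetical helper: read one latency (ms) per line on stdin and
# print the 95th percentile using the nearest-rank method.
p95() {
  sort -n | awk '
    { values[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 95 + 99) / 100)   # ceil(0.95 * NR) in integer arithmetic
      print values[rank]
    }
  '
}
```

The result can then be compared against the 500 ms limit, for example `[ "$(p95 < latencies.txt)" -lt 500 ]`, where `latencies.txt` is an assumed file of integer samples.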

Communication Template

🔄 ROLLBACK IN PROGRESS

Environment: Production
From: v1.3.0
To: v1.2.3
Reason: Critical bug in checkout flow
Status: In progress
ETA: 5 minutes

Updates: #incidents
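
During an incident it is easy to mistype a field, so the template above can be filled in by a small function instead of by hand. This is a sketch only; `rollback_message` is a hypothetical helper, and posting the result to Slack is deliberately left out.

```bash
# Hypothetical helper: print the rollback announcement for the given details.
# Usage: rollback_message <environment> <from> <to> <reason> <eta>
rollback_message() {
  local environment="$1" from="$2" to="$3" reason="$4" eta="$5"
  cat <<EOF
🔄 ROLLBACK IN PROGRESS

Environment: $environment
From: $from
To: $to
Reason: $reason
Status: In progress
ETA: $eta

Updates: #incidents
EOF
}
```

For example, `rollback_message Production v1.3.0 v1.2.3 "Critical bug in checkout flow" "5 minutes"` reproduces the message shown above.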

Common Issues

Issue: Rollback Fails

Symptom: The rollback deployment doesn't start.
Fix: Check the workflow logs, verify that the target version exists, and ensure the deployment secrets are valid.

Issue: Database Incompatibility

Symptom: The app starts but can't read data.
Fix: The schema may be newer than the rolled-back code expects; roll back the database migrations first.

Issue: Traffic Not Routing

Symptom: Users still see the new version.
Fix: Clear the CDN cache and check the load balancer configuration.

Health Check Script

```bash
#!/bin/bash
# scripts/health-check.sh

# Fail with a usage message if no environment is given
ENVIRONMENT=${1:?Usage: health-check.sh <environment>}
BASE_URL="https://${ENVIRONMENT}.myapp.com"

echo "Running health checks for $ENVIRONMENT..."

# API health (--max-time keeps a hung endpoint from stalling the rollback)
if ! curl -f --max-time 10 "$BASE_URL/api/health" > /dev/null 2>&1; then
  echo "❌ API health check failed"
  exit 1
fi

# Database connection
if ! curl -f --max-time 10 "$BASE_URL/api/health/db" > /dev/null 2>&1; then
  echo "❌ Database health check failed"
  exit 1
fi

# Key endpoints
ENDPOINTS=("/api/users" "/api/products" "/api/orders")
for endpoint in "${ENDPOINTS[@]}"; do
  if ! curl -f --max-time 10 "$BASE_URL$endpoint" > /dev/null 2>&1; then
    echo "❌ Endpoint $endpoint health check failed"
    exit 1
  fi
done

echo "✅ All health checks passed"
exit 0
```

Best Practices

  1. Fast rollback: <5 minutes to previous version
  2. Automated: One-click rollback workflow
  3. Verified: Health checks after rollback
  4. Documented: Clear runbook
  5. Tested: Practice rollbacks regularly
  6. Monitored: Alert on failures
  7. Communicated: Notify stakeholders

Output Checklist

  • Manual rollback workflow
  • Automated rollback on failure
  • Platform-specific rollback (K8s/Docker)
  • Database rollback procedure
  • Rollback runbook documented
  • Health check scripts
  • Communication templates
  • Incident issue automation