ci-optimization-specialist


CI Optimization Specialist

Quick Start

This skill optimizes GitHub Actions workflows for:
  1. Test sharding: Parallel test execution across multiple runners
  2. Caching: pnpm store, Playwright browsers, Vite build cache
  3. Workflow optimization: Job dependencies and concurrency

When to Use

  • CI execution time exceeds 10-15 minutes
  • GitHub Actions costs too high
  • Need faster developer feedback loops
  • Tests not parallelized

Test Sharding Setup

Basic Pattern (Automatic Distribution)

Add a matrix strategy to .github/workflows/ci.yml:
yaml
e2e-tests:
  name: 🧪 E2E Tests [Shard ${{ matrix.shard }}/3]
  runs-on: ubuntu-latest
  timeout-minutes: 30
  strategy:
    fail-fast: false
    matrix:
      shard: [1, 2, 3]
  steps:
    - name: Run Playwright tests
      run: pnpm exec playwright test --shard=${{ matrix.shard }}/3
      env:
        CI: true
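Expected improvement: about 60-65% less wall-clock time with 3 shards (for example, 27 min down to 9-10 min), assuming tests are spread evenly across shards.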
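The job above shows only the test step; a fresh runner also needs the repository, pnpm, Node.js, and the browsers before Playwright can run. A minimal sketch of the complete job might look like this. The action versions, the Node.js version, and the use of pnpm/action-setup are assumptions, not taken from the original workflow.

```yaml
e2e-tests:
  name: E2E Tests [Shard ${{ matrix.shard }}/3]
  runs-on: ubuntu-latest
  timeout-minutes: 30
  strategy:
    fail-fast: false              # let the other shards finish when one fails
    matrix:
      shard: [1, 2, 3]
  steps:
    - uses: actions/checkout@v4
    # Assumption: the pnpm version comes from the packageManager field in package.json
    - uses: pnpm/action-setup@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20          # assumption: replace with the project's Node.js version
    - name: Install dependencies
      run: pnpm install --frozen-lockfile
    - name: Install Playwright browsers
      run: pnpm exec playwright install --with-deps chromium
    - name: Run Playwright tests
      run: pnpm exec playwright test --shard=${{ matrix.shard }}/3
```

Every shard repeats the setup steps, so the caching patterns later in this document matter more as the shard count grows.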

Advanced Pattern (Manual Distribution)

For unbalanced test suites, manually distribute by duration:
yaml
matrix:
  include:
    - shard: 1
      pattern: 'ai-generation|project-management' # Heavy tests
    - shard: 2
      pattern: 'project-wizard|settings|publishing' # Medium tests
    - shard: 3
      pattern: 'world-building|versioning|mock-validation' # Light tests

In the test step:
yaml
run: pnpm exec playwright test --grep "${{ matrix.pattern }}"
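To show where the two fragments above sit relative to each other, here is a sketch of a job that combines them. The job name and the omitted setup steps are assumptions; the patterns are the ones listed above.

```yaml
e2e-tests:
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      include:
        - shard: 1
          pattern: 'ai-generation|project-management'            # heavy tests
        - shard: 2
          pattern: 'project-wizard|settings|publishing'          # medium tests
        - shard: 3
          pattern: 'world-building|versioning|mock-validation'   # light tests
  steps:
    # checkout, pnpm setup, dependency install, and browser install omitted
    - name: Run Playwright tests (shard ${{ matrix.shard }})
      run: pnpm exec playwright test --grep "${{ matrix.pattern }}"
```

A test that matches none of the patterns is not run by any shard, so with this approach the patterns need to cover the whole suite.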

Critical Caching Patterns

pnpm Store Cache

Always cache the pnpm store to avoid re-downloading packages on every run:
yaml
- name: Get pnpm store directory
  id: pnpm-cache
  shell: bash
  run: echo "STORE_PATH=$(pnpm store path)" >> $GITHUB_OUTPUT

- name: Setup pnpm cache
  uses: actions/cache@v4
  with:
    path: ${{ steps.pnpm-cache.outputs.STORE_PATH }}
    key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
    restore-keys: |
      ${{ runner.os }}-pnpm-store-
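As a possible alternative to the explicit cache step, actions/setup-node can manage the pnpm store cache itself through its cache input. The sketch below assumes pnpm is installed with pnpm/action-setup and that Node.js 20 is acceptable; neither appears in the original.

```yaml
- uses: pnpm/action-setup@v4        # runs first so pnpm is on PATH when setup-node looks for it
- uses: actions/setup-node@v4
  with:
    node-version: 20                # assumption: replace with the project's Node.js version
    cache: pnpm                     # caches the pnpm store, keyed on the lock file
- name: Install dependencies
  run: pnpm install --frozen-lockfile
```

The explicit actions/cache version gives more control over the key and restore-keys; the built-in option is shorter and harder to misconfigure.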

Playwright Browsers Cache

Cache 500MB+ browser binaries:
yaml
- name: Cache Playwright browsers
  uses: actions/cache@v4
  id: playwright-cache
  with:
    path: ~/.cache/ms-playwright
    key: ${{ runner.os }}-playwright-${{ hashFiles('**/pnpm-lock.yaml') }}

- name: Install Playwright browsers
  if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: pnpm exec playwright install --with-deps chromium

- name: Install Playwright system dependencies
  if: steps.playwright-cache.outputs.cache-hit == 'true'
  run: pnpm exec playwright install-deps chromium
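Keying on the lock file means any dependency change invalidates the browser cache, even when Playwright itself is unchanged. One possible refinement is to key on the installed Playwright version instead. This sketch assumes the step runs after pnpm install and that `playwright --version` prints a line of the form "Version X.Y.Z"; the step id playwright-version is a hypothetical name.

```yaml
- name: Get installed Playwright version
  id: playwright-version
  shell: bash
  # Assumption: output looks like "Version 1.2.3", so the second field is the version
  run: echo "version=$(pnpm exec playwright --version | awk '{print $2}')" >> "$GITHUB_OUTPUT"

- name: Cache Playwright browsers
  uses: actions/cache@v4
  id: playwright-cache
  with:
    path: ~/.cache/ms-playwright
    key: ${{ runner.os }}-playwright-${{ steps.playwright-version.outputs.version }}
```

The conditional install steps shown above work unchanged with this key.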

Vite Build Cache

For monorepos or frequent builds:
yaml
- name: Cache Vite build
  uses: actions/cache@v4
  with:
    path: |
      dist/
      node_modules/.vite/
    key: ${{ runner.os }}-vite-${{ hashFiles('src/**', 'vite.config.ts') }}
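Restoring dist/ only saves time if the build step is skipped on an exact key match. A sketch of that pattern follows; the step id vite-cache is a hypothetical name, and adding pnpm-lock.yaml to the key is an assumption so that dependency changes also trigger a rebuild.

```yaml
- name: Cache Vite build
  id: vite-cache
  uses: actions/cache@v4
  with:
    path: |
      dist/
      node_modules/.vite/
    key: ${{ runner.os }}-vite-${{ hashFiles('src/**', 'vite.config.ts', 'pnpm-lock.yaml') }}

- name: Build
  # Skip the build only when the cache key matched exactly
  if: steps.vite-cache.outputs.cache-hit != 'true'
  run: pnpm run build
```

The key has to cover every input to the build (for example index.html or environment files, if the project uses them); otherwise a stale dist/ can be restored.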

Workflow Optimization

Job Dependencies

Use needs to control execution flow:
yaml
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - name: Build
        run: pnpm run build
      - name: Run unit tests
        run: pnpm test

  e2e-tests:
    needs: build-and-test # Wait for build to complete
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3]
    steps:
      - name: Run E2E tests
        run: pnpm exec playwright test --shard=${{ matrix.shard }}/3
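Each job runs on a fresh runner, so needs only orders the jobs; it does not carry the build output across. If the E2E shards should reuse the build, one option is to pass it as an artifact. The sketch below assumes the build writes to dist/ and uses a hypothetical artifact name, dist.

```yaml
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      # checkout and dependency install omitted
      - name: Build
        run: pnpm run build
      - name: Upload build output
        uses: actions/upload-artifact@v4
        with:
          name: dist                # hypothetical artifact name
          path: dist/
          retention-days: 1         # only needed for the duration of this run

  e2e-tests:
    needs: build-and-test
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3]
    steps:
      # checkout and dependency install omitted
      - name: Download build output
        uses: actions/download-artifact@v4
        with:
          name: dist
          path: dist/
      - name: Run E2E tests
        run: pnpm exec playwright test --shard=${{ matrix.shard }}/3
```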

Concurrency Control

Cancel in-progress runs when a newer run starts for the same workflow and branch:
yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
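Cancelling runs on the default branch can leave some commits without a completed CI result. A possible variant keeps cancellation for feature branches and pull requests only; it assumes the default branch is named main.

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  # Cancel superseded runs everywhere except the default branch (assumed to be main)
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
```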

Artifact Management

Per-Shard Artifacts

Upload test reports from each shard:
yaml
- name: Upload Playwright report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: playwright-report-shard-${{ matrix.shard }}-${{ github.sha }}
    path: playwright-report/
    retention-days: 7
    compression-level: 6
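Per-shard HTML reports have to be opened one at a time. Recent Playwright versions can instead emit a blob report per shard and merge them into a single HTML report in a follow-up job. The sketch below assumes a Playwright version that supports the blob reporter and merge-reports; the job name merge-reports and the artifact names are hypothetical.

```yaml
jobs:
  e2e-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3]
    steps:
      # checkout, dependency install, and browser install omitted
      - name: Run Playwright tests
        run: pnpm exec playwright test --shard=${{ matrix.shard }}/3 --reporter=blob
      - name: Upload blob report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: blob-report-${{ matrix.shard }}
          path: blob-report/
          retention-days: 1

  merge-reports:
    if: always()                    # merge even when some shards failed
    needs: e2e-tests
    runs-on: ubuntu-latest
    steps:
      # checkout and dependency install omitted
      - name: Download all blob reports
        uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-report-*
          merge-multiple: true
      - name: Merge into one HTML report
        run: pnpm exec playwright merge-reports --reporter html ./all-blob-reports
      - name: Upload merged report
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report-${{ github.sha }}
          path: playwright-report/
          retention-days: 7
```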

Artifact Cleanup

Set short retention for test reports to reduce storage costs:
yaml
retention-days: 7 # Default is 90 days
compression-level: 6 # Compress to reduce storage

Performance Monitoring

Expected Benchmarks

| Optimization | Before | After | Improvement |
|---|---|---|---|
| Test sharding (3 shards) | 27 min | 9-10 min | 60-65% |
| pnpm cache hit | 2-3 min | 10-15 s | 85-90% |
| Playwright cache hit | 1-2 min | 5-10 s | 90-95% |
| Vite build cache | 1-2 min | 5-10 s | 90-95% |

Regression Detection

Set timeout thresholds as guardrails:
yaml
timeout-minutes: 30 # Fail if shard exceeds 30 minutes
Monitor shard execution times and rebalance if one shard consistently exceeds others by >2 minutes.
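The timeout-minutes key is also accepted on individual steps, which can make a hung install or a slow test run fail earlier than the job-level ceiling. A sketch, with step budgets that are assumed values rather than measurements:

```yaml
e2e-tests:
  runs-on: ubuntu-latest
  timeout-minutes: 30               # hard ceiling for the whole shard
  strategy:
    fail-fast: false
    matrix:
      shard: [1, 2, 3]
  steps:
    # checkout and pnpm setup omitted
    - name: Install dependencies
      timeout-minutes: 5            # assumed budget: a hung install fails fast
      run: pnpm install --frozen-lockfile
    - name: Run Playwright tests
      timeout-minutes: 20           # assumed budget: tune from observed shard times
      run: pnpm exec playwright test --shard=${{ matrix.shard }}/3
```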

Optimization Workflow

Phase 1: Baseline

  1. Record current CI execution times
  2. Identify slowest jobs
  3. Measure cache hit rates (check Actions logs)

Phase 2: Implement Caching

  1. Add pnpm store cache (highest impact)
  2. Add Playwright browser cache
  3. Add build caches if applicable
  4. Verify cache keys work correctly

Phase 3: Implement Sharding

  1. Calculate optimal shard count (target 3-5 min per shard)
  2. Add matrix strategy to workflow
  3. Test locally: pnpm exec playwright test --shard=1/3
  4. Monitor shard balance in CI
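As a rough worked example of step 1: a 27-minute suite with a 5-minute target per shard needs at least ceil(27 / 5) = 6 shards, before counting per-shard setup overhead. To make the shard count easy to change, the total can live in the matrix instead of being hard-coded in the command. The key names shardIndex and shardTotal below are illustrative, not from the original workflow.

```yaml
e2e-tests:
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      shardIndex: [1, 2, 3, 4, 5, 6]
      shardTotal: [6]               # single value, so the matrix still yields 6 jobs
  steps:
    # checkout, dependency install, and browser install omitted
    - name: Run Playwright tests
      run: pnpm exec playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```

Changing the shard count then means editing the two matrix lists only.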

Phase 4: Monitor & Adjust

  1. Track execution times over 5-10 runs
  2. Identify unbalanced shards (>2 min variance)
  3. Adjust shard distribution if needed
  4. Set up alerts for regressions

Common Issues

Shard imbalance (one shard takes 2x longer)
  • Use manual distribution with --grep patterns
  • Group tests by duration and spread the heaviest groups across shards so each shard's total runtime is similar
Cache misses despite correct key
  • Verify that the hashFiles glob patterns match actual files
  • Check whether the lock file changes on every run (it should not)
Playwright fails after a cache hit
  • The cache restores only the browser binaries, so install system dependencies separately: pnpm exec playwright install-deps chromium
Tests fail in CI but pass locally
  • Check environment variables (CI=true may affect behavior)
  • Verify mock setup works in parallel execution
  • Increase timeouts for slow operations

Success Criteria

  • CI execution time < 15 minutes total
  • Cache hit rate > 85% for dependencies
  • Shard execution time variance < 2 minutes
  • Zero timeout failures from slow tests
