flaky-test-detective

Flaky Test Detective

Diagnose and eliminate flaky tests systematically.

Common Flaky Test Patterns

1. Timing Issues

typescript
// ❌ Flaky: Race condition
test("should load user data", async () => {
  render(<UserProfile userId="123" />);

  // Race condition - might pass or fail
  expect(screen.getByText("John Doe")).toBeInTheDocument();
});

// ✅ Fixed: Wait for element
test("should load user data", async () => {
  render(<UserProfile userId="123" />);

  await waitFor(() => {
    expect(screen.getByText("John Doe")).toBeInTheDocument();
  });
});

// ❌ Flaky: Fixed timeout
test("should complete animation", async () => {
  render(<AnimatedComponent />);
  // "animated-box" is a hypothetical test id on the animated element
  const element = screen.getByTestId("animated-box");
  await new Promise((resolve) => setTimeout(resolve, 500)); // Brittle!
  expect(element).toHaveClass("animated");
});

// ✅ Fixed: Wait for condition
test("should complete animation", async () => {
  render(<AnimatedComponent />);
  const element = screen.getByTestId("animated-box");
  await waitFor(
    () => {
      expect(element).toHaveClass("animated");
    },
    { timeout: 2000 }
  );
});
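
To see why waiting on a condition is sturdier than a fixed sleep, here is a minimal sketch of the polling idea behind helpers such as `waitFor`. It is not Testing Library's implementation; `waitUntil` and its option names are hypothetical.

```typescript
// Hypothetical helper: polls `condition` until it returns true or the timeout elapses.
interface WaitOptions {
  timeout?: number; // maximum total wait in ms
  interval?: number; // delay between checks in ms
}

function waitUntil(
  condition: () => boolean,
  { timeout = 1000, interval = 50 }: WaitOptions = {}
): Promise<void> {
  const deadline = Date.now() + timeout;
  return new Promise<void>((resolve, reject) => {
    const check = (): void => {
      if (condition()) {
        resolve(); // finish as soon as the condition holds
      } else if (Date.now() >= deadline) {
        reject(new Error(`Condition not met within ${timeout}ms`));
      } else {
        setTimeout(check, interval);
      }
    };
    check();
  });
}

// Usage: the flag flips after about 30ms, and the wait ends shortly after that
// instead of always burning a fixed delay.
let ready = false;
setTimeout(() => {
  ready = true;
}, 30);

const waited = waitUntil(() => ready, { timeout: 1000, interval: 5 });
```

The timeout acts only as an upper bound: a fast machine finishes early and a slow CI runner gets the extra time it needs.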

2. Shared State

typescript
// ❌ Flaky: Global state pollution
let userId = "123";

test("test A", () => {
  userId = "456"; // Modifies global
  // ...
});

test("test B", () => {
  expect(userId).toBe("123"); // Fails if test A runs first!
});

// ✅ Fixed: Isolated state
test("test A", () => {
  const userId = "456"; // Local variable
  // ...
});

test("test B", () => {
  const userId = "123";
  expect(userId).toBe("123");
});

// ❌ Flaky: Database not cleaned
test("should create user", async () => {
  await db.user.create({ email: "test@example.com" });
  // No cleanup!
});

test("should create another user", async () => {
  await db.user.create({ email: "test@example.com" }); // Fails! Duplicate
});

// ✅ Fixed: Proper cleanup
afterEach(async () => {
  await db.user.deleteMany();
});
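
When cleanup between tests is not practical, giving each test its own data also avoids the collision. A minimal sketch using an in-memory stand-in for the database; `InMemoryUserStore` and `uniqueEmail` are hypothetical names, not part of any library.

```typescript
// In-memory stand-in for a table with a unique constraint on email.
class InMemoryUserStore {
  private emails = new Set<string>();

  create(email: string): void {
    if (this.emails.has(email)) {
      throw new Error(`unique constraint violated: ${email}`);
    }
    this.emails.add(email);
  }

  count(): number {
    return this.emails.size;
  }
}

// Counter-based factory: unique per call, yet reproducible across runs.
let emailCounter = 0;
function uniqueEmail(prefix: string = "test"): string {
  emailCounter += 1;
  return `${prefix}-${emailCounter}@example.com`;
}

const store = new InMemoryUserStore();
store.create(uniqueEmail()); // stands in for "should create user"
store.create(uniqueEmail()); // stands in for "should create another user"
```

A counter is used rather than a random suffix so the generated values stay deterministic.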

3. Randomness

typescript
// ❌ Flaky: Random data
test("should sort users", () => {
  const users = generateRandomUsers(10); // Different each time!
  const sorted = sortUsers(users);
  expect(sorted[0].name).toBe("Alice"); // Might not be Alice
});

// ✅ Fixed: Deterministic data
test("should sort users", () => {
  const users = [
    { name: "Charlie", age: 30 },
    { name: "Alice", age: 25 },
    { name: "Bob", age: 35 },
  ];
  const sorted = sortUsers(users);
  expect(sorted[0].name).toBe("Alice");
});

// ✅ Fixed: Seeded randomness
import { faker } from "@faker-js/faker";

beforeEach(() => {
  faker.seed(12345); // Same data every time
});
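
The same idea applies when faker is not in play: any seedable generator yields reproducible data. A minimal sketch using the mulberry32 generator; `mulberry32` and `pickNames` here are illustrative helpers, not a library API.

```typescript
// mulberry32: a small seedable PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const NAMES: string[] = ["Alice", "Bob", "Charlie", "Dana"];

// Picks `count` names using the supplied random source instead of Math.random.
function pickNames(count: number, random: () => number): string[] {
  return Array.from(
    { length: count },
    () => NAMES[Math.floor(random() * NAMES.length)]
  );
}

// Two generators built from the same seed produce the same sequence.
const firstRun = pickNames(5, mulberry32(12345));
const secondRun = pickNames(5, mulberry32(12345));
```

Passing the random source as a parameter is what makes the seed controllable from the test.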

4. Network Dependencies

typescript
// ❌ Flaky: Real API call
test("should fetch users", async () => {
  const users = await fetchUsers(); // External API!
  expect(users).toHaveLength(10); // Might fail if API down
});

// ✅ Fixed: Mocked API
test("should fetch users", async () => {
  server.use(
    http.get("/api/users", () => {
      return HttpResponse.json([
        { id: "1", name: "User 1" },
        { id: "2", name: "User 2" },
      ]);
    })
  );

  const users = await fetchUsers();
  expect(users).toHaveLength(2);
});
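
Where an HTTP-level mock is more than a test needs, another option is to pass the transport in as a parameter and stub it. A sketch under that assumption; the `HttpGet` type and this version of `fetchUsers` are hypothetical and differ from the function used above.

```typescript
interface User {
  id: string;
  name: string;
}

// The transport is injected, so tests never touch the real network.
type HttpGet = (url: string) => Promise<unknown>;

async function fetchUsers(get: HttpGet): Promise<User[]> {
  const body = await get("/api/users");
  if (!Array.isArray(body)) {
    throw new Error("Unexpected response shape");
  }
  return body as User[];
}

// Stub transport returning canned data.
const stubGet: HttpGet = async (url) => {
  if (url !== "/api/users") throw new Error(`Unexpected URL: ${url}`);
  return [
    { id: "1", name: "User 1" },
    { id: "2", name: "User 2" },
  ];
};

// Stub that simulates an outage, so the error path is exercised on purpose.
const failingGet: HttpGet = async () => {
  throw new Error("ECONNREFUSED");
};

const usersPromise = fetchUsers(stubGet);
const failurePromise = fetchUsers(failingGet).catch((e: Error) => e.message);
```

The failing stub covers the case the flaky version only hit by accident: the API being unreachable.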

Flaky Test Detection Script

typescript
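// scripts/detect-flaky-tests.ts
import { execSync } from "child_process";

type Stats = { passed: number; failed: number };

// Runs the suite once and returns the reporter's JSON output.
function runTestsOnce(): string {
  try {
    // --silent keeps npm's own banner out of stdout so the output parses as JSON.
    // The reporter flag depends on the runner (Vitest: --reporter=json, Jest: --json).
    return execSync("npm test --silent -- --reporter=json", {
      encoding: "utf-8",
    });
  } catch (error: any) {
    // execSync throws when the command exits non-zero, which happens whenever
    // a test fails. The JSON report is still available on error.stdout.
    if (typeof error?.stdout === "string" && error.stdout.trim() !== "") {
      return error.stdout;
    }
    throw error;
  }
}

function detectFlakyTests(iterations: number = 10): void {
  const results = new Map<string, Stats>();

  for (let i = 0; i < iterations; i++) {
    console.log(`\nRun ${i + 1}/${iterations}`);

    try {
      const testResults = JSON.parse(runTestsOnce());

      testResults.testResults.forEach((file: any) => {
        file.assertionResults.forEach((test: any) => {
          const key = `${file.name}::${test.fullName}`;
          const stats = results.get(key) || { passed: 0, failed: 0 };

          // Skipped and pending tests are counted as neither pass nor fail
          if (test.status === "passed") {
            stats.passed++;
          } else if (test.status === "failed") {
            stats.failed++;
          }

          results.set(key, stats);
        });
      });
    } catch (error) {
      console.error("Test run produced no usable report:", error);
    }
  }

  // Analyze results
  console.log("\n🔍 Flaky Test Report\n");

  const flakyTests: string[] = [];

  results.forEach((stats, testName) => {
    // A test is flaky only if it both passed and failed across runs
    if (stats.failed > 0 && stats.passed > 0) {
      const runs = stats.passed + stats.failed;
      const failureRate = (stats.failed / runs) * 100;
      console.log(`❌ FLAKY: ${testName}`);
      console.log(`   Passed: ${stats.passed}/${runs}`);
      console.log(`   Failed: ${stats.failed}/${runs}`);
      console.log(`   Failure rate: ${failureRate.toFixed(1)}%\n`);
      flakyTests.push(testName);
    }
  });

  if (flakyTests.length === 0) {
    console.log("✅ No flaky tests detected!");
  } else {
    console.log(`\n🚨 Found ${flakyTests.length} flaky tests`);
    process.exit(1);
  }
}

detectFlakyTests(20); // Run tests 20 times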

Root Cause Analysis

typescript
// Framework for analyzing flaky tests
interface FlakyTestAnalysis {
  testName: string;
  failureRate: number;
  symptoms: string[];
  rootCause: "timing" | "state" | "randomness" | "network" | "unknown";
  recommendation: string;
}

function analyzeTest(
  testName: string,
  errorMessages: string[]
): FlakyTestAnalysis {
  const analysis: FlakyTestAnalysis = {
    testName,
    failureRate: 0,
    symptoms: [],
    rootCause: "unknown",
    recommendation: "",
  };

  // Detect timing issues
  if (
    errorMessages.some(
      (msg) => msg.includes("timeout") || msg.includes("not found")
    )
  ) {
    analysis.symptoms.push("Timeout or element not found");
    analysis.rootCause = "timing";
    analysis.recommendation =
      "Add explicit waits using waitFor() or findBy* queries";
  }

  // Detect shared state
  if (
    errorMessages.some(
      (msg) =>
        msg.includes("already exists") || msg.includes("unique constraint")
    )
  ) {
    analysis.symptoms.push("Duplicate or existing data");
    analysis.rootCause = "state";
    analysis.recommendation =
      "Add beforeEach/afterEach cleanup or use unique test data";
  }

  // Detect randomness
  if (
    errorMessages.some(
      (msg) => msg.includes("expected") && msg.includes("received")
    )
  ) {
    analysis.symptoms.push("Inconsistent values");
    analysis.rootCause = "randomness";
    analysis.recommendation =
      "Use deterministic test data or seed random generators";
  }

  // Detect network issues
  if (
    errorMessages.some(
      (msg) => msg.includes("network") || msg.includes("ECONNREFUSED")
    )
  ) {
    analysis.symptoms.push("Network or connection errors");
    analysis.rootCause = "network";
    analysis.recommendation = "Mock all network requests using MSW or similar";
  }

  return analysis;
}
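
One limitation of the function above is that each matching check overwrites `rootCause`, and the `expected`/`received` check matches most assertion failures, so it tends to win. A sketch of a variant that keeps every matching category and lists the more specific ones first; `classifyCauses` and the rule table are hypothetical.

```typescript
type RootCause = "timing" | "state" | "randomness" | "network";

interface Rule {
  cause: RootCause;
  matches: (msg: string) => boolean;
}

// Ordered from most to least specific; the generic value-mismatch rule goes last.
const rules: Rule[] = [
  {
    cause: "network",
    matches: (m) => m.includes("network") || m.includes("ECONNREFUSED"),
  },
  {
    cause: "state",
    matches: (m) =>
      m.includes("already exists") || m.includes("unique constraint"),
  },
  {
    cause: "timing",
    matches: (m) => m.includes("timeout") || m.includes("not found"),
  },
  {
    cause: "randomness",
    matches: (m) => m.includes("expected") && m.includes("received"),
  },
];

// Returns every category with at least one matching message, in rule order.
function classifyCauses(errorMessages: string[]): RootCause[] {
  return rules
    .filter((rule) => errorMessages.some((msg) => rule.matches(msg)))
    .map((rule) => rule.cause);
}

const causes = classifyCauses([
  "connect ECONNREFUSED 127.0.0.1:5432",
  "expected 10 but received 0",
]);
// causes is ["network", "randomness"]
```

Treating the first entry as the primary suspect and the rest as secondary hints keeps a generic mismatch from hiding a connection error.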

Stabilization Guidelines

typescript
// Test stability checklist
const stabilityChecklist = {
  timing: [
    "Use waitFor() instead of fixed timeouts",
    "Use findBy* queries (built-in waiting)",
    "Set appropriate timeout values",
    "Wait for loading states to disappear",
  ],
  state: [
    "Clear database before each test",
    "Reset mocks after each test",
    "Use test-specific data (unique IDs)",
    "Avoid global variables",
  ],
  randomness: [
    "Use fixed seed for random generators",
    "Use deterministic test data",
    "Avoid Date.now() - mock time instead",
    "Generate IDs deterministically",
  ],
  network: [
    "Mock all API calls",
    "Use MSW for HTTP mocking",
    "Avoid real external services",
    "Test network errors explicitly",
  ],
  parallelism: [
    "Use isolated databases per test worker",
    "Avoid port conflicts (random ports)",
    "Don't share file system state",
    "Use test.concurrent cautiously",
  ],
};
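
For the "mock time instead" item, one option that needs no test-runner features is to pass the clock in as a parameter. A sketch assuming a hypothetical `isExpired` function; with Jest or Vitest, fake timers are another route.

```typescript
// A clock is any function returning epoch milliseconds.
type Clock = () => number;

interface Session {
  createdAt: number; // epoch ms
  ttlMs: number;
}

// Production callers rely on the default; tests pass a fixed clock.
function isExpired(session: Session, now: Clock = Date.now): boolean {
  return now() - session.createdAt >= session.ttlMs;
}

const session: Session = { createdAt: 1000, ttlMs: 500 };

const justBefore = isExpired(session, () => 1499); // false: 499ms elapsed
const exactlyAt = isExpired(session, () => 1500); // true: 500ms elapsed
```

Because the boundary is checked with exact timestamps, the test does not depend on how fast the machine runs.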

Auto-Fix Patterns

typescript
// Automated fixes for common issues
// These are regex-based sketches; they only handle simple single-line cases.

// Fix 1: Add waitFor to assertions
function addWaitFor(code: string): string {
  // Replace: expect(screen.getByText('...')).toBeInTheDocument()
  // With: await waitFor(() => expect(screen.getByText('...')).toBeInTheDocument())
  // A single pattern keeps the added parentheses paired and leaves
  // toBeInTheDocument() assertions on other queries untouched.
  return code.replace(
    /expect\((screen\.getBy\w+\([^()]*\))\)\.toBeInTheDocument\(\)/g,
    "await waitFor(() => expect($1).toBeInTheDocument())"
  );
}

// Fix 2: Replace getBy with findBy
function replaceGetByWithFindBy(code: string): string {
  return code.replace(/screen\.getBy/g, "await screen.findBy");
}

// Fix 3: Add cleanup
function addCleanup(code: string): string {
  if (code.includes("afterEach")) {
    return code;
  }
  const insertPoint = code.indexOf("test(");
  if (insertPoint === -1) {
    return code; // No test found, so there is nothing to insert before
  }
  return (
    code.slice(0, insertPoint) +
    "afterEach(async () => {\n  await cleanup();\n});\n\n" +
    code.slice(insertPoint)
  );
}

Monitoring Flaky Tests in CI

yaml
# .github/workflows/test-stability.yml
name: Test Stability
on:
  schedule:
    - cron: "0 2 * * *" # Run nightly
jobs:
  stability-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"

      - run: npm ci

      - name: Run tests 20 times
        run: |
          for i in {1..20}; do
            echo "Run $i/20"
            npm test || echo "FAILED: Run $i"
          done

      - name: Analyze results
        run: npm run detect-flaky-tests

Best Practices

  1. Explicit waits: Never use sleep/timeout
  2. Clean state: Reset between tests
  3. Deterministic data: No randomness
  4. Mock external deps: APIs, time, randomness
  5. Run tests multiple times: Catch intermittent failures
  6. Isolate tests: No shared state
  7. Monitor CI: Track flaky test trends

Output Checklist

  • Common patterns identified
  • Root cause analysis performed
  • Timing issues fixed (waitFor)
  • Shared state eliminated (cleanup)
  • Randomness removed (fixed seeds)
  • Network mocked (MSW)
  • Detection script implemented
  • Stabilization guidelines documented
  • CI monitoring configured