# Test-Driven Development (TDD)

## Overview

Write tests first. Watch them fail. Write the minimal code to make them pass.

Core principle: if you don't see the test fail, you don't know whether it's testing the right thing.

Breaking the letter of the rule means breaking the spirit of the rule.
## When to Use

Always:

- New features
- Bug fixes
- Refactoring
- Behavior changes

Exceptions (consult your human partner):

- One-off prototypes
- Generated code
- Configuration files

Thinking "I'll skip TDD this time"? Stop. That's rationalization.
## Non-Negotiable Rules

**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST**

Wrote code before the test? Delete it. Start over.

No exceptions:

- Don't keep it as "reference"
- Don't "adapt" the test to fit the code
- Don't look at it
- Delete means delete

Write a new test. Period.
## Red-Green-Refactor

```dot
digraph tdd_cycle {
  rankdir=LR;
  red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
  verify_red [label="Verify fails\ncorrectly", shape=diamond];
  green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
  verify_green [label="Verify passes\nAll green", shape=diamond];
  refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
  next [label="Next", shape=ellipse];
  red -> verify_red;
  verify_red -> green [label="yes"];
  verify_red -> red [label="wrong\nfailure"];
  green -> verify_green;
  verify_green -> refactor [label="yes"];
  verify_green -> green [label="no"];
  refactor -> verify_green [label="stay\ngreen"];
  verify_green -> next;
  next -> red;
}
```
### RED - Write a Failing Test
Write a minimal test that shows what should happen.
<Good>
```typescript
test('retries failed operations 3 times', async () => {
let attempts = 0;
const operation = () => {
attempts++;
if (attempts < 3) throw new Error('fail');
return 'success';
};
const result = await retryOperation(operation);
expect(result).toBe('success');
expect(attempts).toBe(3);
});
```

Clear name, tests real behavior, focuses on one thing.
</Good>
<Bad>
```typescript
test('retry works', async () => {
const mock = jest.fn()
.mockRejectedValueOnce(new Error())
.mockRejectedValueOnce(new Error())
.mockResolvedValueOnce('success');
await retryOperation(mock);
expect(mock).toHaveBeenCalledTimes(3);
});
```

Vague name, tests mocks instead of real code.
</Bad>
Requirements:
- One behavior per test
- Clear name
- Real code (avoid mocks unless unavoidable)
### Verify RED - Watch It Fail

Mandatory. Never skip.

```bash
npm test path/to/test.test.ts
```

Confirm:

- The test fails (not errors out)
- The failure message is what you expected
- It fails because the functionality is missing (not because of typos)

Test passed? You're testing existing behavior. Fix the test.

Test errored? Fix the error, re-run until it fails correctly.
### GREEN - Minimal Code
Write the simplest code possible to pass the test.
<Good>
```typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
for (let i = 0; i < 3; i++) {
try {
return await fn();
} catch (e) {
if (i === 2) throw e;
}
}
throw new Error('unreachable');
}
```
Just enough to pass the test
</Good>
<Bad>
```typescript
async function retryOperation<T>(
fn: () => Promise<T>,
options?: {
maxRetries?: number;
backoff?: 'linear' | 'exponential';
onRetry?: (attempt: number) => void;
}
): Promise<T> {
// YAGNI
}
```
Over-engineered
</Bad>
Don't add extra features, refactor other code, or make "improvements" beyond what the test requires.
### Verify GREEN - Watch It Pass

Mandatory.

```bash
npm test path/to/test.test.ts
```

Confirm:

- The test passes
- All other tests still pass
- Output is clean (no errors, no warnings)

Test failed? Fix the code, not the test.

Other tests failed? Fix them immediately.
### REFACTOR - Clean Up

Only when all tests are green:

- Remove duplication
- Improve naming
- Extract helper functions

Keep tests green. Don't add new behavior.
### Repeat
Write the next failing test for the next feature.
## Good Tests

| Quality | Good | Bad |
|---|---|---|
| Minimal | Tests one thing. "And" in the name? Split it. | `test('validates email and domain and whitespace')` |
| Clear | Name describes the behavior | Vague name like `test('retry works')` |
| Expresses intent | Shows the desired API | Obscures how the code executes |
## Why Order Matters

### "I'll write tests later to verify it works"
Tests written after code pass immediately. Immediate passing proves nothing:
- It might test the wrong thing
- It might test the implementation, not the behavior
- It might miss edge cases you forgot
- It will never catch bugs you didn't anticipate
Test-first forces you to see the test fail, proving it actually tests something.
### "I've manually tested all edge cases"
Manual testing is ad-hoc. You think you tested everything, but:
- There's no record of what you tested
- You can't re-run it after code changes
- It's easy to forget cases under pressure
- "I tried it once and it worked" ≠ comprehensive
Automated tests are systematic. They run the same way every time.
### "Deleting X hours of work is a waste"
Sunk cost fallacy. The time is already spent. Your options now:
- Delete and rewrite with TDD (a few more hours, high confidence)
- Keep it and add tests later (30 minutes, low confidence, potential bugs)
The "waste" is keeping code you don't trust. Working code without real tests is technical debt.
### "TDD is dogmatic, pragmatism means adapting"
TDD is pragmatic:
- Catches bugs before they're committed (faster than debugging later)
- Prevents regressions (tests catch breaks immediately)
- Documents behavior (tests show how to use the code)
- Enables refactoring (change freely, tests catch breaks)
"Pragmatic" shortcuts = debugging in production = slower overall.
### "Testing after achieves the same goal - it's the spirit, not the ritual"
No. Tests-after answer "What does this do?" Test-first answers "What should this do?"
Tests-after are biased by your implementation. You test what you built, not what you needed. You verify edge cases you remembered, not the ones you missed.
Test-first forces you to discover edge cases before implementation. Tests-after only verify what you remembered (and you didn't remember everything).
Writing the tests 30 minutes after the code is not TDD. You get the insurance, but you lose the proof that the tests work.
## Common Rationalizations

| Excuse | Reality |
|---|---|
| "It's too simple to test" | Simple code breaks. Writing the test takes 30 seconds. |
| "I'll test later" | Tests written later pass immediately and prove nothing. |
| "Testing after achieves the same goal" | Tests-after = "What does this do?" Test-first = "What should this do?" |
| "I already tested manually" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is a waste" | Sunk cost fallacy. Keeping unvalidated code is technical debt. |
| "I'll keep this as reference and write the test first next time" | You'll adapt the test to fit it. Later becomes never. Delete means delete. |
| "I need to explore first" | Fine. Do your exploration, then start with TDD. |
| "Hard to test = unclear design" | Listen to the tests. Hard to test = hard to use. |
| "TDD slows me down" | TDD is faster than debugging. Pragmatism = test first. |
| "Manual testing is faster" | Manual can't prove edge cases. You'll have to re-test every change. |
| "Existing code has no tests" | You're improving it. Add tests for existing code. |
## Red Flags - Stop and Start Over
- Wrote code before the test
- Wrote tests after implementation
- Tests pass immediately
- Can't explain why a test failed
- Planning to add tests "later"
- Rationalizing "just this once"
- "I already tested manually"
- "Testing after achieves the same goal"
- "It's about the spirit not the ritual"
- "Keeping it as reference" or "adapting existing code"
- "I spent X hours, deleting is a waste"
- "TDD is dogmatic, I'm pragmatic"
- "This is different because..."
All of these mean: Delete the production code. Start over with TDD.
## Example: Bug Fix

Bug: the form accepts empty email addresses.
### RED

```typescript
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
```
### Verify RED

```bash
$ npm test
FAIL: expected 'Email required', got undefined
```
### GREEN

```typescript
function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}
```
### Verify GREEN

Run the tests; confirm the new test and all existing tests pass.

### REFACTOR

Extract validation helpers if multiple fields need them.
## Verification Checklist

Before marking work complete:

- [ ] Every new behavior has a test that was written first
- [ ] You watched each test fail, for the right reason
- [ ] You wrote the minimal code to make each test pass
- [ ] All tests pass with clean output
- [ ] No production code exists without a prior failing test

Can't check all boxes? You skipped TDD. Start over.
## When Stuck

| Problem | Solution |
|---|---|
| Don't know how to test | Write the API you want. Write the assertions first. Consult your human partner. |
| Test is too complex | Design is too complex. Simplify the interface. |
| Have to mock everything | Code is too coupled. Use dependency injection. |
| Test setup is huge | Extract helpers. Still complex? Simplify the design. |
## Debugging Integration
Found a bug? Write a failing test to reproduce it. Follow the TDD cycle. The test proves the fix works and prevents regressions.
Never fix a bug without writing a test first.
## Testing Anti-Patterns
When adding mocks or test utilities, refer to @testing-anti-patterns.md for common pitfalls:
- Testing mock behavior instead of real behavior
- Adding test-only methods to production classes
- Mocking without understanding dependencies
## Final Rule

Production code → a test exists and failed first.
Otherwise → not TDD.

No exceptions without approval from your human partner.