# Test-Driven Development (TDD)

## Overview

Write tests first. Watch them fail. Write the minimal code to make them pass.

Core principle: if you don't see the test fail, you don't know whether it's testing the right thing.

Breaking the letter of the rule means breaking the spirit of the rule.
## When to Use

Always:

- New features
- Bug fixes
- Refactoring
- Behavior changes

Exceptions (consult your human partner):

- One-off prototypes
- Generated code
- Configuration files

Thinking "I'll skip TDD this time"? Stop. That's rationalization.
## Non-Negotiable Rules

**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST**

Wrote code before the test? Delete it. Start over.

No exceptions:

- Don't keep it as "reference"
- Don't "adapt" the test to fit the code
- Don't look at it
- Delete means delete

Write a new test. Period.
## Red-Green-Refactor

```dot
digraph tdd_cycle {
  rankdir=LR;
  red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
  verify_red [label="Verify fails\ncorrectly", shape=diamond];
  green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
  verify_green [label="Verify passes\nAll green", shape=diamond];
  refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
  next [label="Next", shape=ellipse];
  red -> verify_red;
  verify_red -> green [label="yes"];
  verify_red -> red [label="wrong\nfailure"];
  green -> verify_green;
  verify_green -> refactor [label="yes"];
  verify_green -> green [label="no"];
  refactor -> verify_green [label="stay\ngreen"];
  verify_green -> next;
  next -> red;
}
```
### RED - Write a Failing Test
Write a minimal test that shows what should happen.
<Good>
```typescript
test('retries failed operations 3 times', async () => {
let attempts = 0;
const operation = () => {
attempts++;
if (attempts < 3) throw new Error('fail');
return 'success';
};
const result = await retryOperation(operation);
expect(result).toBe('success');
expect(attempts).toBe(3);
});
```

Clear name, tests real behavior, focuses on one thing.
</Good>
<Bad>
```typescript
test('retry works', async () => {
const mock = jest.fn()
.mockRejectedValueOnce(new Error())
.mockRejectedValueOnce(new Error())
.mockResolvedValueOnce('success');
await retryOperation(mock);
expect(mock).toHaveBeenCalledTimes(3);
});
```

Vague name, tests mocks instead of real code.
</Bad>
Requirements:
- One behavior per test
- Clear name
- Real code (avoid mocks unless unavoidable)
### Verify RED - Watch It Fail

Mandatory. Never skip.

```bash
npm test path/to/test.test.ts
```

Confirm:

- The test fails (not errors out)
- The failure message is what you expected
- It fails because the functionality is missing (not because of typos)

Test passed? You're testing existing behavior. Fix the test.

Test errored? Fix the error, re-run until it fails correctly.
### GREEN - Minimal Code
Write the simplest code possible to pass the test.
<Good>
```typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
for (let i = 0; i < 3; i++) {
try {
return await fn();
} catch (e) {
if (i === 2) throw e;
}
}
throw new Error('unreachable');
}
```
Just enough to pass the test
</Good>
<Bad>
```typescript
async function retryOperation<T>(
fn: () => Promise<T>,
options?: {
maxRetries?: number;
backoff?: 'linear' | 'exponential';
onRetry?: (attempt: number) => void;
}
): Promise<T> {
// YAGNI
}
```
Over-engineered
</Bad>
Don't add extra features, refactor other code, or make "improvements" beyond what the test requires.
### Verify GREEN - Watch It Pass

Mandatory.

```bash
npm test path/to/test.test.ts
```

Confirm:

- The test passes
- All other tests still pass
- Output is clean (no errors, no warnings)

Test failed? Fix the code, not the test.

Other tests failed? Fix them immediately.
### REFACTOR - Clean Up

Only when all tests are green:

- Remove duplication
- Improve naming
- Extract helper functions

Keep tests green. Don't add new behavior.
### Repeat
Write the next failing test for the next feature.
## Good Tests

| Quality | Good | Bad |
|---|---|---|
| Minimal | Tests one thing. "And" in the name? Split it. | `test('validates email and domain and whitespace')` |
| Clear | Name describes the behavior | Vague name like `test('retry works')` |
| Expresses intent | Shows the desired API | Obscures how the code executes |
## Why Order Matters

### "I'll write tests later to verify it works"
Tests written after code pass immediately. Immediate passing proves nothing:
- It might test the wrong thing
- It might test the implementation, not the behavior
- It might miss edge cases you forgot
- It will never catch bugs you didn't anticipate
Test-first forces you to see the test fail, proving it actually tests something.
### "I've manually tested all edge cases"
Manual testing is ad-hoc. You think you tested everything, but:
- There's no record of what you tested
- You can't re-run it after code changes
- It's easy to forget cases under pressure
- "I tried it once and it worked" ≠ comprehensive
Automated tests are systematic. They run the same way every time.
### "Deleting X hours of work is a waste"
Sunk cost fallacy. The time is already spent. Your options now:
- Delete and rewrite with TDD (a few more hours, high confidence)
- Keep it and add tests later (30 minutes, low confidence, potential bugs)
The "waste" is keeping code you don't trust. Working code without real tests is technical debt.
### "TDD is dogmatic, pragmatism means adapting"
TDD is pragmatic:
- Catches bugs before they're committed (faster than debugging later)
- Prevents regressions (tests catch breaks immediately)
- Documents behavior (tests show how to use the code)
- Enables refactoring (change freely, tests catch breaks)
"Pragmatic" shortcuts = debugging in production = slower overall.
### "Testing after achieves the same goal - it's the spirit, not the ritual"
No. Tests-after answer "What does this do?" Test-first answers "What should this do?"
Tests-after are biased by your implementation. You test what you built, not what you needed. You verify edge cases you remembered, not the ones you missed.
Test-first forces you to discover edge cases before implementation. Tests-after only verify what you remembered (and you didn't remember everything).
Writing the tests 30 minutes after the code is not TDD. You get the insurance, but you lose the proof that the tests work.
## Common Rationalizations

| Excuse | Reality |
|---|---|
| "It's too simple to test" | Simple code breaks. Writing the test takes 30 seconds. |
| "I'll test later" | Tests written later pass immediately and prove nothing. |
| "Testing after achieves the same goal" | Tests-after = "What does this do?" Test-first = "What should this do?" |
| "I already tested manually" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is a waste" | Sunk cost fallacy. Keeping unvalidated code is technical debt. |
| "I'll keep this as reference and write the test first next time" | You'll adapt the test to fit it. Later becomes never. Delete means delete. |
| "I need to explore first" | Fine. Do your exploration, then start with TDD. |
| "Hard to test = unclear design" | Listen to the tests. Hard to test = hard to use. |
| "TDD slows me down" | TDD is faster than debugging. Pragmatism = test first. |
| "Manual testing is faster" | Manual can't prove edge cases. You'll have to re-test every change. |
| "Existing code has no tests" | You're improving it. Add tests for existing code. |
## Red Flags - Stop and Start Over
- Wrote code before the test
- Wrote tests after implementation
- Tests pass immediately
- Can't explain why a test failed
- Planning to add tests "later"
- Rationalizing "just this once"
- "I already tested manually"
- "Testing after achieves the same goal"
- "It's about the spirit not the ritual"
- "Keeping it as reference" or "adapting existing code"
- "I spent X hours, deleting is a waste"
- "TDD is dogmatic, I'm pragmatic"
- "This is different because..."
All of these mean: Delete the production code. Start over with TDD.
## Example: Bug Fix

Bug: the form accepts empty email addresses.
### RED

```typescript
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
```
### Verify RED

```bash
$ npm test
FAIL: expected 'Email required', got undefined
```
### GREEN

```typescript
function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}
```
### Verify GREEN

Run the tests; confirm the new test and all existing tests pass.

### REFACTOR

Extract validation helpers if multiple fields need them.
## Verification Checklist

Before marking work complete:

- [ ] Every new behavior has a test that was written first
- [ ] You watched each test fail, for the right reason
- [ ] You wrote the minimal code to make each test pass
- [ ] All tests pass with clean output
- [ ] No production code exists without a prior failing test

Can't check all boxes? You skipped TDD. Start over.
## When Stuck

| Problem | Solution |
|---|---|
| Don't know how to test | Write the API you want. Write the assertions first. Consult your human partner. |
| Test is too complex | Design is too complex. Simplify the interface. |
| Have to mock everything | Code is too coupled. Use dependency injection. |
| Test setup is huge | Extract helpers. Still complex? Simplify the design. |
## Debugging Integration
Found a bug? Write a failing test to reproduce it. Follow the TDD cycle. The test proves the fix works and prevents regressions.
Never fix a bug without writing a test first.
## Testing Anti-Patterns
When adding mocks or test utilities, refer to @testing-anti-patterns.md for common pitfalls:
- Testing mock behavior instead of real behavior
- Adding test-only methods to production classes
- Mocking without understanding dependencies
## Final Rule

Production code → a test exists and failed first.
Otherwise → not TDD.

No exceptions without approval from your human partner.