Systematic Debugging

A strict debugging workflow. Use when dealing with bugs, test failures, or unexpected behavior.

Three core purposes:

Fix the cause, not the symptom.
Prevent guess-based fixes.
Lock the failure with a test before fixing.

Hard Gates

These rules have no exceptions.

Do not fix until you have a reproducible or observable state.
Do not fix until you have stated a root-cause hypothesis.
Do not fix until you have a failing test or equivalent reproduction mechanism.
Verify only one hypothesis at a time.
No "while I'm here" refactoring during a fix.
If three fix attempts fail, suspect a structural issue before applying another patch.

Violating this process is considered a debugging failure.

When To Use

Use this skill in the following situations:

When a test fails
When a bug occurs in production or locally
When a response, state, rendering, or query result differs from expectations
When investigating performance degradation, timeouts, race conditions, or intermittent failures
When something breaks again after being fixed at least once

The following excuses are not accepted:

"It looks simple, I'll just fix it directly"
"No time, let's patch it and move on"
"It's probably this, let me just change it"

Required Output Contract

When using this skill, the following items must be locked internally:

Problem statement: Define what went wrong in one sentence
Reproduction path: How to reproduce or observe the failure
Evidence: Actual observed results
Root-cause hypothesis: Why you believe this problem occurs
Failing guard: One of: failing test, reproduction script, or log verification
Fix: A single fix targeting the root cause
Verification: Reproduction path and related test results after the fix

If any of these seven items is missing, the work is not done.
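
As a rough illustration, the contract above could be tracked as a small record that reports which items are still blank. The class name `DebugRecord`, its field names, and the `missing_items` helper are assumptions made for this sketch; the skill itself does not prescribe any data structure.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DebugRecord:
    """One field per contract item; an empty string means 'not yet established'."""
    problem_statement: str = ""
    reproduction_path: str = ""
    evidence: str = ""
    root_cause_hypothesis: str = ""
    failing_guard: str = ""
    fix: str = ""
    verification: str = ""

    def missing_items(self) -> List[str]:
        """Return the names of contract items that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def is_done(self) -> bool:
        """The work is done only when none of the seven items is missing."""
        return not self.missing_items()


# Hypothetical in-progress record: only the first two items are filled in.
record = DebugRecord(
    problem_statement="Product detail API returns 500 when brand is null.",
    reproduction_path="Request product detail for a product whose brand is null",
)
print(record.missing_items())
print("done:", record.is_done())
```

Keeping the items in one place makes it harder to declare completion while, for example, the failing guard is still empty.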

Workflow

Follow the steps below in order.

Phase 1. Define The Problem

First, condense the problem.

What is the expected behavior
What is the observed behavior
What is the scope of impact
Is it always reproducible or intermittent

Output format:

Problem: <expected> but got <actual> under <condition>

Do not mix symptoms with speculation.

Good: Product detail API returns 500 when brand is null.
Bad: Serializer is broken because brand mapping seems wrong.

Phase 2. Reproduce Or Instrument

You must be able to see the failure again before fixing it.

Priority:

Reproduce with existing tests
Reproduce with a minimal integration test
Reproduce with a unit test
Observe via reproduction script or command
Observe after adding logs/instrumentation

Rules:

Make the reproduction path as small as possible.
Even if the bug is only visible in the UI, prefer reproducing at a lower layer if possible.
For intermittent failures, increase observability by adding logs, capturing inputs, timestamps, and concurrency conditions.
If reproduction fails, do not proceed to fixing — increase observability instead.
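
A minimal sketch of reproducing at a lower layer, using the "brand is null" example from Phase 1. The function `serialize_product` and its data shape are hypothetical stand-ins for whatever code sits behind the failing endpoint; the point is only that the suspect function is called directly with the smallest failing input instead of going through the UI or HTTP layer.

```python
import traceback
from typing import Optional


def serialize_product(product: dict) -> dict:
    """Hypothetical stand-in for the code under suspicion (assumes brand is a dict)."""
    return {"id": product["id"], "brand": product["brand"]["name"]}


def reproduce() -> Optional[BaseException]:
    """Call the suspect function directly with the smallest failing input.

    Returns the exception that was raised, or None if the failure did not occur.
    """
    smallest_failing_input = {"id": 1, "brand": None}
    try:
        serialize_product(smallest_failing_input)
    except Exception as exc:  # broad on purpose: we are observing, not handling
        traceback.print_exc()  # keep the full stack trace as evidence
        return exc
    return None


observed = reproduce()
print("reproduced:", type(observed).__name__ if observed else "no failure")
```

If `reproduce()` returns `None`, the failure has not been reproduced and the next step is more observability, not a fix.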

What to do when reproduction is not possible:

Record input values
Check for environment differences
Check recent changes
Add logs at boundary points
Search for smaller conditions that produce the same symptom

Phase 3. Gather Evidence

Collect only observable facts.

Always check:

Full error messages and stack traces
Failing input values
Recently changed files or commits
Environment/configuration differences
Call paths and data flow

For multi-component problems, check at each boundary.

Examples:

controller -> application -> service -> repository
client -> API -> external service
scheduler -> batch service -> database

At each boundary, check:

What came in
What went out
What values were transformed
Under what conditions it breaks

Do not fix until you have pinpointed the problem location.
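
One possible way to check each boundary is a small logging decorator that records what came in, what went out, and what was raised. The decorator name `trace_boundary` and the two-layer `service -> repository` example are assumptions for illustration only.

```python
import functools
import logging
from typing import Any, Callable

logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(message)s")
log = logging.getLogger("boundary")


def trace_boundary(name: str) -> Callable:
    """Log inputs, outputs, and exceptions at a component boundary."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            log.debug("%s IN  args=%r kwargs=%r", name, args, kwargs)
            try:
                result = func(*args, **kwargs)
            except Exception:
                log.exception("%s RAISED", name)  # logs the stack trace too
                raise  # never swallow: the caller must still see the failure
            log.debug("%s OUT %r", name, result)
            return result
        return wrapper
    return decorator


@trace_boundary("repository")
def load_product(product_id: int) -> dict:
    # Hypothetical repository: product 2 has no brand.
    return {"id": product_id, "brand": None if product_id == 2 else {"name": "Acme"}}


@trace_boundary("service")
def product_detail(product_id: int) -> dict:
    product = load_product(product_id)
    return {"id": product["id"], "brand": product["brand"]["name"]}


product_detail(1)
try:
    product_detail(2)
except TypeError:
    pass  # expected here; the log shows which boundary produced the None
```

Reading the log for product 2, the repository boundary returns a `None` brand and the service boundary raises, which narrows the problem location to the service's handling of that value.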

Phase 4. Isolate Root Cause

Formulate exactly one cause candidate.

Format:

Hypothesis: <root cause> because <evidence>

Qualities of a good hypothesis:

Points to a single cause
Connects to observed evidence
Can be disproved with a small experiment

Examples of bad hypotheses:

"There seems to be some async issue somewhere"
"The whole serialization layer seems unstable"

Trace the cause back to the source. If the error appears deep in the stack, trace the origin of the input, not the symptom.
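
A hypothesis that "can be disproved with a small experiment" might be checked as sketched below: two inputs that differ only in the suspected field, so any difference in outcome can be attributed to it. `serialize_product` is again a hypothetical suspect function, and the helper `outcome` is introduced only for this example.

```python
from typing import Dict, Optional


def serialize_product(product: dict) -> dict:
    """Hypothetical suspect function, same shape as the failing code path."""
    return {"id": product["id"], "brand": product["brand"]["name"]}


def outcome(product: dict) -> Optional[str]:
    """Return the exception type name raised for this input, or None on success."""
    try:
        serialize_product(product)
    except Exception as exc:
        return type(exc).__name__
    return None


# Hypothesis: serialization fails because brand is None.
# The two inputs differ only in the brand field.
control = {"id": 1, "brand": {"name": "Acme"}}
variant = {"id": 1, "brand": None}

results: Dict[str, Optional[str]] = {
    "control": outcome(control),
    "variant": outcome(variant),
}

# The hypothesis survives only if the control succeeds and the variant fails.
hypothesis_survives = results["control"] is None and results["variant"] is not None
print(results, "hypothesis survives:", hypothesis_survives)
```

If the control input also failed, the hypothesis would be disproved and a different single cause would need to be proposed.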

Phase 5. Lock The Failure

Lock the failure before fixing.

Priority:

Automated failing test
Add a regression case to existing tests
Minimal reproduction script
Temporary verification via logs/assertions

Rules:

Create an automated test whenever possible.
It must fail before the fix.
It must pass on the same path after the fix.
The test name must reveal what broke.

If an automated test is feasible, use the test-driven-development skill alongside this one.

Phase 6. Implement A Single Fix

The fix addresses only one hypothesis.

Allowed:

Minimal code change that directly addresses the cause
Minimal supporting changes needed for verification

Forbidden:

Bundling multiple seemingly related fixes
Combining refactoring with the fix
Sneaking in formatting/cleanup/renaming
Adding null-guards without evidence
Swallowing exceptions

If the fix fails, the hypothesis behind it was wrong. Return immediately to Phase 1 or Phase 3.
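
The sketch below contrasts a single, cause-directed change with a forbidden one. It assumes the evidence gathered earlier showed that a null brand is valid data, so the serializer's assumption was the root cause; without that evidence, the same null check would fall under "adding null-guards without evidence". All names are hypothetical.

```python
from typing import Optional


def serialize_product(product: dict) -> dict:
    """Hypothetical fixed version.

    Assumed evidence: the schema allows brand to be null, so the serializer
    (not the caller) was wrong to assume a dict.
    """
    brand: Optional[dict] = product["brand"]
    brand_name = brand["name"] if brand is not None else None
    return {"id": product["id"], "brand": brand_name}


# Forbidden alternative, shown only for contrast: it hides every failure,
# including ones unrelated to the hypothesis.
#
# def serialize_product(product):
#     try:
#         return {"id": product["id"], "brand": product["brand"]["name"]}
#     except Exception:
#         return {}

print(serialize_product({"id": 1, "brand": None}))
print(serialize_product({"id": 2, "brand": {"name": "Acme"}}))
```

The fix touches only the line tied to the hypothesis. A product with no `id` still raises `KeyError`, because that failure was never part of the evidence.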

Phase 7. Verify And Close

All of the following must be satisfied before closing:

The original reproduction path no longer fails.
The new failing guard passes.
Related tests are not broken.
You can explain that the fix blocks the cause, not the symptom.

For intermittent bugs, a single pass is not enough. Verification under repeated runs or varying conditions is required.
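
For intermittent bugs, verification under repeated runs could look like the sketch below. The helper `count_failures` and the lock-protected counter check are illustrative assumptions; the real check would be the original reproduction path.

```python
import threading
from typing import Callable


def count_failures(check: Callable[[], bool], runs: int) -> int:
    """Run check() `runs` times and return how many runs returned False."""
    return sum(1 for _ in range(runs) if not check())


def concurrent_increment_is_exact(workers: int = 8, per_worker: int = 1000) -> bool:
    """Hypothetical check: a lock-protected counter must not lose updates."""
    counter = 0
    lock = threading.Lock()

    def work() -> None:
        nonlocal counter
        for _ in range(per_worker):
            with lock:  # the assumed fix: serialize the read-modify-write
                counter += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter == workers * per_worker


failures = count_failures(concurrent_increment_is_exact, runs=50)
print(f"{failures} failures in 50 runs")
```

A clean run count is evidence, not proof. Varying the worker count or the number of runs strengthens it.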

Stop Conditions

Stop and reframe in the following situations.

1. Reproduction Failed

If reproduction fails after multiple attempts:

Check if observability is insufficient.
Check if there are environment differences.
Check if the problem definition is wrong.

Changing code without reproduction is forbidden.

2. Three Failed Fixes

If three consecutive fixes miss the mark, conclude:

The current understanding is wrong, or
The problem is likely structural (shared state, boundary design, or separation of responsibilities)

From this point, a "fourth patch" is not the answer — a structural discussion is needed.
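
One way to make this stop condition hard to ignore is to track failed hypotheses and refuse a further patch once the limit is reached. The names `FixAttemptLog` and `StructuralReviewRequired`, and the sample hypotheses, are invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import List

MAX_FAILED_FIXES = 3  # threshold taken from the rule above


class StructuralReviewRequired(RuntimeError):
    """Raised when another patch is requested after too many failed fixes."""


@dataclass
class FixAttemptLog:
    failed_hypotheses: List[str] = field(default_factory=list)

    def record_failure(self, hypothesis: str) -> None:
        """Remember a hypothesis whose fix did not resolve the problem."""
        self.failed_hypotheses.append(hypothesis)

    def ensure_patch_allowed(self) -> None:
        """Raise instead of allowing a fourth patch."""
        if len(self.failed_hypotheses) >= MAX_FAILED_FIXES:
            tried = "; ".join(self.failed_hypotheses)
            raise StructuralReviewRequired(
                f"{len(self.failed_hypotheses)} fixes failed ({tried}); "
                "discuss structure before patching again"
            )


log = FixAttemptLog()
for hypothesis in ("stale cache", "wrong timezone", "missing retry"):
    log.ensure_patch_allowed()  # allowed for attempts one to three
    log.record_failure(hypothesis)

try:
    log.ensure_patch_allowed()  # the fourth request is refused
except StructuralReviewRequired as exc:
    print("stop:", exc)
```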

3. No Failing Guard

If you cannot create a failing test or equivalent reproduction mechanism, do not declare completion. At minimum, leave behind the reproduction command and observed results.

Red Flags

If any of the following thoughts arise, stop immediately and return to an earlier phase.

"I'll just change this one line and it should work"
"I'll check the logs later, let me fix it first"
"I'll add the test later"
"Let me fix this and that together at once"
"The error is gone, so I don't need to know the cause"

Minimal Checklist

Use this checklist for self-verification during execution.

Defined the problem in one sentence
Reproduced or made the failure observable
Collected evidence
Created a single root-cause hypothesis
Created a failing guard before fixing
Applied only a single fix
Verified via the same path after fixing

Completion Standard

The completion criterion for this skill is not "the code changed."

Completion criteria:

The problem definition is clear
The failure was locked before fixing
The fix is connected to the root cause
The verification results are recorded

Without these four, debugging is not finished.
