Code to Catalog
Turn a codebase into EventCatalog documentation through a guided, evidence-based interview. Works for two situations:
- No catalog yet — document an unfamiliar or undocumented codebase from scratch.
- Existing catalog — reconcile the catalog with the current code (add new resources, flag drift, surface stale entries).
This skill does not write catalog files itself. It produces a plan file that captures the agreed architectural model, then hands off to the catalog-documentation-creator skill to generate the actual documentation.
How this skill works
The skill runs in six phases. Follow them in order — later phases depend on earlier ones.
- Locate & inventory: find the code directory and any existing catalog
- Discovery scan: read the code, form a hypothesis
- Reconcile with existing catalog: categorize findings as new / update / unchanged / investigate
- Tiered grilling: interview the user on structural decisions only
- Produce the plan file: write the plan, get approval
- Handoff: ask whether to generate the catalog now or stop at the plan
Conversational style (applies throughout)
- One question at a time. Never batch questions. The user answers, then move on.
- Always provide a recommended answer. Every question includes what you think is true, with evidence (file path + line number). The user confirms, corrects, or overrides.
- Cite the code. When you present a finding, point at the file and line where you saw it, e.g., src/orders/handlers.ts:18. The user should be able to verify without trusting you.
- Be honest about uncertainty. If the code does not tell you whether something is an event or a command, say so. Do not guess silently.
- Surface conflicts, do not pick silently. When catalog and code disagree, the user decides. Never overwrite without confirmation.
- Respect the user's time. Grilling is tiered on purpose — structural decisions only. Do not grill on per-resource fields (summaries, owners, schemas).
- No catalog deletions. Resources in the catalog that you cannot find in code are flagged — never removed automatically.
Phase 1: Locate & inventory
Find the codebase
Ask the user: "Which code directory should I analyze?"
Verify the directory exists and looks like a code project (has a package manifest or build file, source directories, etc.). If the directory is ambiguous (e.g., a monorepo), confirm the scope: the whole repo, or a specific subdirectory.
Find the catalog
Ask the user: "Do you already have an EventCatalog project, or do you want to start fresh?"
If they already have one:
- Ask for the path. Verify it's an EventCatalog project by checking for its config file or the standard resource directories (such as domains, services, and events).
- Build an inventory of what already exists. If the EventCatalog MCP server is connected, use its tools to list resources. Otherwise read the filesystem directly and parse the frontmatter of each resource file.
- Record the identifying fields for each resource (such as ID, name, version, and type) and, for services, the sends / receives relationships.
- Note the catalog's conventions: nested (domains/X/services/Y/events/Z) vs flat, PascalCase vs kebab-case IDs, existing owners, schema formats in use.
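When no MCP server is available, the filesystem inventory described above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the file names index.md / index.mdx, the helper names, and the simplified frontmatter parser (top-level scalar fields only, no nested YAML) are all assumptions.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Return top-level `key: value` scalars from a leading --- block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Skip nested or list lines; only unindented scalars are kept.
        if line.startswith((" ", "\t", "-")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip().strip("'\"")
        if value:
            fields[key.strip()] = value
    return fields


def build_inventory(catalog_root: str) -> list[dict[str, str]]:
    """Collect frontmatter from every index.md / index.mdx under the root."""
    root = Path(catalog_root)
    inventory = []
    for path in sorted(root.rglob("index.md*")):
        if path.suffix not in {".md", ".mdx"}:  # assumed file names
            continue
        fields = parse_frontmatter(path.read_text(encoding="utf-8"))
        if fields:
            fields["path"] = path.relative_to(root).as_posix()
            inventory.append(fields)
    return inventory
```

Relationships such as sends / receives are nested YAML, so a real inventory would need a full YAML parser; the sketch deliberately leaves them out.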
If they do not have a catalog:
- That's fine. Note that scaffolding will happen at handoff time through catalog-documentation-creator (which runs npx @eventcatalog/create-eventcatalog@latest <name> --empty).
- Phase 3 (reconciliation) becomes a no-op: everything discovered will be new.
Phase 2: Discovery scan
Read the codebase and form a hypothesis. Do not show the user your findings yet — you'll present them as questions in Phase 4, backed by evidence.
For detailed detection heuristics per language/framework (Node.js, Python, Go, Java, .NET), see the accompanying heuristics reference file. Read that file now if the codebase uses a stack you need guidance on.
Detect:
Project structure
- Monorepo vs single service (look for workspace configuration files, or multiple top-level service directories with their own manifests).
- Language and framework per service.
- Build/deploy units (Dockerfiles, Helm charts, deployment stacks, k8s manifests).
Service boundaries
A service is an independently-deployable, independently-ownable unit. Signals:
- Separate package with its own manifest
- Separate Dockerfile / deployment config
- Its own entrypoint (a main module, server start script, or similar)
- Consumed by others over a network boundary (HTTP, message bus)
When in doubt, mark as a candidate and grill the user in Phase 4.
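As a rough illustration of the first two signals, the sketch below scans the immediate subdirectories of a repo and records which ones carry a manifest or a Dockerfile. The manifest names, the function name, and the confidence rule are assumptions made for the example; entrypoints and network boundaries still need a read of the code.

```python
from pathlib import Path

# Assumed manifest names; extend for the stacks present in the repo.
MANIFESTS = {"package.json", "pyproject.toml", "go.mod", "pom.xml"}


def service_candidates(repo_root: str) -> list[dict]:
    """List immediate subdirectories that show service-like signals."""
    candidates = []
    for child in sorted(Path(repo_root).iterdir()):
        if not child.is_dir() or child.name.startswith("."):
            continue
        names = {entry.name for entry in child.iterdir()}
        signals = []
        if names & MANIFESTS:
            signals.append("manifest")
        if "Dockerfile" in names:
            signals.append("dockerfile")
        if signals:
            candidates.append({
                "name": child.name,
                "signals": signals,
                # Both signals together are stronger than either alone.
                "confidence": "high" if len(signals) == 2 else "medium",
            })
    return candidates
```

Anything returned with medium confidence is a natural item to raise with the user in Phase 4 rather than to accept silently.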
Messages (events, commands, queries)
Candidates come from:
- Naming patterns: past-tense names such as OrderPlaced or PaymentRefunded (likely events); imperative names such as PlaceOrder or CancelOrder (likely commands); read-oriented names (likely queries).
- Message bus clients: Kafka, RabbitMQ, NATS, AWS SNS/SQS/EventBridge, GCP PubSub, Azure Service Bus.
- Schema files: JSON Schema, Avro, Protobuf. These are strong signals of a message contract.
- DTO / type definitions: especially if they look like payloads (flat, data-only, named after a domain event).
Classify each candidate as event, command, or query based on evidence:
- Event: past tense, published to a topic/exchange, multiple consumers possible, no direct reply expected.
- Command: imperative, sent to one specific handler, expects to be processed.
- Query: read-only, expects a response.
If evidence is ambiguous (common), mark it as uncertain and grill in Phase 4 — do not silently pick.
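One way to keep the classification honest is to let each piece of structural evidence vote and fall back to uncertain whenever the votes disagree or are absent. The sketch below is only an illustration of that idea; the field names and the voting rules are assumptions, and naming evidence (past tense vs imperative) is intentionally left to human judgment.

```python
from dataclasses import dataclass


@dataclass
class MessageEvidence:
    name: str
    published_to_topic: bool = False  # sent to a topic/exchange
    single_handler: bool = False      # routed to one specific handler
    expects_response: bool = False    # caller waits for a reply
    read_only: bool = False           # handler does not change state


def classify(ev: MessageEvidence) -> str:
    """Return event, command, query, or uncertain."""
    votes = set()
    if ev.read_only and ev.expects_response:
        votes.add("query")
    if ev.published_to_topic and not ev.expects_response:
        votes.add("event")
    if ev.single_handler and not ev.read_only:
        votes.add("command")
    # Zero votes or conflicting votes both mean: ask the user.
    return votes.pop() if len(votes) == 1 else "uncertain"
```

A message that is both published to a topic and routed to a single handler comes back as uncertain, which is exactly the case to grill in Phase 4.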
Channels
Anywhere messages flow through named infrastructure:
- Kafka topics (string literals passed to producer.send({ topic: '...' }))
- RabbitMQ queues/exchanges
- SNS topics, SQS queues
- HTTP endpoints for query services
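For the Kafka case, topic names passed as string literals can be picked up with a simple line-by-line scan that also records the file:line evidence needed later. This is a hedged sketch: the regular expression and the function name are assumptions, and it will miss topics built from constants or configuration.

```python
import re

# Matches topic: '...' or topic: "..." with a literal string value.
TOPIC_PATTERN = re.compile(r"""topic\s*:\s*(['"])([^'"]+)\1""")


def find_topic_literals(source: str, file_path: str) -> list[dict]:
    """Return channel candidates with file:line evidence."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TOPIC_PATTERN.finditer(line):
            hits.append({
                "channel": match.group(2),
                "evidence": f"{file_path}:{line_no}",
            })
    return hits
```

Topics that the scan cannot resolve to a literal are better presented to the user as open questions than guessed.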
Domains (candidates)
Strong signals:
- Top-level folder grouping
- Bounded-context hints in READMEs, module docs
- Package namespaces
- Ownership files
Do not guess domains with low confidence. If unclear, propose "single domain = whole codebase" and let the user split in Phase 4.
Containers
Databases, caches, queues referenced in config, env vars, or client instantiation:
- Postgres / MySQL / SQLite / Mongo / DynamoDB / Cassandra
- Redis / Memcached
- S3 buckets, GCS buckets
Output of Phase 2
An internal draft map:
domains:
  - name: <candidate>
    confidence: high|medium|low
    services: [...]
services:
  - name: <candidate>
    path: <dir>
    sends: [...]
    receives: [...]
    channels: [...]
    containers: [...]
messages:
  - name: <candidate>
    classification: event|command|query|uncertain
    evidence: <file:line>
    producer: <service>
    consumer: <service or unknown>
channels: [...]
containers: [...]
Hold this map internally. You'll use it to drive Phase 3 and Phase 4.
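If it helps to reason about the draft map as a data structure, it could be modelled along these lines. The class and method names are illustrative assumptions; only the field names come from the map above.

```python
from dataclasses import dataclass, field


@dataclass
class DomainCandidate:
    name: str
    confidence: str  # "high" | "medium" | "low"
    services: list[str] = field(default_factory=list)


@dataclass
class ServiceCandidate:
    name: str
    path: str
    sends: list[str] = field(default_factory=list)
    receives: list[str] = field(default_factory=list)
    channels: list[str] = field(default_factory=list)
    containers: list[str] = field(default_factory=list)


@dataclass
class MessageCandidate:
    name: str
    classification: str  # "event" | "command" | "query" | "uncertain"
    evidence: str        # "<file>:<line>"
    producer: str
    consumer: str = "unknown"


@dataclass
class DraftMap:
    domains: list[DomainCandidate] = field(default_factory=list)
    services: list[ServiceCandidate] = field(default_factory=list)
    messages: list[MessageCandidate] = field(default_factory=list)
    channels: list[str] = field(default_factory=list)
    containers: list[str] = field(default_factory=list)

    def uncertain_messages(self) -> list[MessageCandidate]:
        """Messages that must be raised with the user in Phase 4."""
        return [m for m in self.messages if m.classification == "uncertain"]
```

Keeping the evidence string on every message means each Phase 4 question can cite its source without a second scan.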
Phase 3: Reconcile with existing catalog
Skip this phase if there is no existing catalog: everything discovered is new.
For each item in the draft map, find the matching resource in the catalog inventory (match by ID, then by name, then by fuzzy match on name + type). Assign a status:
| Status | Meaning |
|---|---|
| unchanged | Resource exists in catalog and matches what's in code. No user grilling needed. |
| update | Resource exists in catalog, but the code has drifted: new messages emitted, renamed fields, changed schemas, new sends/receives relationships. |
| new | Found in code, not in catalog. Candidate for a new resource. |
| investigate | Exists in catalog, not found in code. Possibly stale, possibly removed, possibly in a different repo. Never delete, only surface. |
For update items, capture what specifically drifted:
- A service's catalog entry lists one message under sends, and the code also sends another → drift: new send.
- A schema field was renamed → drift: schema change.
This categorization drives Phase 4: you only grill on new, ambiguous update, and investigate items. Resources marked unchanged are silent.
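The matching order and the four statuses can be sketched as follows. This is an illustrative simplification, not the skill's implementation: it assumes every item carries id, name, and type keys, uses a hypothetical similarity threshold of 0.85, and treats only a change in sends as drift.

```python
from difflib import SequenceMatcher


def find_match(item: dict, inventory: list[dict], threshold: float = 0.85):
    """Match by ID, then by name, then by fuzzy name within the same type."""
    for res in inventory:
        if res["id"] == item["id"]:
            return res
    for res in inventory:
        if res["name"].lower() == item["name"].lower():
            return res
    best, best_score = None, threshold
    for res in inventory:
        if res["type"] != item["type"]:
            continue
        score = SequenceMatcher(
            None, res["name"].lower(), item["name"].lower()
        ).ratio()
        if score >= best_score:
            best, best_score = res, score
    return best


def reconcile(draft: list[dict], inventory: list[dict]) -> dict[str, str]:
    """Assign new / update / unchanged / investigate to each resource ID."""
    statuses: dict[str, str] = {}
    matched_ids = set()
    for item in draft:
        match = find_match(item, inventory)
        if match is None:
            statuses[item["id"]] = "new"
            continue
        matched_ids.add(match["id"])
        drifted = set(item.get("sends", [])) != set(match.get("sends", []))
        statuses[item["id"]] = "update" if drifted else "unchanged"
    # Catalog entries with no counterpart in code are surfaced, never deleted.
    for res in inventory:
        if res["id"] not in matched_ids:
            statuses[res["id"]] = "investigate"
    return statuses
```

A fuzzy match is itself a judgment call, so a match found only in the third step is a reasonable thing to confirm with the user before treating the item as an update.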
Phase 4: Tiered grilling
Grill only on structural decisions. Per-resource details (summary text, owner names, schema fields, badge styles) are not your concern; catalog-documentation-creator handles them.
For the full question bank with recommended-answer templates, see the accompanying question bank reference file. Read it now.
Walk the topics in this order. Dependencies flow downward — resolve earlier topics before later ones.
Topic 1: Domains & boundaries
Ask about domain groupings first, because service placement depends on it.
- If you detected clear domain candidates: present each with its recommended services and ask the user to confirm.
- If you detected none: propose "single domain for the whole codebase" and ask whether they want to split.
- For any ambiguous service ("does <service> belong to <domain> or its own domain?"), grill it.
Topic 2: Service boundaries
- Confirm each service candidate. Present its evidence (path, entrypoint, Dockerfile).
- For ambiguous cases ("this module could be its own service or part of another"), grill with a recommended answer.
- Handle monorepo edge cases: is the shared package a service? (Probably not — it's infrastructure.)
Topic 3: Message classification
This is the most commonly-wrong call. Grill it hard.
For every message marked uncertain in discovery:
"I found <message> at src/orders/handlers.ts:18. It's consumed from a queue and there's no reply path, so I'd classify it as an event. Agree?"
For messages with high-confidence classification, present them for quick bulk confirmation:
"I identified these as events (based on past-tense naming + pub/sub pattern): <message>, <message>, <message>. Any you'd reclassify?"
Topic 4: Drift reconciliation (only if existing catalog)
"Catalog says <service> sends <message>. Code also emits <new message> at src/payments/refund.ts:24. Add <new message> to the service's sends? I'd say yes."
"Catalog has <message> (event, v0.0.3). I could not find it in the code. It might live in another repo, or it might be removed. I'll flag it as investigate in the plan, and you can decide later. OK?"
Do not propose deletions. Only surface.
What NOT to grill on
- Summary text for each resource
- Owner / team assignments (unless discovered from ownership files and ambiguous)
- Schema field-level detail
- Badges, visual customizations
- Flow diagrams (those are handled by a separate skill)
Those pass through to catalog-documentation-creator with sensible defaults.
Watch for interview fatigue
If the grilling is running long (say, more than ~15 questions):
- Summarize progress: "Here's what we've agreed so far — domains X, Y; services A, B, C; 12 messages classified."
- Offer to pause at the plan file: "We can stop here, write the plan, and pick up details later."
Phase 5: Produce the plan file
Write the plan to a file. Default location: root of the code directory. Ask the user if they'd like it elsewhere.
Use this exact structure:
# Catalog Plan
**Generated:** YYYY-MM-DD
**Codebase:** /absolute/path/to/repo
**Existing catalog:** /absolute/path/to/catalog (or "none — will be created")
## Summary
<1–2 paragraph narrative of what was found and the agreed architectural model>
## Domains
- **Orders** (status: new)
  - Services: OrderService, ShippingService
  - Rationale: both deal with the order lifecycle; confirmed with user
- **Payments** (status: unchanged)
  - Services: PaymentService
## Services
### OrderService (status: new)
- Domain: Orders
- Path in code: /services/orders
- Receives: PlaceOrder (command)
- Sends: OrderPlaced (event), OrderCancelled (event)
- Channels: orders.commands, orders.events
- Containers: orders-db (postgres)
### PaymentService (status: update)
- Domain: Payments
- Path in code: /services/payments
- Drift: code emits new PaymentRefunded event not in catalog
- Sends (after update): PaymentProcessed, PaymentRefunded
## Messages
- **OrderPlaced** (event, status: new) — emitted by OrderService via orders.events
- **OrderCancelled** (event, status: new) — emitted by OrderService via orders.events
- **PlaceOrder** (command, status: new) — handled by OrderService
- **PaymentRefunded** (event, status: new) — emitted by PaymentService
- **LegacyPaymentConfirmed** (event, status: investigate) — in catalog, not found in code
## Channels
- orders.events (Kafka topic, status: new)
- orders.commands (Kafka topic, status: new)
## Containers
- orders-db (postgres, status: new) — used by OrderService
## Open decisions / rationale
- Classified `CancelOrder` as a command (user confirmed — it expects a handler)
- Kept Shipping as a sub-area of Orders rather than splitting (user preference)
- Flagged `LegacyPaymentConfirmed` for manual review — may live in a different repo
## Next step
Run `catalog-documentation-creator` with this plan to generate resources marked `new` or `update`.
Status values (use exactly these): new, update, unchanged, investigate.
After writing, show the plan to the user and ask for explicit approval: "Here's the plan. Does this match what we agreed? Anything to add, change, or remove before we proceed?"
Loop on edits until the user approves.
Phase 6: Handoff
Once the plan is approved, ask:
"Generate the catalog now, or stop here with just the plan?"
If generate now:
- Invoke the catalog-documentation-creator skill, passing the plan file path.
- Tell it to only create/update resources flagged new or update. Skip unchanged. Report investigate items to the user as a list; do not auto-handle them.
- If the user has no existing catalog, catalog-documentation-creator will scaffold one first.
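The handoff rule above amounts to partitioning plan items by status. A minimal sketch, assuming each plan item is a dictionary with hypothetical name and status keys:

```python
def partition_for_handoff(plan_items: list[dict]) -> dict[str, list[str]]:
    """Split plan items into what to generate, skip, and report."""
    result: dict[str, list[str]] = {"generate": [], "skip": [], "report": []}
    for item in plan_items:
        status = item["status"]
        if status in {"new", "update"}:
            result["generate"].append(item["name"])
        elif status == "unchanged":
            result["skip"].append(item["name"])
        elif status == "investigate":
            # Listed for the user only; never created, changed, or deleted.
            result["report"].append(item["name"])
        else:
            raise ValueError(f"unknown status for {item['name']}: {status!r}")
    return result
```

Raising on an unknown status keeps a typo in the plan from silently dropping a resource.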
If stop here:
- Confirm the plan file location.
- Tell the user how to resume: "When you're ready, run catalog-documentation-creator and point it at this plan file. It will generate the new and update resources."
After handoff
Let the user know:
- Where the plan was saved.
- (If generated) which resources were created/updated.
- Which items need manual review.
- That they can run the flow documentation skill next if they want to document business flows across the catalog.
Quality checklist
Before finishing, verify:
- The plan file exists at the agreed path.
- Every domain, service, message, channel, and container has a status: new / update / unchanged / investigate.
- Every update item lists what specifically drifted.
- Every investigate item is flagged, not deleted.
- Message classifications (event / command / query) were either confirmed by the user or clearly recommended with evidence.
- Service-to-domain mapping is explicit for every service.
- No per-resource grilling happened (summaries, owners, schemas; those are for catalog-documentation-creator).
- If handing off, catalog-documentation-creator has the plan path and instructions to skip unchanged.
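Several of these checks are mechanical and could be run over the plan's entries before asking for approval. The sketch below covers three of them; the dictionary keys (name, kind, status, drift, domain) are hypothetical, and checks that depend on the conversation itself cannot be automated this way.

```python
VALID_STATUSES = {"new", "update", "unchanged", "investigate"}


def check_plan(items: list[dict]) -> list[str]:
    """Return a list of human-readable problems; empty means all clear."""
    problems = []
    for item in items:
        name = item.get("name", "<unnamed>")
        status = item.get("status")
        if status not in VALID_STATUSES:
            problems.append(f"{name}: invalid or missing status {status!r}")
        if status == "update" and not item.get("drift"):
            problems.append(f"{name}: update item does not say what drifted")
        if item.get("kind") == "service" and not item.get("domain"):
            problems.append(f"{name}: service has no domain")
    return problems
```

An empty result only means the plan is structurally complete; the user's explicit approval is still required.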