Data Structure Protocol (DSP)
LLM coding agents lose context between tasks. On large codebases they spend most of their tokens on "orientation": figuring out where things live, what depends on what, and what is safe to change. DSP addresses this by externalizing the project's structural map into a persistent, queryable graph stored in a `.dsp/` directory next to the code.
DSP is NOT documentation for humans and NOT an AST dump. It captures three things: meaning (why an entity exists), boundaries (what it imports and exposes), and reasons (why each connection exists). This is enough for an agent to navigate, refactor, and generate code without loading the entire source tree into the context window.
When to Use
Use this skill when:
- The project has a `.dsp/` directory (DSP is already set up)
- The user asks to set up DSP, bootstrap, or map a project's structure
- Creating, modifying, or deleting code files in a DSP-tracked project (to keep the graph updated)
- Navigating project structure, understanding dependencies, or finding specific modules
- The user mentions DSP, dsp-cli, `.dsp`, or structure mapping
- Performing impact analysis before a refactor or dependency replacement
Core Concepts
Code = graph
DSP models the codebase as a directed graph. Nodes are entities, edges are imports and shared/exports.
Two entity kinds exist:
- Object: any "thing" that isn't a function (module/file/class/config/resource/external dependency)
- Function: an exported function/method/handler/pipeline
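The graph model above can be pictured with a small in-memory sketch. This is only an illustration of the idea, not how `dsp-cli.py` stores anything: the `Entity` and `Kind` names and the `dependencies` helper are hypothetical, and the UIDs are borrowed from the examples in this document.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    OBJECT = "object"      # module, file, class, config, resource, external dependency
    FUNCTION = "function"  # exported function, method, handler, pipeline


@dataclass
class Entity:
    uid: str
    kind: Kind
    imports: set[str] = field(default_factory=set)  # outgoing edges: UIDs this entity depends on
    shared: set[str] = field(default_factory=set)   # UIDs this entity exposes as public API


# Hypothetical three-node graph: a module that exposes one function
# and depends on one other object.
graph = {
    "obj-a1b2c3d4": Entity(
        "obj-a1b2c3d4", Kind.OBJECT,
        imports={"obj-deadbeef"}, shared={"func-7f3a9c12"},
    ),
    "func-7f3a9c12": Entity("func-7f3a9c12", Kind.FUNCTION),
    "obj-deadbeef": Entity("obj-deadbeef", Kind.OBJECT),
}


def dependencies(uid: str) -> list[str]:
    """Return the UIDs that `uid` imports, sorted for stable output."""
    return sorted(graph[uid].imports)
```

Keeping edges as sets of UIDs rather than object references mirrors the protocol's emphasis on identity by UID: an edge stays valid even if the file behind a node moves.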
Identity by UID, not by file path
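Every entity gets a stable UID: an `obj-` prefixed ID for objects (for example `obj-a1b2c3d4`) and a `func-` prefixed ID for functions (for example `func-7f3a9c12`). File paths are attributes that can change; UIDs survive renames, moves, and reformatting.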
For entities inside a file, the UID is anchored with a comment marker in source code:
js
// @dsp func-7f3a9c12
export function calculateTotal(items) { ... }
python
# @dsp obj-e5f6g7h8
class UserService:
    ...
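As a rough illustration of how such markers could be located, the sketch below scans source text for `@dsp` comments. The regular expression is an assumption inferred from the two examples above (a `//` or `#` comment followed by `@dsp` and an `obj-`/`func-` UID); the real CLI may recognize markers differently, and `find_markers` is a hypothetical helper.

```python
import re

# Assumed marker shape, based only on the two examples above:
# a // or # comment containing "@dsp" followed by an obj-/func- UID.
MARKER = re.compile(r"(?://|#)\s*@dsp\s+((?:obj|func)-\w+)")


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, uid) pairs for every @dsp marker, 1-indexed."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            found.append((lineno, match.group(1)))
    return found


sample = "import x\n# @dsp obj-e5f6g7h8\nclass UserService:\n    pass\n"
print(find_markers(sample))
```

Because the marker travels with the code it annotates, a lookup like this keeps working after the file is renamed or the entity is moved to another file.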
Every connection has a "why"
When an import is recorded, DSP stores a short reason explaining why that dependency exists. This lives in the `exports/` reverse index of the imported entity. A dependency graph without reasons tells you what imports what; reasons tell you what is safe to change and who will break.
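A minimal in-memory sketch of a reverse index with reasons is shown below. The function names `record_import` and `who_breaks` are hypothetical and do not correspond to CLI commands; the point is only that the "why" is stored on the imported side, keyed by importer.

```python
from collections import defaultdict

# imported UID -> {importer UID: reason the importer depends on it}
reverse_index: dict[str, dict[str, str]] = defaultdict(dict)


def record_import(importer: str, imported: str, why: str) -> None:
    """Store the reason on the imported entity's side of the edge."""
    reverse_index[imported][importer] = why


def who_breaks(imported: str) -> dict[str, str]:
    """Return every importer of `imported` together with its reason."""
    return dict(reverse_index.get(imported, {}))


record_import("obj-a1b2c3d4", "obj-deadbeef", "HTTP routing")
record_import("obj-11223344", "obj-deadbeef", "mounts admin routes")
print(who_breaks("obj-deadbeef"))
```

Answering "who will break if I change this?" then becomes a single lookup on the entity being changed, instead of a scan over every importer.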
Storage format
Each entity gets a small directory under `.dsp/`:
.dsp/
├── TOC                          # ordered list of all entity UIDs from root
├── obj-a1b2c3d4/
│   ├── description              # source path, kind, purpose (1-3 sentences)
│   ├── imports                  # UIDs this entity depends on (one per line)
│   ├── shared                   # UIDs of public API / exported entities
│   └── exports/                 # reverse index: who imports this and why
│       ├── <importer_uid>       # file content = "why" text
│       └── <shared_uid>/
│           ├── description      # what is exported
│           └── <importer_uid>   # why this specific export is imported
└── func-7f3a9c12/
    ├── description
    ├── imports
    └── exports/
Everything is plain text. Diffable. Reviewable. No database needed.
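Because the layout is plain files, it can be read with nothing more than the standard library. The sketch below assumes the directory structure shown above; the exact contents of the `description` file are an assumption, and `read_entity` is a hypothetical helper rather than part of `dsp-cli.py`. It builds a throwaway `.dsp/` tree in a temporary directory so it can run on its own.

```python
import tempfile
from pathlib import Path


def read_entity(dsp_root: Path, uid: str) -> dict:
    """Load one entity directory into a plain dict."""
    entity_dir = dsp_root / uid
    imports_file = entity_dir / "imports"
    exports_dir = entity_dir / "exports"
    importers = {}
    if exports_dir.is_dir():
        # Plain files directly under exports/ hold the importer's "why" text;
        # per-export subdirectories are skipped in this sketch.
        for path in sorted(exports_dir.iterdir()):
            if path.is_file():
                importers[path.name] = path.read_text().strip()
    return {
        "description": (entity_dir / "description").read_text().strip(),
        "imports": imports_file.read_text().split() if imports_file.exists() else [],
        "importers": importers,
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".dsp"
    entity = root / "obj-a1b2c3d4"
    (entity / "exports").mkdir(parents=True)
    # The description text below is made up for illustration.
    (entity / "description").write_text("src/app.ts: main application entrypoint\n")
    (entity / "imports").write_text("obj-deadbeef\n")
    (entity / "exports" / "obj-11223344").write_text("needs the app bootstrap\n")
    loaded = read_entity(root, "obj-a1b2c3d4")

print(loaded)
```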
Full import coverage
Every file or artifact that is imported anywhere must be represented in `.dsp/` as an Object: code, images, styles, configs, JSON, wasm, everything. External dependencies (npm packages, stdlib, etc.) are recorded as Objects with `--kind external`, but their internals are never analyzed.
How It Works
Initial Setup
The skill relies on a standalone Python CLI script, `dsp-cli.py`. If it is missing from the project, download it:
bash
curl -O https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/skills/data-structure-protocol/scripts/dsp-cli.py
Requires Python 3.10+. All commands use the form `python dsp-cli.py --root <project-root> <command>`.
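When the CLI is driven from a script rather than typed by hand, a thin wrapper can keep the `--root` convention in one place. This is a sketch under the assumption that `dsp-cli.py` sits in the current working directory; `dsp_command` is a hypothetical helper name.

```python
import sys


def dsp_command(root: str, command: str, *args: str) -> list[str]:
    """Build the argv list for one dsp-cli.py invocation.

    Assumes dsp-cli.py is in the current working directory.
    """
    return [sys.executable, "dsp-cli.py", "--root", root, command, *args]


cmd = dsp_command(".", "search", "authentication")
print(cmd[1:])
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; building an argv list instead of a shell string avoids quoting problems when a description or reason contains spaces.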
Bootstrap (initial mapping)
If `.dsp/` is empty, traverse the project from root entrypoint(s) via DFS on imports:
- Identify root entrypoints (main, framework entry, etc.)
- Document the root file: `create-object`, `create-function` for each export, `create-shared`, `add-import` for all dependencies
- Take the first non-external import, document it fully, descend into its imports
- Backtrack when no unvisited local imports remain; continue until all reachable files are documented
- External dependencies: run `create-object --kind external`, add to TOC, but never descend into the dependency's own files.
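The traversal described in the steps above can be sketched as a depth-first walk over an import map. The `imports` dictionary, the `external` set, and the file names are made up for illustration, and the sketch only computes the order in which entities would be documented; it does not call the CLI.

```python
def bootstrap_order(
    imports: dict[str, list[str]], external: set[str], root: str
) -> list[str]:
    """Return entities in the order a DFS from `root` would document them."""
    order: list[str] = []
    seen: set[str] = set()

    def visit(node: str) -> None:
        if node in seen:
            return  # already documented; this also guards against import cycles
        seen.add(node)
        order.append(node)
        if node in external:
            return  # record the external dependency, never descend into it
        for dep in imports.get(node, []):
            visit(dep)

    visit(root)
    return order


# Hypothetical project: file -> list of things it imports.
project = {
    "src/app.ts": ["express", "src/routes.ts", "src/db.ts"],
    "src/routes.ts": ["src/db.ts"],
    "src/db.ts": ["pg"],
}
order = bootstrap_order(project, {"express", "pg"}, "src/app.ts")
print(order)
```

Each local file is documented once, on first reach, and the walk backtracks when a file has no unvisited imports left. For very deep import chains an explicit stack would avoid Python's recursion limit.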
Workflow Rules
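- Before changing code: Find affected entities with the search and navigation commands (for example `search` or `find-by-source`). Read their stored files (such as `description` and `imports`) to understand context.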
- When creating a file/module: Call `create-object`. For each exported function, call `create-function` (with `--owner`). Register exports via `create-shared`.
- When adding an import: Call `add-import` with a brief reason. For external deps, first run `create-object --kind external` if the entity doesn't exist.
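- When removing import/export/file: Call the corresponding removal command for the import, export, or entity. Cascade cleanup is automatic.
- When renaming/moving a file: Update the entity's recorded source path through the CLI. UID does not change.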
- Don't touch DSP if only internal implementation changed without affecting purpose or dependencies.
Key Commands
| Category | Commands |
|---|---|
| Create | `init`, `create-object`, `create-function`, `create-shared`, `add-import` |
| Update | , , |
| Delete | , , |
| Navigate | , , , , , |
| Search | `search`, `find-by-source` |
| Diagnostics | , , |
When to Update DSP
| Code Change | DSP Action |
|---|---|
| New file/module | `create-object` + `create-function` + `create-shared` + `add-import` |
| New import added | `add-import` (+ `create-object --kind external` if new dep) |
| Import removed | |
| Export added | `create-shared` (+ `create-function` if new) |
| Export removed | |
| File renamed/moved | |
| File deleted | |
| Purpose changed | |
| Internal-only change | No DSP update needed |
Examples
Example 1: Setting up DSP and documenting a module
bash
python dsp-cli.py --root . init
python dsp-cli.py --root . create-object "src/app.ts" "Main application entrypoint"
# Output: obj-a1b2c3d4
python dsp-cli.py --root . create-function "src/app.ts#start" "Starts the HTTP server" --owner obj-a1b2c3d4
# Output: func-7f3a9c12
python dsp-cli.py --root . create-shared obj-a1b2c3d4 func-7f3a9c12
python dsp-cli.py --root . add-import obj-a1b2c3d4 obj-deadbeef "HTTP routing"
Example 2: Navigating the graph before making changes
bash
python dsp-cli.py --root . search "authentication"
python dsp-cli.py --root . get-entity obj-a1b2c3d4
python dsp-cli.py --root . get-children obj-a1b2c3d4 --depth 2
python dsp-cli.py --root . get-recipients obj-a1b2c3d4
python dsp-cli.py --root . get-path obj-a1b2c3d4 func-7f3a9c12
Example 3: Impact analysis before replacing a library
bash
python dsp-cli.py --root . find-by-source "lodash"
# Output: obj-11223344
python dsp-cli.py --root . get-recipients obj-11223344
# Shows every module that imports lodash and WHY — lets you systematically replace it
Best Practices
- ✅ Do: Update DSP immediately when creating new files, adding imports, or changing public APIs
- ✅ Do: Always add a meaningful reason when recording an import — this is where most of DSP's value lives
- ✅ Do: Use `--kind external` for third-party libraries without analyzing their internals
- ✅ Do: Keep descriptions minimal (1-3 sentences about purpose, not implementation)
- ✅ Do: Treat `.dsp/` diffs like code diffs: review them, keep them accurate
- ❌ Don't: Touch `.dsp/` for internal-only changes that don't affect purpose or dependencies
- ❌ Don't: Change an entity's UID on rename/move (update the entity's recorded path instead)
- ❌ Don't: Create UIDs for every local variable or helper — only file-level Objects and public/shared entities
Integration
This skill connects naturally to:
- context-compression — DSP reduces the need for compression by providing targeted retrieval instead of loading everything
- context-optimization — DSP is a structural optimization: agents pull minimal "context bundles" instead of raw source
- architecture — DSP captures architectural boundaries (imports/exports) that feed system design decisions
References
- Full architecture specification: ARCHITECTURE.md
- CLI source + reference docs: skills/data-structure-protocol
- Introduction article: article.md