data-structure-protocol

Data Structure Protocol (DSP)

LLM coding agents lose context between tasks. On large codebases they spend most of their tokens on "orientation": figuring out where things live, what depends on what, and what is safe to change. DSP solves this by externalizing the project's structural map into a persistent, queryable graph stored in a `.dsp/` directory next to the code.

DSP is NOT documentation for humans and NOT an AST dump. It captures three things: meaning (why an entity exists), boundaries (what it imports and exposes), and reasons (why each connection exists). This is enough for an agent to navigate, refactor, and generate code without loading the entire source tree into the context window.

When to Use

Use this skill when:
  • The project has a `.dsp/` directory (DSP is already set up)
  • The user asks to set up DSP, bootstrap, or map a project's structure
  • Creating, modifying, or deleting code files in a DSP-tracked project (to keep the graph updated)
  • Navigating project structure, understanding dependencies, or finding specific modules
  • The user mentions DSP, dsp-cli, `.dsp`, or structure mapping
  • Performing impact analysis before a refactor or dependency replacement

Core Concepts

Code = graph

DSP models the codebase as a directed graph. Nodes are entities, edges are imports and shared/exports.
Two entity kinds exist:
  • Object: any "thing" that isn't a function (module/file/class/config/resource/external dependency)
  • Function: an exported function/method/handler/pipeline
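The graph model described above can be sketched in memory as follows. This is only an illustration: the `Entity` and `Graph` classes, their field names, and the "HTTP router" purpose text are hypothetical and are not taken from dsp-cli.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One node in the graph: an Object or a Function (hypothetical model)."""
    uid: str                      # e.g. "obj-a1b2c3d4" or "func-7f3a9c12"
    kind: str                     # "object" or "function"
    purpose: str                  # short statement of why the entity exists
    imports: list[str] = field(default_factory=list)       # UIDs this entity depends on
    shared: list[str] = field(default_factory=list)        # UIDs it exposes publicly
    exports: dict[str, str] = field(default_factory=dict)  # importer UID -> reason


class Graph:
    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}

    def add(self, entity: Entity) -> None:
        self.entities[entity.uid] = entity

    def add_import(self, importer: str, imported: str, why: str) -> None:
        """Record a directed edge importer -> imported, with its reason."""
        src = self.entities[importer]
        if imported not in src.imports:
            src.imports.append(imported)
        # The reason is stored on the imported side, acting as a reverse index.
        self.entities[imported].exports[importer] = why


graph = Graph()
graph.add(Entity("obj-a1b2c3d4", "object", "Main application entrypoint"))
graph.add(Entity("obj-deadbeef", "object", "HTTP router"))  # purpose text is made up
graph.add_import("obj-a1b2c3d4", "obj-deadbeef", "HTTP routing")
print(graph.entities["obj-deadbeef"].exports)
```

Keeping the reason on the imported entity means a single lookup answers "who depends on this, and why".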

Identity by UID, not by file path

Every entity gets a stable UID: `obj-<8hex>` for objects, `func-<8hex>` for functions. File paths are attributes that can change; UIDs survive renames, moves, and reformatting.

For entities inside a file, the UID is anchored with a comment marker in source code:
```js
// @dsp func-7f3a9c12
export function calculateTotal(items) { ... }
```
```python
# @dsp obj-e5f6g7h8
class UserService:
    ...
```
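As a rough illustration of how such markers could be located, the sketch below scans source text with a regular expression. The pattern is an assumption about the marker shape (`@dsp`, whitespace, then `obj-` or `func-` plus exactly eight lowercase hex digits), and `find_markers` is a hypothetical helper, not part of dsp-cli.

```python
import re

# Assumed marker shape: "@dsp" followed by obj-/func- and 8 lowercase hex digits.
MARKER = re.compile(r"@dsp\s+((?:obj|func)-[0-9a-f]{8})\b")


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, uid) pairs for every marker found in the text."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            found.append((lineno, match.group(1)))
    return found


sample = """\
// @dsp func-7f3a9c12
export function calculateTotal(items) { return 0; }
"""
print(find_markers(sample))
```

Because the pattern ignores the comment syntax in front of `@dsp`, the same scan works for `//` and `#` comments alike.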

Every connection has a "why"

When an import is recorded, DSP stores a short reason explaining why that dependency exists. This lives in the `exports/` reverse index of the imported entity. A dependency graph without reasons tells you what imports what; reasons tell you what is safe to change and who will break.

Storage format

Each entity gets a small directory under `.dsp/`:
```
.dsp/
├── TOC                        # ordered list of all entity UIDs from root
├── obj-a1b2c3d4/
│   ├── description            # source path, kind, purpose (1-3 sentences)
│   ├── imports                # UIDs this entity depends on (one per line)
│   ├── shared                 # UIDs of public API / exported entities
│   └── exports/               # reverse index: who imports this and why
│       ├── <importer_uid>     # file content = "why" text
│       └── <shared_uid>/
│           ├── description    # what is exported
│           └── <importer_uid> # why this specific export is imported
└── func-7f3a9c12/
    ├── description
    ├── imports
    └── exports/
```
Everything is plain text. Diffable. Reviewable. No database needed.
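Since the layout is plain files, an entity can in principle be read with nothing but the standard library. The sketch below assumes only what the tree above shows (one UID per line in `imports` and `shared`, one file per importer under `exports/`); `load_entity` is a hypothetical helper, the demo file contents are made up, and the `description` file is treated as opaque text because its exact format is not specified here.

```python
import tempfile
from pathlib import Path


def read_lines(path: Path) -> list[str]:
    """Return the non-empty lines of a text file, or [] if the file is absent."""
    if not path.is_file():
        return []
    text = path.read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_entity(dsp_root: Path, uid: str) -> dict:
    """Read one entity directory laid out as in the tree above."""
    entity_dir = dsp_root / uid
    exports_dir = entity_dir / "exports"
    importers = {}
    if exports_dir.is_dir():
        for item in sorted(exports_dir.iterdir()):
            if item.is_file():  # exports/<importer_uid> holds the "why" text
                importers[item.name] = item.read_text(encoding="utf-8").strip()
    description = entity_dir / "description"
    return {
        "uid": uid,
        "description": description.read_text(encoding="utf-8").strip()
        if description.is_file() else "",
        "imports": read_lines(entity_dir / "imports"),
        "shared": read_lines(entity_dir / "shared"),
        "importers": importers,
    }


# Demo on a throwaway tree; the file contents are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".dsp"
    (root / "obj-deadbeef" / "exports").mkdir(parents=True)
    (root / "obj-deadbeef" / "description").write_text("HTTP router\n", encoding="utf-8")
    (root / "obj-deadbeef" / "exports" / "obj-a1b2c3d4").write_text(
        "HTTP routing\n", encoding="utf-8"
    )
    demo = load_entity(root, "obj-deadbeef")

print(demo["importers"])
```

Per-export subdirectories (`exports/<shared_uid>/`) are skipped in this sketch to keep it short.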

Full import coverage

Every file or artifact that is imported anywhere must be represented in `.dsp` as an Object: code, images, styles, configs, JSON, wasm, everything. External dependencies (npm packages, stdlib, etc.) are recorded as `kind: external`, but their internals are never analyzed.

How It Works

Initial Setup

The skill relies on a standalone Python CLI script, `dsp-cli.py`. If it is missing from the project, download it:
```bash
curl -O https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/skills/data-structure-protocol/scripts/dsp-cli.py
```
Requires Python 3.10+. All commands use the form `python dsp-cli.py --root <project-root> <command>`.
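When the CLI is driven from another Python program, the command form above can be assembled as an argument list. The helper below is a hypothetical convenience wrapper: it only builds the list and does not run anything, and it assumes `dsp-cli.py` sits in the current working directory.

```python
import shlex
import sys


def dsp_argv(root: str, command: str, *args: str, script: str = "dsp-cli.py") -> list[str]:
    """Build the argument list for one dsp-cli invocation (does not run it)."""
    return [sys.executable, script, "--root", root, command, *args]


argv = dsp_argv(".", "create-object", "src/app.ts", "Main application entrypoint")
# Print a shell-quoted form of everything after the interpreter path.
print(shlex.join(argv[1:]))
```

The resulting list can be passed to `subprocess.run(argv, check=True, capture_output=True, text=True)`; passing a list rather than a shell string avoids quoting problems in descriptions and reasons.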

Bootstrap (initial mapping)

If `.dsp/` is empty, traverse the project from the root entrypoint(s) via DFS on imports:
  1. Identify root entrypoints (`package.json` main, framework entry, `main.py`, etc.)
  2. Document the root file: `create-object`, `create-function` for each export, `create-shared`, `add-import` for all dependencies
  3. Take the first non-external import, document it fully, descend into its imports
  4. Backtrack when no unvisited local imports remain; continue until all reachable files are documented
  5. External dependencies: `create-object --kind external`, add to TOC, but never descend into `node_modules`/`site-packages`/etc.
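The traversal order in the steps above can be sketched as a plain depth-first search. Everything here is illustrative: `bootstrap_order`, the `imports_of` and `is_external` callbacks, and the toy import table are hypothetical, and a real bootstrap would call the CLI commands at the point where this sketch merely appends to a list.

```python
def bootstrap_order(entrypoints, imports_of, is_external):
    """Return (documented, externals): local files in DFS visit order, plus external deps.

    imports_of(path) -> list of imported paths/names (hypothetical callback).
    is_external(name) -> True for dependencies that must not be descended into.
    """
    documented, externals, seen = [], [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        if is_external(node):
            externals.append(node)   # recorded, never descended into
            return
        documented.append(node)      # document this file before its imports
        for dep in imports_of(node):
            visit(dep)               # returning from here is the backtrack step

    for entry in entrypoints:
        visit(entry)
    return documented, externals


# Toy import table, invented for illustration.
IMPORTS = {
    "src/app.ts": ["express", "src/router.ts", "src/db.ts"],
    "src/router.ts": ["src/db.ts", "express"],
    "src/db.ts": ["pg"],
}
local, external = bootstrap_order(
    ["src/app.ts"],
    imports_of=lambda path: IMPORTS.get(path, []),
    is_external=lambda name: not name.startswith("src/"),
)
print(local)     # ['src/app.ts', 'src/router.ts', 'src/db.ts']
print(external)  # ['express', 'pg']
```

The `seen` set keeps import cycles from looping forever; for very deep import chains an explicit stack would avoid Python's recursion limit.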

Workflow Rules

  • Before changing code: Find affected entities via `search`, `find-by-source`, or `read-toc`. Read their `description` and `imports` to understand context.
  • When creating a file/module: Call `create-object`. For each exported function, call `create-function` (with `--owner`). Register exports via `create-shared`.
  • When adding an import: Call `add-import` with a brief `why`. For external deps, first call `create-object --kind external` if the entity doesn't exist.
  • When removing an import/export/file: Call `remove-import`, `remove-shared`, or `remove-entity` as appropriate. Cascade cleanup is automatic.
  • When renaming/moving a file: Call `move-entity`. The UID does not change.
  • Don't touch DSP if only the internal implementation changed without affecting purpose or dependencies.

Key Commands

| Category | Commands |
| --- | --- |
| Create | `init`, `create-object`, `create-function`, `create-shared`, `add-import` |
| Update | `update-description`, `update-import-why`, `move-entity` |
| Delete | `remove-import`, `remove-shared`, `remove-entity` |
| Navigate | `get-entity`, `get-children --depth N`, `get-parents --depth N`, `get-path`, `get-recipients`, `read-toc` |
| Search | `search <query>`, `find-by-source <path>` |
| Diagnostics | `detect-cycles`, `get-orphans`, `get-stats` |

When to Update DSP

| Code Change | DSP Action |
| --- | --- |
| New file/module | `create-object` + `create-function` + `create-shared` + `add-import` |
| New import added | `add-import` (+ `create-object --kind external` if new dep) |
| Import removed | `remove-import` |
| Export added | `create-shared` (+ `create-function` if new) |
| Export removed | `remove-shared` |
| File renamed/moved | `move-entity` |
| File deleted | `remove-entity` |
| Purpose changed | `update-description` |
| Internal-only change | No DSP update needed |
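For tooling that classifies changes before deciding what to update, the table above can be expressed as a lookup. The change-kind keys (`file_created`, `import_added`, and so on) are hypothetical labels chosen for this sketch; only the command names come from the table, and the conditional extras in parentheses are left out.

```python
# Mapping copied from the table above; the change-kind keys are hypothetical labels.
DSP_ACTIONS = {
    "file_created": ["create-object", "create-function", "create-shared", "add-import"],
    "import_added": ["add-import"],
    "import_removed": ["remove-import"],
    "export_added": ["create-shared"],
    "export_removed": ["remove-shared"],
    "file_moved": ["move-entity"],
    "file_deleted": ["remove-entity"],
    "purpose_changed": ["update-description"],
    "internal_only": [],  # no DSP update needed
}


def actions_for(changes: list[str]) -> list[str]:
    """Collect the DSP commands implied by a list of change kinds, without duplicates."""
    ordered: list[str] = []
    for change in changes:
        for action in DSP_ACTIONS[change]:
            if action not in ordered:
                ordered.append(action)
    return ordered


print(actions_for(["import_added", "internal_only", "export_added"]))
```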

Examples

Example 1: Setting up DSP and documenting a module

```bash
python dsp-cli.py --root . init

python dsp-cli.py --root . create-object "src/app.ts" "Main application entrypoint"
# Output: obj-a1b2c3d4

python dsp-cli.py --root . create-function "src/app.ts#start" "Starts the HTTP server" --owner obj-a1b2c3d4
# Output: func-7f3a9c12

python dsp-cli.py --root . create-shared obj-a1b2c3d4 func-7f3a9c12
python dsp-cli.py --root . add-import obj-a1b2c3d4 obj-deadbeef "HTTP routing"
```

Example 2: Navigating the graph before making changes

```bash
python dsp-cli.py --root . search "authentication"
python dsp-cli.py --root . get-entity obj-a1b2c3d4
python dsp-cli.py --root . get-children obj-a1b2c3d4 --depth 2
python dsp-cli.py --root . get-recipients obj-a1b2c3d4
python dsp-cli.py --root . get-path obj-a1b2c3d4 func-7f3a9c12
```

Example 3: Impact analysis before replacing a library

```bash
python dsp-cli.py --root . find-by-source "lodash"
# Output: obj-11223344

python dsp-cli.py --root . get-recipients obj-11223344
# Shows every module that imports lodash and WHY, so you can replace it systematically
```
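Direct importers are only the first ring of impact. As a sketch of how indirect dependents could be collected from a reverse index, the function below walks importer edges breadth-first up to a depth limit. It is not a description of how `get-recipients` or `get-parents` work internally; `impacted_by`, the `REVERSE` table, and the reason strings are invented for illustration.

```python
from collections import deque


def impacted_by(target: str, importers_of: dict[str, dict[str, str]], max_depth: int = 2):
    """Breadth-first walk over reverse edges, returning (uid, depth, why) tuples."""
    seen = {target}
    queue = deque([(target, 0)])
    result = []
    while queue:
        uid, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for importer, why in sorted(importers_of.get(uid, {}).items()):
            if importer not in seen:
                seen.add(importer)
                result.append((importer, depth + 1, why))
                queue.append((importer, depth + 1))
    return result


# Reverse index invented for illustration: imported UID -> {importer UID: why}.
REVERSE = {
    "obj-11223344": {"obj-a1b2c3d4": "debounce helper", "obj-0badc0de": "deep clone"},
    "obj-a1b2c3d4": {"obj-feedface": "app bootstrap"},
}
for row in impacted_by("obj-11223344", REVERSE):
    print(row)
```

Each row carries the recorded reason, so a reviewer can see at a glance which dependents rely on behavior that the replacement must preserve.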

Best Practices

  • Do: Update DSP immediately when creating new files, adding imports, or changing public APIs
  • Do: Always add a meaningful `why` reason when recording an import; this is where most of DSP's value lives
  • Do: Use `kind: external` for third-party libraries without analyzing their internals
  • Do: Keep descriptions minimal (1-3 sentences about purpose, not implementation)
  • Do: Treat `.dsp/` diffs like code diffs: review them, keep them accurate
  • Don't: Touch `.dsp/` for internal-only changes that don't affect purpose or dependencies
  • Don't: Change an entity's UID on rename/move (use `move-entity` instead)
  • Don't: Create UIDs for every local variable or helper; only file-level Objects and public/shared entities need them

Integration

This skill connects naturally to:
  • context-compression — DSP reduces the need for compression by providing targeted retrieval instead of loading everything
  • context-optimization — DSP is a structural optimization: agents pull minimal "context bundles" instead of raw source
  • architecture — DSP captures architectural boundaries (imports/exports) that feed system design decisions
