data-structure-protocol
Data Structure Protocol (DSP)
LLM coding agents lose context between tasks. On large codebases they spend most of their tokens on "orientation": figuring out where things live, what depends on what, and what is safe to change. DSP solves this by externalizing the project's structural map into a persistent, queryable graph stored in a `.dsp/` directory next to the code.
DSP is NOT documentation for humans and NOT an AST dump. It captures three things: meaning (why an entity exists), boundaries (what it imports and exposes), and reasons (why each connection exists). This is enough for an agent to navigate, refactor, and generate code without loading the entire source tree into the context window.
When to Use
Use this skill when:
- The project has a `.dsp/` directory (DSP is already set up)
- The user asks to set up DSP, bootstrap, or map a project's structure
- Creating, modifying, or deleting code files in a DSP-tracked project (to keep the graph updated)
- Navigating project structure, understanding dependencies, or finding specific modules
- The user mentions DSP, dsp-cli, `.dsp`, or structure mapping
- Performing impact analysis before a refactor or dependency replacement
Core Concepts
Code = graph
DSP models the codebase as a directed graph. Nodes are entities, edges are imports and shared/exports.
Two entity kinds exist:
- Object: any "thing" that isn't a function (module/file/class/config/resource/external dependency)
- Function: an exported function/method/handler/pipeline
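The graph model above can be sketched in a few lines of Python. This is only an illustration of the idea, not the CLI's internal representation: the `Entity` class, its field names, the `kind` labels, and the `importers_of` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One node in the graph: an Object or a Function (hypothetical model)."""
    uid: str                                           # e.g. "obj-a1b2c3d4" or "func-7f3a9c12"
    kind: str                                          # "object" or "function" (illustrative labels)
    imports: list[str] = field(default_factory=list)   # outgoing edges: UIDs this entity depends on
    shared: list[str] = field(default_factory=list)    # UIDs this entity exposes as public API


def importers_of(graph: dict[str, Entity], uid: str) -> list[str]:
    """Follow import edges backwards: which entities depend on `uid`?"""
    return sorted(e.uid for e in graph.values() if uid in e.imports)


app = Entity("obj-a1b2c3d4", "object", imports=["obj-deadbeef"], shared=["func-7f3a9c12"])
start = Entity("func-7f3a9c12", "function", imports=["obj-deadbeef"])
router = Entity("obj-deadbeef", "object")
graph = {e.uid: e for e in (app, start, router)}

print(importers_of(graph, "obj-deadbeef"))  # ['func-7f3a9c12', 'obj-a1b2c3d4']
```

Walking edges backwards like this is what makes impact analysis possible: the question "who breaks if I change this?" becomes a lookup over incoming import edges.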
Identity by UID, not by file path
Every entity gets a stable UID: `obj-<8hex>` for objects, `func-<8hex>` for functions. File paths are attributes that can change; UIDs survive renames, moves, and reformatting.
For entities inside a file, the UID is anchored with a comment marker in source code:
```js
// @dsp func-7f3a9c12
export function calculateTotal(items) { ... }
```
```python
# @dsp obj-e5f6g7h8
class UserService:
    ...
```
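To show how such markers could be located, here is a minimal sketch that scans source text for `@dsp` comments. The regular expression and the `find_markers` helper are assumptions derived from the `obj-<8hex>` / `func-<8hex>` format described above; the real CLI may parse markers differently.

```python
import re

# Assumed marker shape: "@dsp", whitespace, then obj-/func- and exactly 8 hex digits.
MARKER = re.compile(r"@dsp\s+((?:obj|func)-[0-9a-f]{8})\b")


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, uid) for every @dsp marker in the source text."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            found.append((lineno, match.group(1)))
    return found


sample = """\
// @dsp func-7f3a9c12
export function calculateTotal(items) { return 0; }
"""
print(find_markers(sample))  # [(1, 'func-7f3a9c12')]
```

Because the pattern only looks for the `@dsp` token, it works regardless of the host language's comment syntax. Note that a strict hex pattern would not match the illustrative `obj-e5f6g7h8` marker, which contains non-hex letters.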
Every connection has a "why"
When an import is recorded, DSP stores a short reason explaining why that dependency exists. This lives in the `exports/` reverse index of the imported entity. A dependency graph without reasons tells you what imports what; reasons tell you what is safe to change and who will break.
Storage format
Each entity gets a small directory under `.dsp/`:
```
.dsp/
├── TOC                        # ordered list of all entity UIDs from root
├── obj-a1b2c3d4/
│   ├── description            # source path, kind, purpose (1-3 sentences)
│   ├── imports                # UIDs this entity depends on (one per line)
│   ├── shared                 # UIDs of public API / exported entities
│   └── exports/               # reverse index: who imports this and why
│       ├── <importer_uid>     # file content = "why" text
│       └── <shared_uid>/
│           ├── description    # what is exported
│           └── <importer_uid> # why this specific export is imported
└── func-7f3a9c12/
    ├── description
    ├── imports
    └── exports/
```
Everything is plain text. Diffable. Reviewable. No database needed.
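Since the store is plain text, the reverse index can be read with ordinary file operations. The sketch below builds a throwaway `.dsp/` tree that follows the layout described here and reads back the "why" text for one entity. The `read_reasons` helper is hypothetical; in practice the CLI's own query commands would be the usual way to do this.

```python
import tempfile
from pathlib import Path


def read_reasons(dsp_dir: Path, uid: str) -> dict[str, str]:
    """Map importer UID -> "why" text for whole-entity imports of `uid`."""
    exports = dsp_dir / uid / "exports"
    if not exports.is_dir():
        return {}
    return {
        entry.name: entry.read_text(encoding="utf-8").strip()
        for entry in sorted(exports.iterdir())
        if entry.is_file()  # subdirectories are per-export indexes; skipped in this sketch
    }


# Build a temporary .dsp/ tree with one importer recorded for obj-deadbeef.
with tempfile.TemporaryDirectory() as tmp:
    dsp = Path(tmp) / ".dsp"
    exports_dir = dsp / "obj-deadbeef" / "exports"
    exports_dir.mkdir(parents=True)
    (exports_dir / "obj-a1b2c3d4").write_text("HTTP routing\n", encoding="utf-8")
    reasons = read_reasons(dsp, "obj-deadbeef")
    missing = read_reasons(dsp, "obj-00000000")

print(reasons)  # {'obj-a1b2c3d4': 'HTTP routing'}
```

One file per importer keeps each reason independently diffable: adding or removing a dependency shows up as a single added or deleted file in review.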
Full import coverage
Every file or artifact that is imported anywhere must be represented in `.dsp` as an Object: code, images, styles, configs, JSON, wasm, everything. External dependencies (npm packages, stdlib, etc.) are recorded as `kind: external` but their internals are never analyzed.
How It Works
Initial Setup
The skill relies on a standalone Python CLI script, `dsp-cli.py`. If it is missing from the project, download it:
```bash
curl -O https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/skills/data-structure-protocol/scripts/dsp-cli.py
```
Requires Python 3.10+. All commands use `python dsp-cli.py --root <project-root> <command>`.
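When driving the CLI from a script, it can help to build the argument list in one place. The helper below is a hypothetical wrapper, not part of DSP; it only assembles the documented `--root <project-root> <command>` form and assumes `dsp-cli.py` sits in the current working directory.

```python
import sys


def dsp_command(root: str, command: str, *args: str) -> list[str]:
    """Build the argument list for one dsp-cli.py invocation."""
    # sys.executable reuses the interpreter that is running this script.
    return [sys.executable, "dsp-cli.py", "--root", root, command, *args]


cmd = dsp_command(".", "create-object", "src/app.ts", "Main application entrypoint")
print(cmd[1:])
# ['dsp-cli.py', '--root', '.', 'create-object', 'src/app.ts', 'Main application entrypoint']
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing arguments as a list avoids shell quoting problems with descriptions that contain spaces.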
Bootstrap (initial mapping)
If `.dsp/` is empty, traverse the project from root entrypoint(s) via DFS on imports:
- Identify root entrypoints (`package.json` main, framework entry, `main.py`, etc.)
- Document the root file: `create-object`, `create-function` for each export, `create-shared`, `add-import` for all dependencies
- Take the first non-external import, document it fully, descend into its imports
- Backtrack when no unvisited local imports remain; continue until all reachable files are documented
- External dependencies: `create-object --kind external`, add to TOC, but never descend into `node_modules`/`site-packages`/etc.
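The traversal order can be sketched as a depth-first search over an import map. This is a simplified illustration under stated assumptions: the import map is given up front rather than parsed from source, "documenting" a node just records its name, externals are recorded in import order, and `bootstrap_order` is a hypothetical helper rather than a CLI command.

```python
def bootstrap_order(root: str, imports_of: dict[str, list[str]], external: set[str]) -> list[str]:
    """Return nodes in the order a DFS over imports would document them."""
    visited: list[str] = []
    seen: set[str] = set()

    def visit(node: str) -> None:
        if node in seen:
            return                      # already documented; also guards against import cycles
        seen.add(node)
        visited.append(node)            # stand-in for the create-object / add-import calls
        if node in external:
            return                      # external deps are recorded but never descended into
        for dep in imports_of.get(node, []):
            visit(dep)                  # descend; returning from the call is the backtrack step

    visit(root)
    return visited


imports_of = {
    "src/app.ts": ["express", "src/routes.ts"],
    "src/routes.ts": ["src/auth.ts", "express"],
    "src/auth.ts": [],
}
print(bootstrap_order("src/app.ts", imports_of, external={"express"}))
# ['src/app.ts', 'express', 'src/routes.ts', 'src/auth.ts']
```

The `seen` set is what keeps the walk finite on circular imports, and the early return for externals mirrors the rule of never entering dependency directories.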
Workflow Rules
- Before changing code: Find affected entities via `search`, `find-by-source`, or `read-toc`. Read their `description` and `imports` to understand context.
- When creating a file/module: Call `create-object`. For each exported function, call `create-function` (with `--owner`). Register exports via `create-shared`.
- When adding an import: Call `add-import` with a brief `why`. For external deps, first call `create-object --kind external` if the entity doesn't exist.
- When removing import/export/file: Call `remove-import`, `remove-shared`, `remove-entity`. Cascade cleanup is automatic.
- When renaming/moving a file: Call `move-entity`. UID does not change.
- Don't touch DSP if only internal implementation changed without affecting purpose or dependencies.
Key Commands
| Category | Commands |
|---|---|
| Create | |
| Update | |
| Delete | |
| Navigate | |
| Search | |
| Diagnostics | |
When to Update DSP
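| Code Change | DSP Action |
|---|---|
| New file/module | `create-object`, `create-function`, `create-shared`, `add-import` |
| New import added | `add-import` (first `create-object --kind external` if the dependency is not yet tracked) |
| Import removed | `remove-import` |
| Export added | `create-shared` (plus `create-function` for a new exported function) |
| Export removed | `remove-shared` |
| File renamed/moved | `move-entity` |
| File deleted | `remove-entity` |
| Purpose changed | |
| Internal-only change | No DSP update needed |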
Examples
Example 1: Setting up DSP and documenting a module
```bash
python dsp-cli.py --root . init
python dsp-cli.py --root . create-object "src/app.ts" "Main application entrypoint"
# Output: obj-a1b2c3d4
python dsp-cli.py --root . create-function "src/app.ts#start" "Starts the HTTP server" --owner obj-a1b2c3d4
# Output: func-7f3a9c12
python dsp-cli.py --root . create-shared obj-a1b2c3d4 func-7f3a9c12
python dsp-cli.py --root . add-import obj-a1b2c3d4 obj-deadbeef "HTTP routing"
```
Example 2: Navigating the graph before making changes
```bash
python dsp-cli.py --root . search "authentication"
python dsp-cli.py --root . get-entity obj-a1b2c3d4
python dsp-cli.py --root . get-children obj-a1b2c3d4 --depth 2
python dsp-cli.py --root . get-recipients obj-a1b2c3d4
python dsp-cli.py --root . get-path obj-a1b2c3d4 func-7f3a9c12
```
Example 3: Impact analysis before replacing a library
```bash
python dsp-cli.py --root . find-by-source "lodash"
# Output: obj-11223344
python dsp-cli.py --root . get-recipients obj-11223344
# Shows every module that imports lodash and WHY, which lets you systematically replace it
```
Best Practices
- ✅ Do: Update DSP immediately when creating new files, adding imports, or changing public APIs
- ✅ Do: Always add a meaningful `why` reason when recording an import; this is where most of DSP's value lives
- ✅ Do: Use `kind: external` for third-party libraries without analyzing their internals
- ✅ Do: Keep descriptions minimal (1-3 sentences about purpose, not implementation)
- ✅ Do: Treat `.dsp/` diffs like code diffs: review them, keep them accurate
- ❌ Don't: Touch `.dsp/` for internal-only changes that don't affect purpose or dependencies
- ❌ Don't: Change an entity's UID on rename/move (use `move-entity` instead)
- ❌ Don't: Create UIDs for every local variable or helper; only file-level Objects and public/shared entities get UIDs
Integration
This skill connects naturally to:
- context-compression — DSP reduces the need for compression by providing targeted retrieval instead of loading everything
- context-optimization — DSP is a structural optimization: agents pull minimal "context bundles" instead of raw source
- architecture — DSP captures architectural boundaries (imports/exports) that feed system design decisions
References
- Full architecture specification: ARCHITECTURE.md
- CLI source + reference docs: skills/data-structure-protocol
- Introduction article: article.md