code-header-annotator

Code Header Annotator

Maintain a fixed 20-line header region per code file (marked by `@codex-header: v1`) that captures the file's purpose and a compact index of key symbols (with line addresses).

Workflow (per file)

1. Preserve prolog lines - Keep required first lines (shebang, encoding cookie, build tags, etc.)
2. Ensure 20-line header - Insert a new header or update the existing one; always keep the `@codex-header: v1` marker
3. Scan file sections - Imports → config → types → functions → entrypoints → side effects
4. Populate header - Use precise names + line addresses; write `TODO` when unsure
5. Verify addresses - Ensure every `Name@L123` points to the correct definition
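
The address check in the last step can be sketched roughly as follows. This is an illustration, not the bundled script: `verify_addresses` is a hypothetical helper, and it assumes a definition line mentions the symbol name as a whole word.

```python
import re

# Matches in-file addresses such as Name@L123 (cross-file pointers like
# Base@pkg/base.py#L42 are deliberately not matched here).
ADDRESS = re.compile(r"\b([A-Za-z_]\w*)@L(\d+)\b")


def verify_addresses(header_text: str, source_lines: list[str]) -> list[str]:
    """Return the Name@L<line> entries whose target line does not mention Name."""
    stale = []
    for name, line_no in ADDRESS.findall(header_text):
        idx = int(line_no) - 1  # addresses are 1-based
        in_range = 0 <= idx < len(source_lines)
        # Assumption: the definition line contains the symbol as a whole word.
        if not in_range or not re.search(rf"\b{re.escape(name)}\b", source_lines[idx]):
            stale.append(f"{name}@L{line_no}")
    return stale
```

An empty result means every in-file address still lands on a line that names its symbol; anything returned is a candidate for correction.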

Navigation

When exploring large repos, read headers first and open full files only when relevant:

1. Find annotated files: `rg "@codex-header: v1"`
2. Check `Purpose`, `Key types`, `Key funcs`, and `Inheritance` for relevance
3. Use `Inheritance` to jump "up" (bases) and "sideways" (siblings)
4. If the parent is external (e.g., `Base@pkg/base.py#L42`), open that file's header first

Relationship notation: `->` inherits, `~>` implements, mixin
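
To make the notation concrete, here is a small parsing sketch. It handles only the `->` and `~>` markers shown above, and it assumes that a child entry can be joined directly to a cross-file parent pointer (as in `Impl@L200~>Base@pkg/base.py#L42`); that combined form and the `parse_inheritance` name are illustrative assumptions, so check header-format.md for the canonical syntax.

```python
import re

# Child@L120->Base@L30  or  Child@L120~>Base@path/to/base.ts#L30 (assumed form)
EDGE = re.compile(
    r"(?P<child>\w+)@L(?P<child_line>\d+)"
    r"(?P<rel>->|~>)"
    r"(?P<parent>\w+)@(?:(?P<path>[^#\s,;]+)#)?L(?P<parent_line>\d+)"
)
RELATIONS = {"->": "inherits", "~>": "implements"}


def parse_inheritance(field: str) -> list[dict]:
    """Extract child/parent edges from an Inheritance field value."""
    edges = []
    for m in EDGE.finditer(field):
        edges.append({
            "child": m["child"],
            "child_line": int(m["child_line"]),
            "relation": RELATIONS[m["rel"]],
            "parent": m["parent"],
            "parent_file": m["path"],  # None means the parent is in the same file
            "parent_line": int(m["parent_line"]),
        })
    return edges
```

Parents recorded without any address are skipped by this sketch; those are the cases that need a repo search.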

Upward Reasoning (inheritance / parent objects)

Use the header to "walk upward" from a concrete type to its parents:

1. Start at the `Inheritance:` field of the child's header (e.g., `Child@L120->Base@L30`).
2. If the base has an in-file address (`Base@L..`), jump there.
3. If the base is external, prefer a cross-file pointer when available (e.g., `Base@path/to/base.ts#L30`) and jump to that file's header first.
4. If the base is external and has no pointer, use the `Dependencies:` / `Public API:` hints, then search the repo for the base definition (e.g., `rg "class Base\\b"` / `rg "interface Base\\b"`), and annotate the base file too so it becomes indexable.
5. Repeat until you reach the framework root / stable abstract base (or the registry/factory entrypoint).
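
The walk above can be sketched over an in-memory index. The `index` shape used here (file path → symbol → `(parent_name, parent_path_or_None)`) is a hypothetical structure chosen for illustration, for example one built by parsing each file's `Inheritance:` field; it is not something the bundled scripts are known to produce.

```python
def walk_up(index, path, name):
    """Follow parent pointers from (path, name) until no parent is recorded."""
    chain = [(path, name)]
    seen = {(path, name)}
    while True:
        parent = index.get(path, {}).get(name)
        if parent is None:
            return chain  # reached a root, or a base that is not indexed yet
        parent_name, parent_path = parent
        path = parent_path or path  # None means "same file"
        name = parent_name
        if (path, name) in seen:
            return chain  # guard against a cyclic or corrupted index
        seen.add((path, name))
        chain.append((path, name))
```

When the chain stops at a base that lives in an unannotated file, that is the point at which to fall back to a repo search and annotate the base file.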

Verification (required)

After processing all files, always run verification to ensure all auto-populated fields are complete:

`python code-header-annotator/scripts/check_incomplete_headers.py <files-or-dirs> --root <repo-root>`

This script checks for incomplete auto-populated fields (Key types, Key funcs, Entrypoints, Index) that should have been filled by the annotation script but weren't (e.g., due to tool crashes or interruptions).

If incomplete files are found, re-process them:

`python code-header-annotator/scripts/annotate_code_headers.py <incomplete-files> --root <repo-root>`

Then re-run verification until all headers are complete.
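
For intuition, a completeness check of this kind might look like the sketch below. It is not the logic of `check_incomplete_headers.py`: it assumes header fields are written as `Field: value` lines behind a comment prefix, and it treats a missing or empty value as incomplete. Both are assumptions; header-format.md defines the real layout.

```python
AUTO_FIELDS = ("Key types", "Key funcs", "Entrypoints", "Index")


def incomplete_fields(header_lines):
    """Return the auto-populated fields that are missing or empty."""
    values = {}
    for line in header_lines:
        text = line.lstrip("#/* \t")  # drop common comment prefixes (assumed)
        key, sep, value = text.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return [field for field in AUTO_FIELDS if not values.get(field)]
```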

What to Capture (priority order)

1. Purpose - What this file is responsible for (one sentence)
2. Public surface - What other files import/call/instantiate from here
3. Key symbols + addresses - Classes/interfaces, factories, handlers, main functions
4. Inheritance & extension points - Base classes, subclasses, registries, plugin hooks
5. Side effects / I-O - DB/filesystem/network, global state, caches
6. Constraints - Important invariants, error modes, performance or security notes

Automation

Use the bundled script to insert/update headers:

`python code-header-annotator/scripts/annotate_code_headers.py <files-or-dirs> --root <repo-root> --resolve-parents`

Always verify after processing:

`python code-header-annotator/scripts/check_incomplete_headers.py <files-or-dirs> --root <repo-root>`

Or use the `--verify` flag for automatic verification:

`python code-header-annotator/scripts/annotate_code_headers.py <files-or-dirs> --root <repo-root> --resolve-parents --verify`

Key options:

- `--refresh` - Rebuild header from scratch (resets manual fields to TODO)
- `--resolve-parents` - Resolve external parents to cross-file references
- Default is non-destructive: preserves existing non-TODO manual fields
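
The difference between the default and `--refresh` can be pictured as a field merge. The sketch below is only an illustration of that behavior: `merge_fields` and its `manual_fields` parameter are hypothetical, and which fields count as manual is not specified here.

```python
def merge_fields(existing, generated, manual_fields, refresh=False):
    """Combine freshly generated fields with an existing header's fields."""
    merged = dict(generated)
    if refresh:
        return merged  # rebuild from scratch: manual fields fall back to TODO
    for field in manual_fields:
        old = existing.get(field, "").strip()
        if old and old != "TODO":
            merged[field] = old  # keep what a human already filled in
    return merged
```

Auto-populated fields always come from the new scan, so stale addresses are replaced even when manual text is preserved.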

AGENTS.md Integration

For AI-optimized navigation, generate an AGENTS.md file that guides LLMs to read only file headers first:

`python code-header-annotator/scripts/annotate_code_headers.py <files> --root <repo-root> --update-agents-md`

This creates/updates `AGENTS.md` in the repo root with:

- Reading pattern instructions: Read first 20 lines only, then jump to specific lines using `@L<line>` syntax
- Indexed table: All annotated files with their purposes and key symbols (types, functions)
- Navigation syntax reference: How to use `Name@L<line>` addressing for fast navigation

Why AGENTS.md?

- Reduces LLM context bloat by teaching it to read headers first
- Provides a quick overview of all annotated files without reading each one
- Improves output accuracy by focusing on structure before diving into implementation

Combined usage:

`python code-header-annotator/scripts/annotate_code_headers.py <files> --root <repo-root> --resolve-parents --verify --update-agents-md`
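
As a rough picture of the indexed table, the sketch below renders one markdown row per annotated file. The column names, the row layout, and the `agents_table` helper are all assumptions made for illustration; the table the script actually writes may look different.

```python
def agents_table(entries):
    """Render (path, purpose, symbols) tuples as a markdown table."""
    rows = ["| File | Purpose | Key symbols |", "| --- | --- | --- |"]
    for path, purpose, symbols in sorted(entries, key=lambda e: e[0]):
        rows.append(f"| {path} | {purpose} | {', '.join(symbols)} |")
    return "\n".join(rows)
```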

Critical: Update Index on File Changes

MANDATORY: When this skill is active, you MUST maintain the header index every time you modify a file.

Index Maintenance Rules

1. After every file edit - Update the affected file's header:
   - Add new symbols with their correct line numbers
   - Remove deleted symbols from the header
   - Update `Purpose` if file responsibility changed
   - Update `Public API` if exports changed
   - Update `Inheritance` if relationships changed
2. After code changes - Adjust line numbers:
   - Moving code changes line numbers → update all affected `@L<line>` addresses
   - Insertions/deletions shift subsequent lines → recalculate and update addresses
   - Check `Index:` section anchors and update if sections moved
3. After adding new files - Always add headers:
   - Any new source file should get a 20-line header
   - Include all concrete symbols with correct line numbers
   - Set `TODO` for fields that need manual completion
4. Before committing/pushing - Final verification:
   - Run `check_incomplete_headers.py` to ensure no incomplete fields
   - Re-run `annotate_code_headers.py` with the `--verify` flag
   - Fix any line number mismatches or missing symbols
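
The line-number adjustment in rule 2 is simple arithmetic: every in-file address at or after the edit point moves by the number of lines inserted (positive) or deleted (negative). A minimal sketch, with `shift_addresses` as a hypothetical helper rather than part of the bundled scripts:

```python
import re

# Only in-file addresses (@L<line>) are shifted; cross-file pointers such as
# Base@pkg/base.py#L42 refer to another file and are left untouched.
IN_FILE_LINE = re.compile(r"(?<=@L)\d+\b")


def shift_addresses(header_line, edit_line, delta):
    """Shift in-file addresses at or after edit_line by delta lines."""
    def bump(match):
        line_no = int(match.group())
        return str(line_no + delta) if line_no >= edit_line else match.group()
    return IN_FILE_LINE.sub(bump, header_line)
```

This covers pure shifts only; symbols that were deleted or moved elsewhere still need their entries removed or re-pointed by hand or by re-running the annotator.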

Anti-Pattern: Stale Indexes

Never do this:

- Modify code without updating the header
- Leave `TODO` in fields that are now known
- Ignore line number drift after refactoring
- Add new files without headers

Always do this:

- Update header in the same edit as code changes
- Run verification after batch changes
- Treat the header as live documentation, not one-time annotation

Verification Workflow

For any codebase modification task:

1. Make your code changes
2. Update headers for modified files: `python code-header-annotator/scripts/annotate_code_headers.py <modified-files> --root <repo-root> --resolve-parents`
3. Verify no incomplete fields: `python code-header-annotator/scripts/check_incomplete_headers.py <modified-files> --root <repo-root>`
4. If incomplete, re-process: `python code-header-annotator/scripts/annotate_code_headers.py <incomplete-files> --root <repo-root>`
5. Repeat until clean


**Key principle**: The header index must always reflect the current state of the file. A stale index is worse than no index because it misleads navigation.


Core Rules

- Fixed size: Always 20 lines per file
- High-signal: One-line fields, compress lists, truncate long text
- Indexing first: Include exported/public API, key types, key functions, entrypoints
- Addresses: Use `Name@L<line>` for all concrete symbols
- Concrete names only: Never use abstract descriptions like "data models"
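
The fixed-size and one-line rules can be illustrated with a small rendering sketch. Everything about the layout here is an assumption (marker on the first line, `Field: value` lines, bare comment markers as padding, a 100-character width limit); header-format.md defines the canonical structure, and `render_header` is a hypothetical helper.

```python
HEADER_LINES = 20
MAX_WIDTH = 100  # hypothetical width limit used for truncation


def render_header(fields, comment="#"):
    """Render fields as exactly HEADER_LINES one-line comment lines."""
    lines = [f"{comment} @codex-header: v1"]
    for name, value in fields.items():
        one_line = " ".join(str(value).split())  # collapse newlines and runs of spaces
        line = f"{comment} {name}: {one_line}"
        if len(line) > MAX_WIDTH:
            line = line[: MAX_WIDTH - 3] + "..."  # truncate long text
        lines.append(line)
    if len(lines) > HEADER_LINES:
        raise ValueError("too many fields for a 20-line header")
    lines += [comment] * (HEADER_LINES - len(lines))  # pad to the fixed size
    return lines
```

Keeping the region a constant height means a reader (or tool) can always fetch the same number of lines to get the whole index.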

References

- Field guidelines: See guidelines.md for detailed requirements per field
- Examples: See examples.md for good vs bad patterns
- Format spec: See header-format.md for canonical field structure
