repo-skillopt
RepoSkillOpt — Canonical Skill
Purpose
This skill helps a coding agent understand a legacy repository through evidence-grounded analysis and human-feedback-driven refinement. It produces a structured Repository Specification, accepts human corrections as a first-class input, and supports a bounded loop in which recurrent feedback is summarized into reviewable edits to the skill itself. The skill is vendor-neutral: it does not depend on any particular coding-agent runtime.
Trigger Conditions
Activate this skill when the user asks to understand, map, document, onboard to, refactor, modify, or assess a repository — or uses recognizable equivalents of those verbs (e.g., "explain this codebase," "summarize the architecture," "what would change if we…", "is it safe to…").
Do not activate this skill for requests that fall outside the verbs above (e.g., generic chat, unrelated writing tasks, single-file edits that do not require repository-level understanding). Ordinary agent behavior continues in those cases.
Operating Principles
- Do not rely only on README files. README contents are evidence, not the whole truth. They may be outdated, aspirational, or wrong. Always corroborate against code, configs, tests, and command outputs.
- Ground every major claim in repository evidence. Prefer concrete paths, symbols, configs, tests, and command outputs over generic descriptions. A major claim is any assertion that fills a non-trivial slot in a Repository Specification section (entrypoint identity, technology choice, dependency relationship, control- or data-flow step, integration target, risk statement) or any standalone assertion a maintenance reader would act on. Trivial recitations (literal file contents, raw config dumps, syntactic restatements) are not major claims.
- Separate verified facts from hypotheses. Every major claim carries exactly one of four labels (defined under Output Discipline below): `**[fact]**`, `**[inference]**`, `**[unknown]**`, `**[human]**`.
- Identify uncertainty explicitly. When information cannot be determined from inspection, record the gap under Unknowns and unresolved questions rather than guessing. An honest "unknown" is more useful than a confident wrong answer.
- Prefer concrete evidence over training-data pattern matching. Two repositories in the same ecosystem can differ in ways that pattern matching will miss. Inspect this repository.
- Do not overclaim architecture from shallow inspection. If you have only read manifests and a handful of entrypoints, do not describe layers and data flows as if you had traced them.
- Preserve human feedback as reusable knowledge only when properly scoped. Repository-specific facts stay repository-scoped; only feedback explicitly marked candidate-for-generic enters the convergence pipeline that targets this skill itself, and even then only via an accepted Skill Edit Proposal.
Repository Understanding Workflow
Execute the following stages in order. Skipping or reordering stages weakens the evidence chain that downstream sections depend on.
(a) Triage repository structure. List the top-level directories. Note presence of `src/`, tests, docs, deployment files, CI configuration, multiple subprojects (monorepo signal).
(b) Identify language, framework, package manager, and runtime. Read manifests (`pyproject.toml`, `package.json`, `go.mod`, `pom.xml`, `Cargo.toml`, `Gemfile`, `composer.json`, etc.). Record versions and runtime constraints.
(c) Inspect manifests, configs, tests, deployment files, and entrypoints. Read at least: build/runtime configs, environment templates, CI workflow definitions, test runner configuration, deployment manifests, and the files identified as entrypoints (CLI scripts, HTTP route registrations, library `__init__` / `index` files).
(d) Map modules, layers, dependencies, domain concepts, and data flow. Walk the source tree; assign each module a role; identify the dependency direction between layers; record the names domain experts in this codebase use. Enumerate every function and class defined in the repository so none is silently skipped (see the symbol-accounting rule under Repository Specification Format).
(e) Trace specific behavior from entrypoint to core logic to persistence or side effects. For at least one user-relevant behavior, produce a numbered trace listing every hop (entrypoint → middleware → service → repository → side effect/storage), citing the file and symbol at each hop.
(f) Produce a Repository Specification incrementally, on disk. Use the Repository Specification template at `templates/repository-specification.md` and fill all 19 required sections, applying the label-and-citation discipline from Output Discipline. Write each section to the spec file as you complete it, and do not retain already-written sections in working context: once a section is persisted, drop it from your working set and keep only the section you are currently writing plus the evidence it needs. When a later section must consult an earlier one (e.g. de-duplicating the Evidence index), read it back from the file rather than holding the whole document in context. This keeps working context bounded by the current section instead of the whole growing spec, so the workflow scales to large repositories within a small context window. The authoritative specification is always the file on disk, never an in-context copy.
(g) Identify risks, unknowns, and safe next steps. Populate Known risks with repository-specific (not generic) risks, each tied to evidence. Populate Unknowns and unresolved questions with every gap surfaced during stages (a)–(f). Suggest concrete next steps a human can take to resolve key unknowns.
(h) Guarantee symbol completeness (deterministic; final action). Before finishing, make the Symbols not yet analyzed listing complete and mechanical: every function and class not already discussed must appear there, grouped by file, with the counts line. Generate this listing deterministically, by running a completeness helper that enumerates the repository's symbols and lists the ones the analysis did not name, rather than transcribing it by hand. Your job is the analysis (the prose above); this step guarantees the accounting so nothing is silently skipped.
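The incremental, on-disk writing described in stage (f) can be sketched as a small append-only writer. This is a minimal illustration in Python, not part of the skill itself: the `SpecWriter` class and its method names are hypothetical, and the heading format simply follows the `## N. Title` convention described under Presentation format.

```python
from pathlib import Path


class SpecWriter:
    """Hypothetical helper: the file on disk is the authoritative spec."""

    def __init__(self, spec_path):
        self.path = Path(spec_path)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Assumption: each run starts a fresh spec file.
        self.path.write_text("", encoding="utf-8")
        self.count = 0

    def write_section(self, title, body):
        """Persist one finished section so the caller can drop it from memory."""
        self.count += 1
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(f"## {self.count}. {title}\n\n{body.strip()}\n\n")

    def read_section(self, title):
        """Read one earlier section back from disk instead of keeping it in context."""
        lines, capture = [], False
        for line in self.path.read_text(encoding="utf-8").splitlines():
            if line.startswith("## "):
                # "## 3. Build and runtime commands" -> "Build and runtime commands"
                capture = line.split(". ", 1)[-1] == title
                continue
            if capture:
                lines.append(line)
        return "\n".join(lines).strip()
```

A caller would invoke `write_section` once per completed section and use `read_section` only when a later section (for example the Evidence index) needs to consult an earlier one. The sketch assumes section bodies contain no lines that begin with `## `.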
Repository Specification Format
The Repository Specification template lives at `templates/repository-specification.md`. The agent MUST produce a file matching that template, with all 19 sections present in this order:
- Repository overview
- Technology stack
- Build and runtime commands
- Major entrypoints
- Architectural layers
- Core modules
- Domain model
- Data model
- External integrations
- Control-flow traces
- Data-flow traces
- Dependency map
- Configuration map
- Testing strategy
- Deployment assumptions
- Change-impact map
- Known risks
- Unknowns and unresolved questions
- Evidence index
Empty-by-design sections explicitly state "None known" or "Not applicable" — they are never silently omitted. The Evidence index lists every distinct citation appearing in the document, de-duplicated.
Symbol accounting (no silent omission). Every function and class defined in the repository MUST be accounted for: either referenced in an analytical claim/citation, or listed under a "Symbols not yet analyzed" subsection of Core modules (grouped by file; per-file counts are acceptable on large repositories). State the counts — N defined, M analyzed, N−M listed — so a reader can see nothing was hidden. Exclude generated/vendored directories.
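The accounting rule can be checked mechanically. The sketch below is one possible completeness helper, for Python sources only, assuming the standard-library `ast` module is an acceptable way to enumerate symbols. The function names, the excluded-directory list, and the whole-word match against the spec text are all assumptions made for illustration; other languages would need their own parser.

```python
import ast
import re
from pathlib import Path

# Assumed names for generated/vendored directories; adjust per repository.
EXCLUDED_DIRS = {"vendor", "node_modules", "build", "dist", ".git", ".venv"}


def defined_symbols(repo_root):
    """Map each Python file (relative path) to the functions and classes it defines."""
    root = Path(repo_root)
    symbols = {}
    for path in sorted(root.rglob("*.py")):
        rel = path.relative_to(root)
        if EXCLUDED_DIRS.intersection(rel.parts):
            continue
        tree = ast.parse(path.read_text(encoding="utf-8"))
        nodes = [
            node
            for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        ]
        if nodes:
            # Keep source order so the listing is stable between runs.
            symbols[rel.as_posix()] = [n.name for n in sorted(nodes, key=lambda n: n.lineno)]
    return symbols


def symbol_accounting(repo_root, spec_text):
    """Return (unanalyzed symbols grouped by file, counts line)."""
    unanalyzed, defined = {}, 0
    for file, names in defined_symbols(repo_root).items():
        defined += len(names)
        # A symbol counts as analyzed if its name appears as a whole word in the spec.
        missing = [n for n in names if not re.search(rf"\b{re.escape(n)}\b", spec_text)]
        if missing:
            unanalyzed[file] = missing
    listed = sum(len(v) for v in unanalyzed.values())
    counts = f"{defined} defined, {defined - listed} analyzed, {listed} listed"
    return unanalyzed, counts
```

Matching by bare name is deliberately crude: two symbols with the same name in different files are treated as one, so a stricter helper might require the `path/to/file.ext:Symbol` citation form instead.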
Data-model diagram. When the repository has a database or persistent schema, the Data model section MUST include a fenced `` ```mermaid `` block containing an `erDiagram` of the real tables/models (their key columns and foreign-key relationships), with each entity traceable to the schema file that defines it (migration, DDL, or ORM model). When there is no persistent schema, state "Not applicable"; never draw a fabricated schema.
Presentation format. A specification is read by humans, so favour scannable structure:
- Use numbered section headings in template order: `## 1. Repository overview` … `## 19. Evidence index`.
- Render inherently tabular sections as Markdown tables with an Evidence column (the `path:line` citation) and a Label column (`[fact]`/`[inference]`/`[unknown]`): Technology stack, Dependency map, Configuration map, the field list of Data model, and Evidence index.
- For Control-flow traces and Data-flow traces, lead with a fenced `` ```mermaid `` `flowchart` depicting the steps end to end, then list the authoritative numbered steps as labeled, cited claims beneath it.
- Diagrams (`flowchart`, `erDiagram`) are visual aids and carry no citations; every step or entity a diagram shows MUST also appear as a labeled, cited line or table row, so evidence grounding is unaffected by the diagram.
Working artifacts produced by this skill live under a `.reposkillopt/` directory at the target repository root (the repository being analyzed, not the project that ships this skill), with fixed subdirectories:
- `.reposkillopt/specs/`: Repository Specifications
- `.reposkillopt/feedback/`: Feedback Items
- `.reposkillopt/rollouts/`: Rollout Logs
- `.reposkillopt/proposals/`: Skill Edit Proposals
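As a small illustration of the layout above, the following Python sketch creates the four fixed subdirectories under a target repository root. The function name `ensure_artifact_dirs` is hypothetical; only the directory names come from the list above.

```python
from pathlib import Path

# Fixed subdirectories named in the skill text.
SUBDIRS = ("specs", "feedback", "rollouts", "proposals")


def ensure_artifact_dirs(target_repo_root):
    """Create .reposkillopt/ and its fixed subdirectories if they are missing."""
    base = Path(target_repo_root) / ".reposkillopt"
    for name in SUBDIRS:
        (base / name).mkdir(parents=True, exist_ok=True)
    return base
```

The call is idempotent, so it is safe to run at the start of every session against the repository being analyzed.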
Human Feedback Loop
The Human Feedback template lives at `templates/human-feedback.md`. The Rollout Log template lives at `templates/rollout-log.md`.
When a human provides feedback against a Repository Specification:
- Record the feedback before applying it. Write a Feedback Item to `.reposkillopt/feedback/FB-YYYY-MM-DD-NNN-<slug>.md` using the template. Assign one of the eleven `type` values (correction, confirmation, missing-context, terminology, quality-rating, avoid-path, deeper-analysis, criticism-of-claim, format, detail-level, cross-agent-difference). Assign a `scope`: `repository-scoped` for facts particular to this codebase; `candidate-for-generic` for patterns that might warrant a future edit to this skill.
- Revise the current Repository Specification. Apply the feedback. For changed `**[fact]**` claims, update the citation. For superseded claims, mark them superseded in place; do not silently rewrite history. Increment the spec's `revision`, update `revised`, and append a row to the spec's Change log appendix naming the Feedback Item ids applied.
- Update the session's Rollout Log. List the Feedback Item ids under Human feedback received this session, each annotated `(applied)`, `(deferred)`, or `(withdrawn)`. List the revised spec sections under Revisions applied.
- Do not silently promote repository-specific facts into this canonical skill. Feedback marked `scope: repository-scoped` stays in the target repository's artifacts only. Promotion to canonical content requires an accepted Skill Edit Proposal, generated through the Skill Convergence Loop below.
Human feedback is used in two ways simultaneously: immediate improvement of the current Repository Specification, and longer-term input to skill convergence.
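The recording step can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, `NNN` is assumed to be a zero-padded three-digit sequence number, and the slug rule (lowercase, hyphen-separated) is a guess, since the skill text only fixes the overall `FB-YYYY-MM-DD-NNN-<slug>.md` shape and the allowed `type` and `scope` values.

```python
import re
from datetime import date

# The eleven type values and two scope values listed in the skill text.
FEEDBACK_TYPES = {
    "correction", "confirmation", "missing-context", "terminology",
    "quality-rating", "avoid-path", "deeper-analysis", "criticism-of-claim",
    "format", "detail-level", "cross-agent-difference",
}
SCOPES = {"repository-scoped", "candidate-for-generic"}


def slugify(text):
    """Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def feedback_item_path(day, seq, summary, fb_type, scope):
    """Validate type/scope and return (Feedback Item id, relative file path)."""
    if fb_type not in FEEDBACK_TYPES:
        raise ValueError(f"unknown type: {fb_type}")
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope}")
    fb_id = f"FB-{day.isoformat()}-{seq:03d}"  # assumption: NNN is zero-padded
    return fb_id, f".reposkillopt/feedback/{fb_id}-{slugify(summary)}.md"
```

Rejecting unknown `type` or `scope` values before anything is written keeps the feedback directory consistent with the template, which matters later when Feedback Item ids are cited from `**[human]**` claims.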
Skill Convergence Loop
The Skill Edit Proposal template lives at `templates/skill-edit-proposal.md`.
When recurrent feedback (typically three or more related Feedback Items across one or more sessions) suggests a weakness in this skill itself:
- Summarize the pattern. Identify the recurring shape across the supporting Feedback Items. Confirm that the pattern is generalizable (would help on other repositories or tasks), not a one-off detail of the current codebase.
- Propose one or more bounded edits. Each proposal is a single accept/reject unit, small enough that a reviewer can decide in five minutes or less (`review_time_estimate_minutes ≤ 5`). If a proposal does not fit, split it.
- Categorize each proposal. Use one of six `edit_kind` values:
  - `ADD`: add new content
  - `REPLACE`: substitute existing content
  - `DELETE`: remove content
  - `REORDER`: change order of existing content
  - `SPECIALIZE`: narrow an existing rule (more specific case)
  - `GENERALIZE`: broaden an existing rule (cover more cases)
- Mark the scope. `scope: generic` is the only kind eligible for canonical acceptance. `scope: repository-scoped` proposals MUST be rewritten to generalize, or rejected, or routed to a per-repository scope-decision artifact.
- Preserve rejected proposals. Set `status: rejected` and populate `decision_rationale`. Do not delete; the rejected proposals are part of the audit trail.
- Gate before accepting. A `scope: generic` proposal may move to `status: accepted` only after it passes a validation gate: applied to a candidate skill version, it must regenerate specifications for a held-out reference set (disjoint from the repositories whose feedback motivated it) whose per-dimension rubric scores do not regress and whose deterministic checks still pass, with the proposal's expected effect realized on at least one dimension (or explicitly waived). The run is recorded as a Validation Gate Report and referenced by the proposal. The gate authorizes, but does not replace, the acceptance flow below.
- Apply accepted proposals to the canonical skill. When `status: accepted` (with a passing gate referenced), the proposal's diff is applied to this `SKILL.md`. The canonical version is bumped per Keep-A-Changelog + semver: major if the diff breaks the adapter-equivalence checklist, minor if additive, patch if clarifying. A row is added to `skills/repo-skillopt/CHANGELOG.md`.
- Prefer edits that generalize. When a proposed change would specialize the generic skill to a single repository, flag it or rewrite it. Generalizable improvements outweigh repository-specific ones for this artifact.
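The gate and the version bump can be sketched as two small Python functions. Both are illustrations rather than the skill's actual tooling: the function names, the score layout (`{repository: {dimension: score}}`), and the change labels `"breaking"`, `"additive"`, and `"clarifying"` are hypothetical, and "expected effect realized" is interpreted here as a strict score improvement on at least one expected dimension in at least one held-out repository.

```python
def gate_passes(baseline, candidate, checks_pass, expected_dims, waived=False):
    """Return True if the candidate skill version clears the validation gate.

    baseline / candidate: {repository: {dimension: rubric score}} for the
    held-out reference set (assumed layout).
    """
    if not checks_pass:
        return False  # deterministic checks must still pass
    improved = False
    for repo, base_scores in baseline.items():
        for dim, base in base_scores.items():
            new = candidate[repo][dim]
            if new < base:
                return False  # any per-dimension regression fails the gate
            if dim in expected_dims and new > base:
                improved = True
    # The expected effect must show up somewhere, unless explicitly waived.
    return improved or waived


def bump(version, change):
    """Bump MAJOR.MINOR.PATCH according to the kind of accepted change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":    # diff breaks the adapter-equivalence checklist
        return f"{major + 1}.0.0"
    if change == "additive":
        return f"{major}.{minor + 1}.0"
    if change == "clarifying":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change}")
```

Keeping the gate a pure function of recorded scores means the same inputs can be stored in the Validation Gate Report and re-checked by a reviewer.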
Output Discipline
Every major claim in a Repository Specification (or in any rollout-produced text) carries exactly one of four label prefixes, before the claim:
- `**[fact]**`: verified by inspection. MUST be immediately followed by at least one citation in one of these forms:
  - `path/to/file.ext:line`
  - `path/to/file.ext:start-end`
  - `path/to/file.ext:Symbol`
  - `path/to/file.ext:Symbol:line`
  - `cmd: <command>` followed by `output: <verbatim output>`
- `**[inference]**`: derived from partial signal. State the basis (e.g., "basis: presence of `flask_login` import in `src/auth.py:7`").
- `**[unknown]**`: explicitly not determined. Also appears (or is referenced) under Unknowns and unresolved questions.
- `**[human]**`: provided by a human via the feedback loop. Cites the originating Feedback Item id (form: `FB-YYYY-MM-DD-NNN`).
A hypothesis presented as a fact is a defect. A fact without a citation is a defect. An unverifiable claim that is neither labeled `**[inference]**` nor `**[unknown]**` is a defect. The output is meant to be useful to a real engineer, which means trustworthy, not pretty.
Trivial recitations (literal file contents, raw config dumps, syntactic restatements) are not major claims and do not require labels.
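The label rules lend themselves to a mechanical lint. The Python sketch below checks a single claim line against them; it is an assumption-laden illustration, not a specified tool. The function name `claim_defects`, the regular expressions, and the choice to require the literal text `basis:` in an inference are all hypothetical, and the citation is searched for anywhere after the label rather than at a fixed position.

```python
import re

# Label prefix at the start of a claim.
LABEL = re.compile(r"^\*\*\[(fact|inference|unknown|human)\]\*\*")
# path:line, path:start-end, path:Symbol, path:Symbol:line
PATH_CITATION = re.compile(r"[\w./-]+\.\w+:(?:\d+(?:-\d+)?|[A-Za-z_]\w*(?::\d+)?)")
# cmd: <command> followed by output: <verbatim output>
CMD_CITATION = re.compile(r"cmd: .+?output: .+", re.S)
# Assumption: NNN is three digits.
FEEDBACK_ID = re.compile(r"\bFB-\d{4}-\d{2}-\d{2}-\d{3}\b")


def claim_defects(claim):
    """Return a list of defects for one labeled claim (empty list means clean)."""
    match = LABEL.match(claim)
    if not match:
        return ["missing label"]
    label, body = match.group(1), claim[match.end():]
    if label == "fact" and not (PATH_CITATION.search(body) or CMD_CITATION.search(body)):
        return ["fact without citation"]
    if label == "inference" and "basis:" not in body:
        return ["inference without basis"]
    if label == "human" and not FEEDBACK_ID.search(body):
        return ["human claim without Feedback Item id"]
    return []
```

A lint like this only checks form: it cannot tell whether the cited path exists or whether the cited line supports the claim, so it complements human review rather than replacing it.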