loci-post-edit

This skill merges execution-trace (timing/energy) and control-flow (CFG) analysis into a single post-edit report. It compares pre-edit and post-edit compiled artifacts to show exactly how the change affects hardware execution.
Before running the preflight steps that require it, check that the loci MCP server is connected and authenticated and that its tools are visible. If the MCP is unavailable, tell the user:
LOCI MCP server is not connected. Please run /mcp in Claude Code to manage MCP servers, then approve the loci server. If it does not appear, restart Claude Code. The plugin registers it automatically on startup.
The plugin works only when the MCP server is both connected and authenticated.

Step 0: Check session context

Read the persisted detection results from state/project-context.json in the plugin directory. This file is written once by setup.sh at session start and is the single source of truth for compiler, architecture, and build system. Do NOT re-run detection or fall back to ELF/build-system sniffing. The file is JSON with this shape:
{
  "compiler": "...",
  "build_system": "...",
  "architecture": "...",
  "loci_target": "...",
  ...
}
If the file does not exist, stop and tell the user:
LOCI session context not found. Please restart Claude Code so the plugin setup runs and detects the project environment.
Also check the system-reminder block emitted at session start for:
Target: <target>, Compiler: <compiler>, Build: <build>
LOCI target: <loci_target>
Map the LOCI target to loci MCP supported architectures and binary targets:
LOCI target | Time from CPU
aarch64     | A53
armv7e-m    | CortexM4
armv6-m     | CortexM0P
tc3xx       | TC399
If the architecture is not in this table, emit and stop:
Supported: aarch64, armv7e-m, armv6-m, tc399
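A minimal Python sketch of Step 0, assuming the plugin directory path is already known. The helper names (load_project_context, resolve_timing_cpu) and the LociContextError type are hypothetical and not part of the plugin; the file path, the loci_target key, the mapping, and the messages come from the text above.

```python
import json
from pathlib import Path

# Mapping copied from the table above (LOCI target -> timing CPU).
TIMING_CPU = {
    "aarch64": "A53",
    "armv7e-m": "CortexM4",
    "armv6-m": "CortexM0P",
    "tc3xx": "TC399",
}


class LociContextError(Exception):
    """Hypothetical error type: the caller stops and shows the message."""


def load_project_context(plugin_dir):
    """Read state/project-context.json; detection is never re-run here."""
    path = Path(plugin_dir) / "state" / "project-context.json"
    if not path.is_file():
        raise LociContextError(
            "LOCI session context not found. Please restart Claude Code so the "
            "plugin setup runs and detects the project environment."
        )
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def resolve_timing_cpu(context):
    """Return the timing CPU for the context's loci_target, or stop."""
    target = context.get("loci_target")
    if target not in TIMING_CPU:
        raise LociContextError("Supported: aarch64, armv7e-m, armv6-m, tc399")
    return TIMING_CPU[target]
```

Raising a single error type keeps both "stop and tell the user" cases in one place; the caller only has to print the message and end the run.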

Step 1: Identify pre-edit and post-edit artifacts

Locate the compiled artifact (.o or linked binary) for the edited source. Check build output directories from the project's build system, not just the source directory.
If no post-edit .o exists, compile the edited source with -c using the compiler and flags from step 0 (same as preflight Step 1). Always include -g to emit DWARF debug info (required by asm-analyze):
<compiler> -g <flags> -c <source> -o <basename>.o
Validate the .o after compilation: a standalone -c compile can exit 0 yet produce an empty object file when the source is wrapped in #if / #ifdef guards whose defines (-D) were not on the command line. After compiling, run:
<asm-analyze-cmd> extract-symbols --elf-path <basename>.o --arch <loci_target>
If the result shows 0 symbols or returns an error mentioning "no code" or "preprocessor", the target function was compiled out. In that case fall back to the existing linked binary (.elf, .out) for analysis instead. If no linked binary exists, report that standalone compilation produced an empty object and the full project build system is required.
For the pre-edit artifact: the preflight hook saves <name>.o.prev automatically. If preflight did not run (no .o.prev), proceed with absolute timing only, with no % diff.
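The artifact choices in this step can be sketched as follows. This is an illustration only: the function names are hypothetical, and the symbol count and error text are assumed to have been parsed already from the extract-symbols output, whose exact format is not shown here.

```python
from pathlib import Path


def object_is_empty(symbol_count, error_text=""):
    """True when the standalone -c compile likely compiled the function out."""
    lowered = error_text.lower()
    return symbol_count == 0 or "no code" in lowered or "preprocessor" in lowered


def choose_post_artifact(obj_path, symbol_count, error_text="", linked_binaries=()):
    """Pick the artifact to analyze, falling back to a linked binary (.elf, .out)."""
    if not object_is_empty(symbol_count, error_text):
        return Path(obj_path)
    for candidate in linked_binaries:
        if Path(candidate).is_file():
            return Path(candidate)
    # No usable fallback: the caller reports that the full build is required.
    return None


def find_pre_artifact(obj_path):
    """Return <name>.o.prev if the preflight hook saved one, else None."""
    prev = Path(str(obj_path) + ".prev")
    return prev if prev.is_file() else None
```

When find_pre_artifact returns None, the report carries absolute timing only.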

Step 2: diff-elfs — find modified/added functions

Read the "asm-analyze command:", "venv python:", and "plugin dir:" entries from the LOCI session context (system-reminder at session start). Use these as <asm-analyze-cmd>, <venv-python>, and <plugin-dir> in the commands below.
<asm-analyze-cmd> diff-elfs --elf-path <pre.o> --comparing-elf-path <post.o> --arch <loci_target>
This returns lists of modified and added functions. Only these functions need analysis; skip unchanged code entirely.
If there is no pre-edit artifact, treat all functions in the post-edit artifact as "added".
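A sketch of how the diff result can drive the rest of the analysis. It assumes the diff-elfs output has been parsed into a dict with "modified" and "added" lists; that shape and the function name plan_analysis are assumptions, not the documented output format.

```python
def plan_analysis(diff_result=None, post_functions=()):
    """Return (modified, added) function lists to analyze.

    diff_result: parsed diff-elfs output (assumed keys: "modified", "added"),
                 or None when no pre-edit artifact exists.
    post_functions: all function names in the post-edit artifact, used only
                    when there is no pre-edit artifact.
    """
    if diff_result is None:
        # No pre-edit artifact: every post-edit function counts as "added".
        return [], list(post_functions)
    modified = list(diff_result.get("modified", []))
    added = list(diff_result.get("added", []))
    return modified, added
```

Unchanged functions never appear in either list, so later steps skip them without any extra filtering.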

Step 3: extract-assembly (pre + post)

For modified functions, extract assembly from both artifacts:
<asm-analyze-cmd> extract-assembly --elf-path <pre.o> --functions <func1>,<func2> --arch <loci_target>
<asm-analyze-cmd> extract-assembly --elf-path <post.o> --functions <func1>,<func2> --arch <loci_target>
For added functions, extract from post-edit only:
<asm-analyze-cmd> extract-assembly --elf-path <post.o> --functions <new_func> --arch <loci_target>
The JSON output contains the timing_csv_chunks and timing_architecture fields needed for the MCP call. It also contains the control_flow_graph field, which holds annotated CFGs in a text format optimized for LLM analysis.
To extract these fields from the JSON output:
import json

data = json.load(...)
cfg_text = data["control_flow_graph"]                # all functions, annotated CFG blocks
timing_csv_chunks = data["timing_csv_chunks"]        # list of per-block CSV chunks for MCP
timing_architecture = data["timing_architecture"]    # timing architecture
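The extraction calls for this step can be assembled as in the sketch below. It only builds argument lists from the flags shown above and runs nothing; the function name build_extract_commands is hypothetical, and asm_analyze_cmd stands for whatever command string the session context supplies.

```python
import shlex


def build_extract_commands(asm_analyze_cmd, post_elf, arch,
                           modified=(), added=(), pre_elf=None):
    """Build argv lists for extract-assembly; nothing is executed here."""
    base = shlex.split(asm_analyze_cmd)

    def command(elf, functions):
        return base + [
            "extract-assembly",
            "--elf-path", str(elf),
            "--functions", ",".join(functions),
            "--arch", arch,
        ]

    commands = []
    if modified:
        if pre_elf is not None:
            commands.append(command(pre_elf, modified))   # pre-edit side
        commands.append(command(post_elf, modified))      # post-edit side
    if added:
        commands.append(command(post_elf, added))         # post-edit only
    return commands
```

Each returned list can be passed to a process runner as-is, which avoids shell quoting problems with function names.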

Step 4: LOCI MCP timing — compute % diff

Call mcp__loci__get_assembly_block_exec_behavior for all chunks in parallel (one call per chunk, all in the same response):
  • csv_text: the chunk
  • architecture: the timing_architecture value from step 3
IMPORTANT: Issue all chunk calls simultaneously in a single message — do NOT call them sequentially. Concatenate the result CSVs (skip duplicate headers) before computing metrics.
Do this for both pre-edit and post-edit assembly of modified functions, and for post-edit only of added functions.
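Concatenating the per-chunk results while skipping duplicate headers can be sketched like this. The sketch assumes every result CSV begins with the same header line and makes no assumption about the column names; concat_csv_results is a hypothetical helper name.

```python
def concat_csv_results(csv_texts):
    """Join per-chunk result CSVs, keeping only the first header line."""
    header = None
    rows = []
    for text in csv_texts:
        lines = [line for line in text.splitlines() if line.strip()]
        if not lines:
            continue
        if header is None:
            header = lines[0]
        # Drop this chunk's header when it repeats the first one.
        body = lines[1:] if lines[0] == header else lines
        rows.extend(body)
    if header is None:
        return ""
    return "\n".join([header] + rows) + "\n"
```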
From the MCP response, together with the annotated CFGs from step 3, compute:
  • Happy path = execution_time_ns - std_dev
  • Worst path = execution_time_ns + std_dev
  • Energy = energy_ws (report in uWs)
For modified functions, compute % diff:
diff_pct = ((post_value - pre_value) / pre_value) * 100
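The formulas above translate directly into code. The sketch below takes one function's totals as inputs; how per-block rows are combined along the CFG into those totals is not specified here and is left to the caller. The function names are hypothetical, and the conversion factor assumes energy_ws is expressed in watt-seconds.

```python
def path_metrics(execution_time_ns, std_dev, energy_ws):
    """Derive the three reported metrics from one function's totals."""
    return {
        "happy_ns": execution_time_ns - std_dev,
        "worst_ns": execution_time_ns + std_dev,
        # Assumption: energy_ws is in watt-seconds, so scale to microwatt-seconds.
        "energy_uws": energy_ws * 1e6,
    }


def diff_pct(pre_value, post_value):
    """Percent change from pre to post; None when pre_value is zero."""
    if pre_value == 0:
        return None
    return (post_value - pre_value) / pre_value * 100
```

For example, a worst path that grows from 100 ns to 142 ns is a +42% diff. Returning None for a zero baseline lets the report fall back to absolute values instead of dividing by zero.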

Graceful degradation

  • LOCI MCP unavailable — report CFG analysis only, note "(timing unavailable — MCP not connected)"
  • No pre-edit artifact — report absolute timing only, no % diff

Step 5: Emit report

Modified functions

Post-Edit: <FunctionName>

Execution (<loci_target>)

              Before       After        Diff
Happy path:   XXX.XX ns    XXX.XX ns    +X.X% | -X.X%
Worst path:   XXX.XX ns    XXX.XX ns    +X.X% | -X.X%
Energy:       XXX.XX uWs   XXX.XX uWs   +X.X% | -X.X%

Control Flow

<brief CFG analysis from step 4>

Reasoning

<implementation verification — see below>

New/added functions

Post-Edit: <FunctionName> (NEW)

Execution (<loci_target>)

Happy path:   XXX.XX ns
Worst path:   XXX.XX ns
Energy:       XXX.XX uWs

Control Flow

<CFG analysis from step 4>

Reasoning

<implementation verification — see below>

No pre-edit artifact (absolute only)

Post-Edit: <FunctionName>

Execution (<loci_target>)

Happy path:   XXX.XX ns
Worst path:   XXX.XX ns
Energy:       XXX.XX uWs
(no pre-edit artifact, showing absolute values only)

Control Flow

<CFG analysis>

Reasoning

<implementation verification — see below>

Reasoning section guidelines

The Reasoning section verifies whether the implementation is sound based on the LOCI timing, energy, and CFG data above. Address each of these:
  1. Timing impact: Is the timing diff expected given the code change? Flag unexpected regressions (e.g. a "simple guard" that adds >10% worst path). Note when the change is timing-neutral or improves performance.
  2. Hotspot check: Using the CFG and per-block timing, identify the hottest block(s). Does the new/changed code sit on the hot path? If yes, call it out.
  3. Std-dev confidence: High std_dev means the assembly pattern is underrepresented in LCLM training data. Flag any block where std_dev > execution_time_ns as low-confidence.
  4. Energy budget: Is the energy delta acceptable for the target? For battery-powered / embedded targets, flag increases above 5%.
  5. Verdict: In one line, does the implementation look correct from an execution perspective? Use OK, CAUTION (with reason), or FLAG (with specific concern).
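The numeric parts of these guidelines can be checked mechanically, as in the sketch below. It is a hedged illustration: review_concerns is a hypothetical helper, the 10% worst-path figure is the example threshold from item 1 rather than a fixed rule, and choosing between OK, CAUTION, and FLAG remains a judgment call that this code does not make.

```python
def review_concerns(worst_diff_pct=None, energy_diff_pct=None,
                    blocks=(), battery_powered=False):
    """Collect the guideline checks that can be tested numerically.

    blocks: iterable of (name, execution_time_ns, std_dev) tuples.
    Returns a list of human-readable concerns; an empty list means none fired.
    """
    concerns = []
    # Item 1: example threshold of >10% on the worst path.
    if worst_diff_pct is not None and worst_diff_pct > 10:
        concerns.append(f"worst path regressed {worst_diff_pct:+.1f}%")
    # Item 4: energy increases above 5% matter on battery-powered targets.
    if battery_powered and energy_diff_pct is not None and energy_diff_pct > 5:
        concerns.append(f"energy increased {energy_diff_pct:+.1f}%")
    # Item 3: std_dev larger than the estimate itself marks low confidence.
    for name, execution_time_ns, std_dev in blocks:
        if std_dev > execution_time_ns:
            concerns.append(f"low confidence in block {name}: std_dev > execution_time_ns")
    return concerns
```

An empty result does not by itself mean OK; the hotspot check in item 2 still needs the CFG to be read.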

Action on CAUTION or FLAG

When the verdict is CAUTION or FLAG, do not just report — act on it:
  1. Propose a fix — based on the LOCI timing, energy, and CFG data, describe a specific code change that would resolve the concern (e.g., cache a result, use a lighter callee, move work off the hot path, flatten the call chain).
  2. Ask the user — present the concern and proposed fix, and ask whether to apply the rewrite. Do not silently proceed or ignore the finding.
Example:
Verdict: FLAG — worst path regressed +42% due to new snprintf call on hot path.
Proposed fix: replace snprintf with a bounded itoa + memcpy (saves ~180 ns worst case).
Apply this rewrite? [user decides]

LOCI voice remark

Before the footer, add one short LOCI voice remark (max 15 words) that acknowledges the user's work grounded in a specific number from the analysis. Attribute improvements to the user ("clean work", "smart move", "tight code"). For concerns, be honest and constructive with specifics. Skip if the analysis produced no results or the user needs raw data only.

LOCI footer

After emitting all per-function reports, append this footer once as the very last thing printed — only if N > 0. If no functions were processed, do NOT emit the footer.
Record cumulative stats (run via Bash before rendering the footer):
<venv-python> <plugin-dir>/lib/loci_stats.py record --skill post-edit --functions <N> --mcp-calls <M> --co-reasoning <R>
Read cumulative summary (run via Bash; capture output):
<venv-python> <plugin-dir>/lib/loci_stats.py summary
Render the footer — include the summary line only if the command produced output:
─── LOCI · post-edit ───────────────────
  <N> functions · <M> MCP calls · <R> co-reasoning
  Verdict: <OK | CAUTION | FLAG> — <one-line summary>
    <cumulative-summary-output>        ← omit if empty
────────────────────────────────────────
  • N = unique functions (modified + added) whose assembly was sent to LOCI
  • M = MCP calls to mcp__loci__get_assembly_block_exec_behavior (exec-behaviors) (typically 2 for modified functions: pre + post; 1 for added functions)
  • R = co-reasoning (one per function that has a Reasoning section)
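The footer rules can be sketched as a small rendering function. This is an illustration under assumptions: render_footer is a hypothetical name, the 40-column rule width approximates the template above, and the cumulative text is whatever the summary command printed.

```python
def render_footer(n, m, r, verdict, summary, cumulative=""):
    """Return the footer text, or an empty string when no functions were processed."""
    if n <= 0:
        return ""
    lines = [
        "─── LOCI · post-edit ".ljust(40, "─"),
        f"  {n} functions · {m} MCP calls · {r} co-reasoning",
        # \u2014 is the dash used between verdict and summary in the template.
        f"  Verdict: {verdict} \u2014 {summary}",
    ]
    if cumulative.strip():
        # Included only when the summary command produced output.
        lines.append(f"    {cumulative.strip()}")
    lines.append("─" * 40)
    return "\n".join(lines)
```

Returning an empty string for N = 0 lets the caller print the result unconditionally.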