ton-bug-triage

TON Bug Triage

Use this skill when the job is to prove something on a local tontester network, not just to launch validators.
Typical triggers:
  • deploy a contract and trigger a bug with an internal message
  • compare baseline and probing validator builds
  • verify a crash, liveness failure, malformed-packet path, or compatibility claim
  • collect maintainer-ready evidence for a local TON repro
The standard is: choose the smallest topology that answers the question, define the success condition before running, and collect enough evidence that the result is interpretable.

Working Model

Keep these paths distinct:
  • Skill scripts: files under this skill directory, such as scripts/run_basic_network.py
  • Source tree: the real TON checkout passed as --repo-root
  • Build directory: binaries and libraries such as validator-engine, create-state, tonlibjson, and tolk
  • Work directory: per-run state, logs, configs, and emitted artifacts
Do not assume the skill directory and the repo are the same thing. The scripts live in the skill. They operate on the repo and build you pass in.
wallet-env.txt is the main handoff artifact between the launcher and follow-up helpers.
These helpers depend on tontester internals and private APIs. If tontester changes, expect to adjust helper behavior, generated bindings, or command assumptions.
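One way to keep the four locations from blurring together is to carry them in a single record and check that no two resolve to the same directory. This is a minimal sketch, not part of the skill: the RunPaths class, its field names, and the overlaps method are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import combinations
from pathlib import Path

# Hypothetical field names for the four locations described above.
_FIELDS = ("skill_dir", "repo_root", "build_dir", "work_dir")


@dataclass(frozen=True)
class RunPaths:
    skill_dir: Path  # where the helper scripts live
    repo_root: Path  # real TON checkout, passed to helpers as --repo-root
    build_dir: Path  # validator-engine, create-state, tonlibjson, tolk
    work_dir: Path   # per-run state, logs, configs, emitted artifacts

    def overlaps(self) -> list[tuple[str, str]]:
        """Return every pair of fields that resolve to the same directory."""
        resolved = {name: Path(getattr(self, name)).resolve() for name in _FIELDS}
        return [(a, b) for a, b in combinations(_FIELDS, 2)
                if resolved[a] == resolved[b]]
```

An empty list from overlaps() means the four paths are distinct. A non-empty list names the pairs to separate before launching anything.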

Workflow Selection

Choose one workflow before running anything:
  • Workflow A — trigger via transaction
    Use this when the bug is reached by deploying a contract, sending an internal message, or delivering a malformed/custom payload to a contract account.
  • Workflow B — trigger via validator behavior
    Use this when the bug requires patched validator code, mixed builds, consensus interference, malformed protocol packets, reordered traffic, or deliberately invalid block behavior.
If a repro touches both, ask one question first: can you trigger it on an unmodified network after a normal deploy/send path? If yes, start with Workflow A. If no, treat it as Workflow B.
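The selection rule above can be written down as a small decision function. This is a sketch only: the Workflow enum, the function name, and the boolean parameters are assumptions introduced for illustration.

```python
from enum import Enum


class Workflow(Enum):
    A = "trigger via transaction"
    B = "trigger via validator behavior"


def choose_workflow(uses_transaction_trigger: bool,
                    uses_validator_behavior: bool,
                    reproduces_on_unmodified_network: bool = False) -> Workflow:
    """Pick a workflow; the third argument only matters when a repro touches both."""
    if uses_transaction_trigger and uses_validator_behavior:
        # Tie-breaker: can a normal deploy/send path trigger it on an unmodified network?
        return Workflow.A if reproduces_on_unmodified_network else Workflow.B
    if uses_transaction_trigger:
        return Workflow.A
    if uses_validator_behavior:
        return Workflow.B
    raise ValueError("no trigger path identified; define the trigger first")
```

For example, choose_workflow(True, True, reproduces_on_unmodified_network=False) returns Workflow.B, matching the "if no, treat it as Workflow B" rule.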

Core Rules

  1. Start with the smallest network that can answer the question. Use one validator unless the path needs peers, consensus, or mixed builds.
  2. Prefer ordinary deploy/send flow over zerostate edits. If the bug only reproduces after zerostate mutation, say that clearly.
  3. Keep baseline and probing builds separate. Usually vanilla-build/ is baseline and build/ or build-probing/ is the modified build.
  4. Decide the success condition before running. Examples: target mc_seqno, explicit crash marker, process death, active contract account, inspected transaction, or honest-node rejection of a malformed packet.
  5. For Workflow B, make the probing node self-immune. Operational meaning: the probing node may emit malformed or adversarial behavior, but it must stay alive until the target effect is observed on the honest nodes. A run is invalid if the probing node dies first and that death could explain the outcome.
  6. Record enough evidence to rerun the exact scenario. Keep the run directory, the build directories, the commands, the relevant addresses, and the exported artifacts.
  7. Before calling something maintainer-ready, rerun it from a clean checkout or detached worktree with only the intended artifacts.
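Rule 5 reduces to an ordering check between two events. The sketch below is deliberately conservative: it rejects every run where the probing node died before the effect was observed, without trying to judge whether that death "could explain the outcome". The function name and the timestamp inputs are hypothetical; how you extract those times from logs is up to you.

```python
from typing import Optional


def probing_run_is_valid(probing_died_at: Optional[float],
                         effect_seen_at: Optional[float]) -> bool:
    """Return True if the probing node outlived the observation of the target effect.

    Both arguments are timestamps on the same clock; None means the event
    never happened during the run.
    """
    if probing_died_at is None:
        return True   # probing node stayed alive for the whole run
    if effect_seen_at is None:
        return False  # probing node died and the effect was never observed
    # Valid only if the honest-node effect was seen strictly before the death.
    return effect_seen_at < probing_died_at
```

A run where the probing node survives and no effect appears is still valid under this check; whether it counts as a negative result is a separate question.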

Core Scripts

Prefer the bundled scripts over one-off shell sequences.
  • scripts/run_basic_network.py
    Launch one or more validators from a single build. Supports --emit-wallet-env, --base-port, --validators, and --keep-alive.
  • scripts/run_mixed_network.py
    Launch baseline and probing validators from different build directories. Use this for malicious-vs-honest experiments, log-based crash detection, and probing-node survival checks.
  • scripts/compile_tolk.py
    Compile a .tolk source to .fif and materialize the contract code BoC. It hard-fails if the tolk binary is stale relative to repo HEAD.
  • scripts/build_stateinit.py
    Build a deployable StateInit BoC from code plus optional data and library-dictionary BoCs.
  • scripts/run_fift_script.py
    Run a .fif script with the correct include paths.
  • scripts/wallet_send.py
    Build and optionally send a wallet-signed message. Use --init-boc for deployment and --body-boc for arbitrary internal payloads.
  • scripts/send_boc.py
    Send a prebuilt serialized external message BoC through tonlib without rebuilding it via wallet_send.py.
  • scripts/address_info.py
    Normalize and inspect raw and friendly TON address forms when helper inputs need to be cross-checked.
  • scripts/account_state.py
    Fetch raw account state and dump code/data BoCs for inspection.
  • scripts/get_method.py
    Run a get-method through tonlib and print JSON.
  • scripts/inspect_latest_transaction.py
    Fetch the latest account transaction, export raw transaction data, and save message body/init-state BoCs when tonlib returns them as msg.dataRaw.
  • scripts/run_liteclient.py
    Run a lite-client command using wallet-env.txt or explicit repo/build/config inputs.
  • scripts/dump_boc.py
    Print a BoC cell tree through Fift. Use this for payloads, StateInit, exported transaction data, and dumped account code/data.
  • scripts/summarize_run.py
    Summarize node liveness and crash markers for a finished or live run directory.
  • scripts/demo_wallet_flow.py
    Known-good end-to-end verifier for the simple wallet path.

Workflow A

Use Workflow A when the trigger is “deploy contract, then send message”.
  1. Launch a small network with --emit-wallet-env.
  2. Compile your contract with scripts/compile_tolk.py, or provide a hand-built code BoC from Fift assembly if you are not using Tolk.
  3. Build deployable StateInit with scripts/build_stateinit.py. Pass --data-boc and --library-boc only when the contract actually needs them.
  4. Deploy from the funded built-in main-wallet with scripts/wallet_send.py --init-boc. A deploy message that uses --init-boc may also invoke the contract's internal message handler. Check whether the deploy transaction itself changed contract state before sending a separate trigger.
  5. Wait for activation. Prefer account_state.py plus a contract-specific get-method over assuming one masterchain advance is enough.
  6. Build the trigger body as a BoC when the payload is custom or binary. run_fift_script.py is the easiest way to emit a one-off body.boc.
  7. Send the trigger with scripts/wallet_send.py --body-boc. --body-boc is the authoritative payload path and overrides --comment. If you already have a signed external message BoC, use scripts/send_boc.py --boc instead of rebuilding it with wallet_send.py.
    --wait-mc-advance proves the network is alive. It does NOT prove the transaction succeeded, the deploy activated, or the contract state changed. Always inspect the target account directly.
  8. Observe with inspect_latest_transaction.py --out-dir and dump_boc.py. The exported transaction-data.boc and in-msg-body.boc are the starting point for cell-level debugging. If the tonlib-based helpers crash (common symptom: KeyError: '@extra'), use run_liteclient.py as the fallback for all observation steps. See references/liteclient_fallback.md.
For the full worked sequence, read references/contract_deploy_flow.md.
For the simple wallet smoke path only, read references/wallet_and_deploy_helpers.md.
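Step 5 asks for an explicit activation check rather than a fixed wait. A generic polling loop is one way to express that. In this sketch the check callable is a placeholder you supply, for example a wrapper that runs account_state.py and a get-method and returns True once the account is active. The wait_until name and its parameters are assumptions, not part of the skill.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout_s: float = 60.0,
               interval_s: float = 1.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or timeout_s elapses.

    Returns True on success and False on timeout. check() is always
    called at least once, even with a zero timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Passing clock and sleep as parameters keeps the loop testable without real delays. A False return means "not observed active within the timeout", which is a reason to inspect the account, not proof that the deploy failed.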

Workflow B

Use Workflow B when the trigger is “modify validator behavior, then observe honest-node reaction”.
  1. Snapshot the baseline build before probing changes.
  2. Patch the probing build only, and gate every behavioral change behind explicit TON_PROBING_* environment variables.
  3. Choose a mixed topology and explicit success and failure conditions. Use --success-log, --failure-log, --require-node-alive, and --require-node-dead deliberately.
  4. Run the mixed network with scripts/run_mixed_network.py.
  5. Confirm both sides of the claim: probing markers were hit, and the honest-node effect happened or did not happen.
  6. Summarize the run with scripts/summarize_run.py, then inspect specific logs manually only where the summary points.
For common patch shapes, read references/probing_patterns.md.
For an end-to-end example, read references/bug_hunt_example.md.
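Step 5 has two halves, and a run only says something when the first half holds. The sketch below classifies a run from log text. It assumes you have already read the logs into strings and chosen the two marker substrings yourself; the function name, the parameters, and the returned labels are illustrative and do not reflect the output of summarize_run.py.

```python
from __future__ import annotations


def classify_mixed_run(probing_log: str,
                       honest_logs: dict[str, str],
                       probing_marker: str,
                       effect_marker: str) -> str:
    """Classify a Workflow B run from log text (all inputs supplied by the caller)."""
    if probing_marker not in probing_log:
        # The patched path never ran, so honest-node behavior proves nothing.
        return "inconclusive: probing marker missing"
    affected = sorted(name for name, text in honest_logs.items()
                      if effect_marker in text)
    if affected:
        return "effect observed on: " + ", ".join(affected)
    return "probing marker hit, no honest-node effect"
```

The check is substring-based on purpose. Node liveness is not covered here and still has to be confirmed separately.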

Negative Result

A valid “did not reproduce” is still useful evidence. Treat a run as a real negative result only if all of the following are true:
  • the selected workflow actually executed its trigger path
  • the relevant probing or trigger markers were present
  • the network stayed healthy enough that the absence of the bug is meaningful
  • the observation step looked at the right account, transaction, or node logs
  • the timeout or seqno target was reached without the target effect
Stop iterating and report a negative result when a clean rerun gives the same outcome and the remaining changes are only minor parameter churn.
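The five conditions can be kept as an explicit checklist so that a "did not reproduce" report states which of them were verified. This is a sketch with hypothetical names; each field is filled in by hand from your own inspection of the run.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class NegativeResultChecks:
    trigger_path_executed: bool
    markers_present: bool
    network_healthy: bool
    observed_right_target: bool
    limit_reached_without_effect: bool  # timeout or seqno target hit, no target effect

    def failed(self) -> list[str]:
        """Names of the conditions that do not hold."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_real_negative(self) -> bool:
        """True only when all five conditions hold."""
        return not self.failed()
```

If failed() returns anything, the run is better described as inconclusive than as a negative result.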

Iteration And Debugging

  • If compile_tolk.py fails with a stale-build error, rebuild tolk first. The exact fix is usually ninja -C <build-dir> tolk.
  • If a deploy appears to succeed but the destination still looks empty, check too-early observation first. Wait for activation instead of assuming the deploy path is broken.
  • If a contract is active but the expected get-method fails, verify the method id and the initial data layout before changing the network topology.
  • If inspect_latest_transaction.py shows the wrong payload, dump both the original trigger BoC and the exported in-msg-body.boc and compare their cell trees.
  • If probing markers are missing in Workflow B, the environment variables did not reach the node or the patched code path never executed.
  • If the probing node dies before the target effect is observed, the run is invalid.
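For the wrong-payload case, comparing two cell-tree dumps by eye is error-prone. Assuming you have saved the text output of dump_boc.py for each BoC, a line diff shows where the trees diverge. The sketch below treats the dumps as opaque text, makes no assumption about the dump format, and uses only the standard library; the function name and default labels are illustrative.

```python
from __future__ import annotations

import difflib


def diff_cell_dumps(expected_dump: str,
                    actual_dump: str,
                    expected_name: str = "trigger-body",
                    actual_name: str = "in-msg-body") -> list[str]:
    """Return unified-diff lines between two dumps; an empty list means identical."""
    return list(difflib.unified_diff(
        expected_dump.splitlines(),
        actual_dump.splitlines(),
        fromfile=expected_name,
        tofile=actual_name,
        lineterm="",
    ))
```

An empty result means the dumps match line for line. Otherwise, lines prefixed with - come from the original trigger dump and lines prefixed with + from the exported one.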

Evidence Standard

Record enough to support the claim:
  • exact run directory
  • build directories used
  • success condition used
  • relevant launcher and helper commands
  • target addresses or node names
  • exported transaction and BoC artifacts for Workflow A
  • node liveness and log markers for Workflow B
For Workflow A, remember that --wait-mc-advance is only a liveness hint; confirm success from the target account or transaction.
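The evidence list maps naturally onto a small record that can be saved next to the run directory. This is a sketch: the EvidenceRecord class, its field names, and the JSON layout are assumptions, not a format the skill's scripts read or write.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class EvidenceRecord:
    run_dir: str
    build_dirs: list[str]
    success_condition: str
    commands: list[str]  # launcher and helper commands, as run
    targets: list[str]   # target addresses or node names
    artifacts: list[str] = field(default_factory=list)    # Workflow A: exported transaction/BoC files
    log_markers: list[str] = field(default_factory=list)  # Workflow B: liveness and log markers

    def to_json(self) -> str:
        """Serialize with stable key order so records diff cleanly between reruns."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Writing the record before the run, with success_condition already filled in, also enforces the rule that the success condition is decided up front.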

Troubleshooting

  • If python is missing, use python3.
  • If tonapi bindings are missing, the helpers usually generate them from test/tontester/generate_tl.py.
  • If a helper is copied outside the repo, keep passing the real repo as --repo-root; includes and generated artifacts come from the repo, not the skill folder.
  • If tonlib-based scripts (account_state.py, get_method.py, inspect_latest_transaction.py, wallet_send.py --auto-seqno) crash with KeyError: '@extra' or similar tonlib wrapper errors, the local tonlib build has a compatibility issue. Fall back to lite-client for inspection and use wallet_send.py with manual --seqno instead of --auto-seqno. See references/liteclient_fallback.md for the shared fallback flow.
  • If vanilla-build/CMakeCache.txt points at build/, treat vanilla-build as invalid and recreate it. Do not trust a build tree that reconfigures into the wrong directory.
  • Do not assume lite-client is present in the passed build directory. Some checkouts only have create-state, tonlibjson, and tolk built.
  • If disk is full, clear old run-* directories before retrying.
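The CMakeCache.txt check can be automated. CMake records the directory a cache was created in as the CMAKE_CACHEFILE_DIR entry, so one reasonable test is whether that entry resolves to the build directory you are about to use. This is a sketch; the function names are illustrative, and a copied build tree may fail in other ways this check does not detect.

```python
from pathlib import Path
from typing import Optional


def recorded_cache_dir(cache_text: str) -> Optional[str]:
    """Extract the CMAKE_CACHEFILE_DIR value from CMakeCache.txt contents."""
    for line in cache_text.splitlines():
        if line.startswith("CMAKE_CACHEFILE_DIR:") and "=" in line:
            return line.split("=", 1)[1].strip()
    return None


def build_tree_is_consistent(build_dir: Path) -> bool:
    """True if build_dir/CMakeCache.txt says it belongs to build_dir itself."""
    cache = build_dir / "CMakeCache.txt"
    if not cache.is_file():
        return False
    recorded = recorded_cache_dir(cache.read_text(errors="replace"))
    return recorded is not None and Path(recorded).resolve() == build_dir.resolve()
```

A False result for vanilla-build is the cue to recreate it rather than reconfigure in place.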

References

  • references/wallet_and_deploy_helpers.md
    Read this for the proven simple-wallet smoke path.
  • references/contract_deploy_flow.md
    Read this for the generic Workflow A compile, deploy, trigger, inspect path.
  • references/liteclient_fallback.md
    Read this when tonlib-based helpers fail and Workflow A needs lite-client inspection or manual wallet seqno handling.
  • references/probing_patterns.md
    Read this when designing Workflow B code changes.
  • references/diagnostic_checklist.md
    Read this when a run completed but the outcome is unclear.
  • references/bug_hunt_example.md
    Read this for a concrete mixed-network negative-result example.