# PT2 Bug Basher (`pt2-bug-basher`)

Debug test failures and runtime errors in the PyTorch 2 compiler stack (Dynamo, Inductor, AOTAutograd, FX graphs).

## Workflow Summary

1. **Reproduce** -- Get a consistent reproduction of the failure.
2. **Minimize** -- Reduce the repro to the smallest possible standalone case. Strip away unrelated model logic, use minimal tensor shapes, and isolate the specific op or pattern that triggers the bug.
3. **Add a unit test** -- Do this BEFORE diving into code search or root cause investigation. Add a failing test to the codebase that captures the bug. Place it in a specific, topic-appropriate test file (e.g., `test/dynamo/test_repros.py`, `test/inductor/test_torchinductor.py`, `test/export/test_export.py`). Avoid `test/dynamo/test_misc.py` -- it is already oversized; find a more specific test file that matches the area of the bug. Use `torch.testing._internal.common_utils.TestCase` and `run_tests`. The test must fail before the fix and pass after. Having the test first keeps you grounded -- you know exactly what "fixed" looks like before you start exploring the codebase.
4. **Gather logs** -- Run with appropriate `TORCH_LOGS` settings.
5. **Classify** -- Use the Error Triage table to identify the category.
6. **Inspect artifacts** -- Check FX graphs, IR, and generated code via `TORCH_COMPILE_DEBUG=1`.
7. **Identify root cause** -- Trace from the error back through the compilation pipeline.
8. **Fix** -- Apply the fix.
9. **Verify** -- Run the new unit test AND nearby related existing tests (e.g., if you changed how `is_exporting` works, also run the existing `test_is_exporting` export test). Use `pytest -k` to quickly run related tests by name. The task is not complete until all pass.
10. **Self-review** -- Use the `/pr-review` skill to review your own changes before presenting them. Fix any issues it flags.
11. **Celebrate** -- Summarize the changes: explain the root cause, what was changed and why, and which tests were added/verified. Then tell the user the bug is squashed. Include a fun, varied motivational message or easter egg to keep spirits high (e.g., a pun, a quote, an ASCII art bug getting squashed). Keep it short and different each time.

## Investigation Strategy

### Prefer direct tools over `meta_codesearch`

Use `Grep`, `Glob`, and `Read` directly for code exploration. Do not spawn `meta_codesearch` agents -- they are slow and expensive. The Architectural Knowledge and Key Source Files sections below should give you enough context to know where to look. A targeted `Grep` for a function name is always faster.

### Know which compilation mode you're in

Before reading implementation code, determine the compilation mode. These share code but diverge in important ways:

- **`torch.compile`** -- Dynamo + Inductor. `tx.export=False`, no `_compiling_state_context()`.
- **`torch.export` (strict)** -- `tx.export=True`, `_compiling_state_context()` active.
- **`torch.export` (non-strict, the default)** -- Uses Dynamo via `fullgraph_capture`, but `tx.export` may differ from strict. `_compiling_state_context()` active. Check `torch._export.config.use_new_tracer_experimental` -- it changes which code path is used.

### Distinguish trace-time vs runtime

Many PT2 bugs come from confusing these two:

- **Trace-time**: Inside Dynamo's symbolic interpreter. Dynamo intercepts function calls and may constant-fold them (e.g., `is_exporting()` folds to `ConstantVariable(True)`).
- **Runtime**: Real tensors, real Python calls, module-level flags like `torch.compiler._is_exporting_flag`.

When debugging, add temporary `print()` statements directly in the source file rather than monkey-patching from outside -- dispatch chains make monkey-patching unreliable.

## Gathering Information

Pick the right diagnostic tool based on the error category:

- Quick overview: `TORCH_LOGS="+dynamo,graph_breaks,recompiles" python your_script.py`
- Full debug artifacts: `TORCH_COMPILE_DEBUG=1 python your_script.py` -- creates `torch_compile_debug/` with FX graphs, Inductor IR, and generated code
- Generated code only: `TORCH_LOGS="output_code" python your_script.py`
- Structured tracing: `TORCH_TRACE=/path/to/trace python your_script.py`, then `tlparse /path/to/trace`
- Single-threaded (for pdb): `TORCHINDUCTOR_COMPILE_THREADS=1 python your_script.py`

## Error Triage

Classify the failure using the error message and traceback:

| Error Pattern | Category | Jump To |
|---|---|---|
| `Unsupported: ...` or `graph break` in logs | Graph break | Graph Breaks |
| `BackendCompilerFailed` | Inductor/backend crash | Backend Failures |
| `RecompileError` or `cache_size_limit` | Recompilation | Recompilation |
| Accuracy mismatch / wrong numerical output | Accuracy | Accuracy |
| `InternalTorchDynamoError` | Dynamo bug | Internal Errors |
| Segfault or CUDA IMA | Runtime crash | Runtime Crashes |
| Triton assertion / index out of bounds | Triton kernel bug | Triton Failures |

## Debugging by Category

### Graph Breaks

Graph breaks split the compiled graph into smaller subgraphs, often causing performance regressions or unexpected behavior.

**Diagnosis:**

```bash
TORCH_LOGS="graph_breaks" python your_script.py
```

**Key files:**
- `torch/_dynamo/exc.py` -- `Unsupported` exception class
- `torch/_dynamo/variables/` -- where most graph break decisions happen

**Common causes:**
- Unsupported Python constructs (data-dependent control flow, unsupported builtins)
- Tensor operations that can't be traced (in-place ops on inputs, unsupported dtypes)
- Calls to non-traceable functions

**Fix approach:**
1. Read the graph break message to identify the unsupported operation
2. Check if there's a decomposition or supported alternative
3. If the operation genuinely can't be traced, consider `torch._dynamo.allow_in_graph` or restructuring user code

### Backend Compiler Failures

`BackendCompilerFailed` means Inductor (or another backend) crashed during compilation.

**Diagnosis:**

```bash
TORCHDYNAMO_REPRO_AFTER=aot TORCHDYNAMO_REPRO_LEVEL=2 python your_script.py
```

This generates `minifier_launcher.py`, which isolates the minimal failing graph.

**Key files:**
- `torch/_dynamo/repro/after_aot.py` -- repro/minifier for post-AOT failures
- `torch/_inductor/` -- the backend itself

**Fix approach:**
1. Run the minifier to get a minimal reproduction
2. Inspect the FX graph (`TORCH_COMPILE_DEBUG=1`) to understand what ops are involved
3. Check if it's a lowering issue (`torch/_inductor/lowering.py`), a scheduling issue, or a codegen issue
4. Look at the generated output code if the error is in codegen

### Recompilation Issues

Excessive recompilation happens when guards are too specific, causing cache misses.

**Diagnosis:**

```bash
TORCH_LOGS="recompiles,recompiles_verbose,guards" python your_script.py
```

**Key config:**
- `torch._dynamo.config.recompile_limit` (default: 8)
- `torch._dynamo.config.fail_on_recompile_limit_hit` -- set to `True` to get a hard error

**Common causes:**
- Changing tensor shapes without marking them dynamic
- Python scalar values that change between calls
- Global state mutations between calls

**Fix approach:**
1. Read the recompilation reason from the logs
2. Identify the failing guard
3. Either mark the relevant dimension as dynamic with `torch._dynamo.mark_dynamic()` or fix the source of guard instability

### Accuracy Issues

The compiled model produces different numerical results than eager mode.

**Diagnosis:**

```bash
TORCHDYNAMO_REPRO_AFTER=aot TORCHDYNAMO_REPRO_LEVEL=4 python your_script.py
```

This compares compiled vs. eager with an fp64 reference and dumps a repro if accuracy fails.

**Key utilities:**
- `torch/_dynamo/debug_utils.py` -- `same_two_models()`, `backend_accuracy_fails()`, `cast_to_fp64()`
- `torch._dynamo.config.repro_tolerance` (default: 1e-3)

**Fix approach:**
1. Get the minimal failing graph from the minifier
2. Compare eager vs. compiled output at fp64 precision
3. Binary search through ops to find the diverging operation
4. Check for known numerical issues (reduction order, fused kernels, dtype promotions)
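The fp64-reference comparison in step 2 can also be done by hand with `torch.testing.assert_close` (the function and tolerances here are illustrative):

```python
# Hand-rolled fp64 reference check: compare a compiled fp32 run against
# the same computation in float64, with tolerances matching the default
# repro_tolerance of 1e-3.
import torch


def fn(x):
    return (x * 3).sum()


x = torch.randn(64)
reference = fn(x.double())                        # fp64 ground truth
compiled = torch.compile(fn, backend="eager")(x)  # fp32 compiled run
torch.testing.assert_close(
    compiled.double(), reference, rtol=1e-3, atol=1e-3
)
```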

### Internal Dynamo Errors

`InternalTorchDynamoError` indicates a bug in Dynamo itself.

**Diagnosis:**

```bash
TORCHDYNAMO_VERBOSE=1 python your_script.py
```

or equivalently:

```bash
TORCH_LOGS="+dynamo" python your_script.py
```

**Key files:**
- `torch/_dynamo/symbolic_convert.py` -- bytecode interpreter
- `torch/_dynamo/variables/` -- variable tracking system
- `torch/_dynamo/guards.py` -- guard generation

**Fix approach:**
1. Get the full stack trace with `TORCHDYNAMO_VERBOSE=1`
2. Identify which bytecode instruction or variable type caused the crash
3. Create a minimal repro (the error message often includes a minifier path)
4. Debug with `TORCHINDUCTOR_COMPILE_THREADS=1` and pdb if needed

### Runtime Crashes

Segfaults and CUDA illegal memory access (IMA) errors during execution of compiled code.

**Diagnosis (make the crash deterministic):**

```bash
PYTORCH_NO_CUDA_MEMORY_CACHING=1 CUDA_LAUNCH_BLOCKING=1 python your_script.py
```

For CUDA IMA, add NaN checks:

```bash
TORCHINDUCTOR_NAN_ASSERTS=1 python your_script.py
```

For Inductor-level sync debugging:

```python
torch._inductor.config.triton.debug_sync_kernel = True  # sync after every kernel
torch._inductor.config.triton.debug_sync_graph = True   # sync before/after graph execution
```

**Fix approach:**
1. Make the crash deterministic with `PYTORCH_NO_CUDA_MEMORY_CACHING=1 CUDA_LAUNCH_BLOCKING=1`
2. Check if it's an input mismatch (shapes, devices, dtypes)
3. Inspect the generated kernel code with `TORCH_LOGS="output_code"`
4. Use `TORCHINDUCTOR_NAN_ASSERTS=1` to find the first kernel producing bad values
5. Check for dynamic shapes issues (historically a common source of IMA)

Triton Kernel Failures

Triton内核故障

Triton assertion failures or index-out-of-bounds errors in generated kernels.

**Diagnosis:**

```bash
TORCH_LOGS="output_code,schedule" python your_script.py
```

**Key files:**
- `torch/_inductor/codegen/triton.py` -- Triton codegen
- `torch/_inductor/scheduler.py` -- kernel fusion decisions

**Fix approach:**
1. Get the generated Triton kernel from the `output_code` logs
2. Check index computations for off-by-one errors or wrong stride calculations
3. Look at the IR (`TORCH_COMPILE_DEBUG=1`) to trace back to the FX op
4. Check if fusion decisions created invalid index combinations

## Key Source Files

| File | Purpose |
|---|---|
| `torch/_dynamo/exc.py` | Exception hierarchy and error formatting |
| `torch/_dynamo/debug_utils.py` | Minifier support, accuracy checking, input serialization |
| `torch/_dynamo/repro/after_dynamo.py` | Repro/minifier for Dynamo-stage failures |
| `torch/_dynamo/repro/after_aot.py` | Repro/minifier for post-AOTAutograd failures |
| `torch/_dynamo/repro/aoti.py` | Repro/minifier for AOTI failures |
| `torch/_dynamo/config.py` | Dynamo config (repro levels, recompile limits) |
| `torch/_dynamo/variables/torch.py` | Torch function handling, tracing state functions |
| `torch/_dynamo/variables/higher_order_ops.py` | HOP tracing (cond, map, etc.) |
| `torch/_dynamo/symbolic_convert.py` | Bytecode interpreter, InstructionTranslator |
| `torch/_dynamo/convert_frame.py` | Frame compilation, `fullgraph_capture` entry point |
| `torch/_dynamo/functional_export.py` | New export tracer (`_dynamo_graph_capture_for_export`) |
| `torch/_dynamo/eval_frame.py` | `torch._dynamo.export`, `optimize_assert` |
| `torch/_export/_trace.py` | Export pipeline (`_export`, `_strict_export`, `_non_strict_export`, `_export_to_aten_ir`) |
| `torch/_export/utils.py` | `_compiling_state_context()` |
| `torch/compiler/__init__.py` | `is_compiling()`, `is_exporting()`, runtime flags |
| `torch/_higher_order_ops/cond.py` | `torch.cond` implementation and proxy tracing |
| `torch/_higher_order_ops/utils.py` | `reenter_make_fx` for HOP branch tracing |
| `torch/_inductor/config.py` | Inductor config (debug flags, trace settings) |
| `torch/_inductor/debug.py` | DebugContext, graph visualization, IR logging |
| `torch/_logging/_registrations.py` | All registered log aliases and artifacts |

## Using the Minifier

The minifier reduces a failing graph to the smallest reproduction.

**Step 1: Generate the minifier launcher**

```bash
TORCHDYNAMO_REPRO_AFTER=aot TORCHDYNAMO_REPRO_LEVEL=2 python your_script.py
```

**Step 2: Run the minifier**

```bash
python minifier_launcher.py minify
```

**Step 3: Run the minimized repro**

```bash
python minifier_launcher.py run
```

For accuracy issues, use level 4:

```bash
TORCHDYNAMO_REPRO_AFTER=aot TORCHDYNAMO_REPRO_LEVEL=4 python your_script.py
python minifier_launcher.py run
```