dse-loop


DSE Loop: Autonomous Design Space Exploration

Autonomously explore a design space: run → analyze → pick next parameters → repeat, until the objective is met or the timeout is reached. Designed for computer architecture and EDA problems.

Context: $ARGUMENTS

Safety Rules — READ FIRST

NEVER do any of the following:
  • `sudo` anything
  • `rm -rf`, `rm -r`, or any recursive deletion
  • `rm` any file you did not create in this session
  • Overwrite existing source files without reading them first
  • `git push`, `git reset --hard`, or any destructive git operation
  • Kill processes you did not start
If a step requires any of the above, STOP and report to the user.

Constants (override via $ARGUMENTS)

| Constant | Default | Description |
|----------|---------|-------------|
| TIMEOUT | 2h | Total wall-clock budget. Stop exploring after this. |
| MAX_ITERATIONS | 50 | Hard cap on number of design points evaluated. |
| PATIENCE | 10 | Stop early if no improvement for this many consecutive iterations. |
| OBJECTIVE | minimize | `minimize` or `maximize` the target metric. |

Override inline:
/dse-loop "task desc — timeout: 4h, max_iterations: 100, patience: 15"
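
As a rough illustration of how inline overrides could be merged with the defaults, the sketch below scans the argument string for `key: value` pairs. The function names `parse_duration` and `parse_overrides`, the derived `timeout_seconds` key, and the accepted duration units (`s`, `m`, `h`) are assumptions made for this example, not part of the skill itself.

```python
import re

# Defaults copied from the Constants table.
DEFAULTS = {"timeout": "2h", "max_iterations": 50, "patience": 10, "objective": "minimize"}

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Convert a duration such as '4h', '90m', or '30s' to seconds."""
    match = re.fullmatch(r"(\d+)\s*([smh])", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def parse_overrides(arguments):
    """Return the constants with any 'key: value' overrides from the text applied."""
    constants = dict(DEFAULTS)
    for key in DEFAULTS:
        # Case-insensitive so that both 'timeout: 4h' and 'Timeout: 4h' are accepted.
        match = re.search(rf"\b{key}\s*:\s*([\w.]+)", arguments, flags=re.IGNORECASE)
        if match is None:
            continue
        value = match.group(1).lower()
        constants[key] = int(value) if key in ("max_iterations", "patience") else value
    # Hypothetical convenience field: the timeout expressed in seconds.
    constants["timeout_seconds"] = parse_duration(constants["timeout"])
    return constants
```

Keys that do not appear in the argument string keep their default values, so a task description with no overrides still yields a complete set of constants.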

Typical Use Cases

| Problem | Program | Parameters | Objective |
|---------|---------|------------|-----------|
| Microarch DSE | gem5 simulation | cache size, assoc, pipeline width, ROB size, branch predictor | maximize IPC or minimize area×delay |
| Synthesis tuning | yosys/DC script | optimization passes, target freq, effort level | minimize area at timing closure |
| RTL parameterization | verilator sim | data width, FIFO depth, pipeline stages, buffer sizes | meet throughput target at min area |
| Compiler flags | gcc/llvm build + benchmark | -O levels, unroll factor, vectorization, scheduling | minimize runtime or code size |
| Placement/routing | openroad/innovus | utilization, aspect ratio, layer config | minimize wirelength / timing |
| Formal verification | abc/sby | bound depth, engine, timeout per property | maximize coverage in time budget |
| Memory subsystem | cacti / ramulator | bank count, row buffer policy, scheduling | optimize bandwidth/energy |

Workflow

Phase 0: Parse Task & Setup

  1. Parse $ARGUMENTS to extract:
    • Program: what to run (command, script, or Makefile target)
    • Parameter space: which knobs to tune and their ranges/options (may be incomplete — see step 2)
    • Objective metric: what to optimize (and how to extract it from output)
    • Constraints: hard limits that must not be violated (e.g., timing must close)
    • Timeout: wall-clock budget
    • Success criteria: when is the result "good enough" to stop early?
  2. Infer missing parameter ranges — If the user provides parameter names but NOT ranges/options, you MUST infer them before exploring:
    a. Read the source code — search for the parameter names in the codebase:
    • Look for argparse/click definitions, config files, Makefile variables, module parameters, `#define`, `parameter` (SystemVerilog), `localparam`, etc.
    • Extract defaults, types, and any comments hinting at valid values
    b. Apply domain knowledge to set reasonable ranges:
    | Parameter type | Inference strategy |
    |----------------|--------------------|
    | Cache/memory sizes | Powers of 2, typically 1KB–16MB |
    | Associativity | Powers of 2: 1, 2, 4, 8, 16 |
    | Pipeline width / issue width | Small integers: 1, 2, 4, 8 |
    | Buffer/queue/FIFO depth | Powers of 2: 4, 8, 16, 32, 64 |
    | Clock period / frequency | Based on technology node; try ±50% from default |
    | Bound depth (BMC/formal) | Geometric: 5, 10, 20, 50, 100 |
    | Timeout values | Geometric: 10s, 30s, 60s, 120s, 300s |
    | Boolean/enum flags | Enumerate all options found in source |
    | Continuous (learning rate, threshold) | Log-scale sweep: 5 points spanning 2 orders of magnitude around default |
    | Integer counts (threads, cores) | Linear: from 1 to hardware max |
    c. Start conservative — begin with 3-5 values per parameter. Expand range later if the best result is at a boundary.
    d. Log inferred ranges: write the inferred parameter space to `dse_results/inferred_params.md` so the user can review:
    ```markdown
    # Inferred Parameter Space

    | Parameter | Source | Default | Inferred Range | Reasoning |
    |-----------|--------|---------|---------------|-----------|
    | CACHE_SIZE | config.py:42 | 32768 | [8192, 16384, 32768, 65536, 131072] | powers of 2, from default ÷4 to default ×4 |
    | ASSOC | config.py:43 | 4 | [1, 2, 4, 8] | standard associativities |
    | BMC_DEPTH | run_bmc.py:15 | 10 | [5, 10, 20, 50] | geometric, common BMC depths |
    ```
    e. Boundary expansion — during the search, if the best result is at the min or max of a range, automatically extend that range by one step in that direction (but log the extension).
  3. Read the project to understand:
    • How to run the program
    • Where results are produced (stdout, log files, reports)
    • How to parse the objective metric from output
    • Current/baseline configuration (if any)
  4. Create working directory: `dse_results/` in project root
    • `dse_results/dse_log.csv`: one row per design point
    • `dse_results/DSE_REPORT.md`: final report
    • `dse_results/DSE_STATE.json`: state for recovery
    • `dse_results/inferred_params.md`: inferred parameter space (if ranges were not provided)
    • `dse_results/configs/`: config files for each run
    • `dse_results/outputs/`: raw output for each run
  5. Write a metric extraction script (`dse_results/parse_result.py` or similar) that takes a run's output and returns the objective metric as a number. Test it on a baseline run first.
  6. Run baseline (iteration 0): run the program with default/current parameters. Record the baseline metric. This is the point to beat.
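
The range inference and boundary expansion in step 2 can be sketched for the power-of-two case as follows. This is a minimal illustration only: the helper names `powers_of_two_around` and `expand_at_boundary` are hypothetical, and the choice of two steps on each side of the default is an assumption that happens to reproduce the `CACHE_SIZE` row of the example table.

```python
def powers_of_two_around(default, steps=2):
    """Power-of-two candidates from default / 2**steps up to default * 2**steps."""
    if default <= 0 or default & (default - 1):
        raise ValueError("default must be a positive power of two")
    candidates = []
    for k in range(-steps, steps + 1):
        value = default * 2**k if k >= 0 else default // 2**(-k)
        if value >= 1:  # drop candidates that would fall below 1
            candidates.append(value)
    return candidates


def expand_at_boundary(values, best):
    """Extend a sorted power-of-two range by one step if `best` sits at either end."""
    if best == values[-1]:
        return values + [values[-1] * 2]
    if best == values[0] and values[0] > 1:
        return [values[0] // 2] + values
    return values


# A 32 KiB default yields the five-point range used in the example table.
print(powers_of_two_around(32768))
# The best result sits at the upper bound, so the range grows by one step.
print(expand_at_boundary([1, 2, 4, 8], best=8))
```

Geometric or linear parameter types would need their own step rule; only the doubling rule is shown here.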

Phase 1: Initial Exploration

Goal: Quickly survey the space to understand which parameters matter most.
Strategy: Latin Hypercube Sampling or structured sweep of key parameters.
  1. Pick 5-10 diverse design points that span the parameter ranges
  2. Run them in parallel via background processes if they are independent, otherwise sequentially
  3. Record all results in `dse_log.csv`:
    ```csv
    iteration,param1,param2,...,metric,constraint_met,timestamp,notes
    0,default,default,...,baseline_val,yes,2026-03-13T10:00:00,baseline
    1,val1a,val2a,...,result1,yes,2026-03-13T10:05:00,initial sweep
    ...
    ```
  4. Analyze: which parameters have the most impact on the objective?
  5. Narrow the search to the most sensitive parameters
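
For discrete option lists, a Latin-hypercube-style initial sample could look like the sketch below. It uses only the standard library; the function name `latin_hypercube` and the example parameter names are illustrative assumptions. Each parameter's option list is split into `n_points` strata, one option is drawn per stratum, and the per-parameter columns are shuffled independently so that parameters are not correlated.

```python
import random


def latin_hypercube(space, n_points, seed=0):
    """Draw n_points configurations so each parameter's options are covered evenly."""
    rng = random.Random(seed)
    columns = {}
    for name, values in space.items():
        picks = []
        for i in range(n_points):
            # Stratum i covers the slice [i/n, (i+1)/n) of this parameter's option list.
            index = int((i + rng.random()) * len(values) / n_points)
            picks.append(values[min(index, len(values) - 1)])
        rng.shuffle(picks)  # decorrelate this parameter from the others
        columns[name] = picks
    return [{name: columns[name][i] for name in space} for i in range(n_points)]


# Hypothetical two-parameter space for illustration.
space = {"L1D_SIZE": [8192, 16384, 32768, 65536], "ASSOC": [1, 2]}
for point in latin_hypercube(space, n_points=8):
    print(point)
```

The sample can contain repeated configurations when the space is small, so it still makes sense to filter the points against `dse_log.csv` before running them.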

Phase 2: Directed Search

Goal: Converge toward the optimum by making informed choices.
Strategy: Adaptive — pick the approach that fits the problem:
  • Few parameters (≤3): Fine-grained grid search around the best region from Phase 1
  • Many parameters (>3): Coordinate descent — optimize one parameter at a time, holding others at current best
  • Binary/categorical params: Enumerate promising combinations
  • Continuous params: Binary search or golden section between best neighbors
  • Multi-objective: Track Pareto frontier, explore along the front
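
For the multi-objective case, tracking the Pareto frontier amounts to keeping the non-dominated points. The sketch below assumes every objective is to be minimized (a maximized objective can be negated first); the function name `pareto_front` and the sample numbers are illustrative only.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is minimized."""

    def dominates(a, b):
        # a dominates b if it is no worse in every objective and better in at least one.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (area, delay) pairs; (3.0, 4.0) is dominated by (2.0, 3.0).
results = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
print(pareto_front(results))
```

This quadratic scan is adequate for the few dozen points a run capped at `MAX_ITERATIONS` produces.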
For each iteration:
  1. Select next design point based on results so far:
    • Look at the trend: which direction improves the metric?
    • Avoid re-running configurations already evaluated
    • Balance exploration (untested regions) vs exploitation (near current best)
  2. Modify parameters: edit config file, command-line args, or source constants
  3. Run the program: execute and capture output
  4. Parse results: extract the objective metric and check constraints
  5. Log to `dse_log.csv`: append the new row
  6. Check stopping conditions:
    • Timeout reached? → stop
    • Max iterations reached? → stop
    • Patience exhausted (no improvement in N iterations)? → stop
    • Success criteria met (metric is "good enough")? → stop
    • Constraint violation pattern detected? → adjust search bounds
  7. Update `DSE_STATE.json`:
    ```json
    {
      "iteration": 15,
      "status": "in_progress",
      "best_metric": 1.23,
      "best_params": {"cache_size": 32768, "assoc": 4, "pipeline_width": 2},
      "total_iterations": 15,
      "start_time": "2026-03-13T10:00:00",
      "timeout": "2h",
      "patience_counter": 3
    }
    ```
  8. Decide next step → back to step 1
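
The bookkeeping behind the stopping conditions can be sketched as two small functions operating on a state dictionary with the same keys as `DSE_STATE.json`. The function names, the `feasible` flag, and the `elapsed_s` argument are assumptions for this example; the success-criteria check is omitted because it depends on the task.

```python
def update_state(state, metric, feasible=True, objective="minimize"):
    """Record one result: reset patience on a feasible improvement, else increment it."""
    best = state.get("best_metric")
    if objective == "minimize":
        better = best is None or metric < best
    else:
        better = best is None or metric > best
    if feasible and better:
        state["best_metric"] = metric
        state["patience_counter"] = 0
    else:
        # Constraint violations never count as improvements.
        state["patience_counter"] = state.get("patience_counter", 0) + 1
    state["iteration"] = state.get("iteration", 0) + 1
    return state


def stop_reason(state, elapsed_s, timeout_s, max_iterations, patience):
    """Return the first stopping condition that holds, or None to keep exploring."""
    if elapsed_s >= timeout_s:
        return "timeout"
    if state["iteration"] >= max_iterations:
        return "max_iterations"
    if state["patience_counter"] >= patience:
        return "patience"
    return None
```

Calling `stop_reason` before launching each run, rather than after it, keeps a long simulation from starting when the budget is already spent.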

Phase 3: Refinement (if time allows)

If the search converged and there's still time budget:
  1. Local perturbation: try ±1 step on each parameter from the best point
  2. Sensitivity analysis: which parameters can be relaxed without hurting the metric?
  3. Constraint boundary: if a constraint is nearly binding, explore near-feasible points
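
The local perturbation step can be illustrated by enumerating every configuration that differs from the best point by one step in exactly one parameter. The function name `neighbors` and the parameter values are hypothetical, and the sketch assumes each parameter's options are stored as an ordered list that contains the best value.

```python
def neighbors(best, space):
    """Configurations that differ from `best` by one step in exactly one parameter."""
    result = []
    for name, values in space.items():
        index = values.index(best[name])
        for step in (-1, 1):
            j = index + step
            if 0 <= j < len(values):  # skip steps that would leave the range
                result.append({**best, name: values[j]})
    return result


space = {"FIFO_DEPTH": [4, 8, 16, 32], "ISSUE_WIDTH": [1, 2, 4]}
best = {"FIFO_DEPTH": 8, "ISSUE_WIDTH": 1}
for candidate in neighbors(best, space):
    print(candidate)
```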

Phase 4: Report

Write `dse_results/DSE_REPORT.md`:

Design Space Exploration Report

Task: [description]
Date: [start] → [end]
Total iterations: N
Wall-clock time: X hours Y minutes

Objective

  • Metric: [what was optimized]
  • Direction: minimize / maximize
  • Baseline: [value]
  • Best found: [value] ([improvement]% better than baseline)

Best Configuration

| Parameter | Baseline | Best |
|-----------|----------|------|
| param1 | default | best_val |
| param2 | default | best_val |
| ... | ... | ... |

Search Trajectory

| Iteration | param1 | param2 | ... | Metric | Notes |
|-----------|--------|--------|-----|--------|-------|
| 0 (baseline) | ... | ... | ... | ... | baseline |
| 1 | ... | ... | ... | ... | initial sweep |
| ... | ... | ... | ... | ... | ... |
| N (best) | ... | ... | ... | ... | ★ best |

Parameter Sensitivity

  • param1: [high/medium/low impact] — [brief explanation]
  • param2: [high/medium/low impact] — [brief explanation]

Pareto Frontier (if multi-objective)

[Table or description of non-dominated points]

Stopping Reason

[timeout / max_iterations / patience / success_criteria_met]

Recommendations

  • [actionable insights from the exploration]
  • [which parameters matter most]
  • [suggested follow-up explorations]

Also generate a summary plot if matplotlib is available:
- Convergence curve (metric vs iteration)
- Parameter sensitivity bar chart
- Pareto frontier scatter (if multi-objective)
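
One possible shape for the convergence curve is sketched below. The helper names `best_so_far` and `plot_convergence` are assumptions; the plot is skipped (returning `False`) when matplotlib cannot be imported, matching the "if matplotlib is available" condition.

```python
def best_so_far(metrics, objective="minimize"):
    """Running best of the metric, one value per iteration."""
    pick = min if objective == "minimize" else max
    curve = []
    for value in metrics:
        curve.append(value if not curve else pick(curve[-1], value))
    return curve


def plot_convergence(metrics, path, objective="minimize"):
    """Save a metric-vs-iteration plot; return False if matplotlib is unavailable."""
    try:
        import matplotlib
        matplotlib.use("Agg")  # render to a file without needing a display
        import matplotlib.pyplot as plt
    except ImportError:
        return False
    iterations = range(len(metrics))
    fig, ax = plt.subplots()
    ax.plot(iterations, metrics, "o", label="evaluated point")
    ax.step(iterations, best_so_far(metrics, objective), where="post", label="best so far")
    ax.set_xlabel("iteration")
    ax.set_ylabel("metric")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return True
```

The metric list would typically come from the `metric` column of `dse_log.csv`, restricted to rows where the constraints were met.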

State Recovery

If the context window compacts mid-run, the loop recovers from `DSE_STATE.json` + `dse_log.csv`:
  1. Read `DSE_STATE.json` for current iteration, best params, patience counter
  2. Read `dse_log.csv` for full history
  3. Resume from next iteration
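
A recovery routine along these lines might look like the following sketch. It assumes that `"iteration"` in the state file holds the last completed iteration and that the log has an `iteration` column; when the two files disagree, the log wins, since a run may have been logged before the state file was updated. The function name `resume` is hypothetical.

```python
import csv
import json
from pathlib import Path


def resume(results_dir):
    """Load the saved state and history and work out which iteration to run next."""
    results = Path(results_dir)
    state = json.loads((results / "DSE_STATE.json").read_text())
    with open(results / "dse_log.csv", newline="") as handle:
        history = list(csv.DictReader(handle))
    # Highest iteration number that actually made it into the log.
    logged = max((int(row["iteration"]) for row in history), default=-1)
    next_iteration = max(state["iteration"], logged) + 1
    return state, history, next_iteration
```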

Key Rules

  • Work AUTONOMOUSLY: do not ask the user for permission at each iteration
  • Every run must be logged, including failed runs, constraint violations, and errors. The log is the ground truth.
  • Never re-run an identical configuration: check `dse_log.csv` before each run
  • Respect the timeout: check elapsed time before starting a new iteration. If the next run is likely to exceed the timeout, stop and report.
  • Parse metrics programmatically: write a parsing script, don't eyeball logs
  • Keep raw outputs: save each run's full output in `dse_results/outputs/iter_N/`
  • Constraint violations are not improvements: a design point that violates constraints is never "best", regardless of the metric
  • If a run crashes, log the error, skip that point, and continue with the next
  • If the same crash repeats 3 times with different configs, stop and report the issue
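
Two of these rules lend themselves to small helpers, sketched below under stated assumptions: the log has one column per parameter, and the program prints its metric in a form that a regular expression with a single capture group can match. The names `parse_metric` and `already_evaluated`, and the `IPC = ...` output format used in the usage lines, are hypothetical.

```python
import csv
import re


def parse_metric(output, pattern):
    """Return the last number captured by `pattern`, or None if no metric was printed."""
    matches = re.findall(pattern, output)
    return float(matches[-1]) if matches else None


def already_evaluated(log_path, config):
    """True if a row in the log matches every parameter value in `config`."""
    with open(log_path, newline="") as handle:
        for row in csv.DictReader(handle):
            # CSV values are strings, so compare against the string form of each value.
            if all(row.get(name) == str(value) for name, value in config.items()):
                return True
    return False


# Hypothetical simulator output; the last reported value is taken as the metric.
print(parse_metric("warmup IPC = 0.90\nfinal IPC = 1.23\n", r"IPC\s*=\s*([0-9.]+)"))
# A crashed run yields None, which should be logged as a failure rather than a metric.
print(parse_metric("Segmentation fault", r"IPC\s*=\s*([0-9.]+)"))
```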

Example Invocations


Minimal — just name the parameters, let the agent figure out ranges

/dse-loop "Run gem5 mcf benchmark. Tune: L1D_SIZE, L2_SIZE, ROB_ENTRIES. Objective: maximize IPC. Timeout: 3h"

Partial — some ranges given, some not

/dse-loop "Run make synth. Tune: CLOCK_PERIOD [5ns, 4ns, 3ns, 2ns], FLATTEN, ABC_SCRIPT. Objective: minimize area at timing closure. Timeout: 1h"

Fully specified — explicit ranges for everything

/dse-loop "Simulate processor with FIFO_DEPTH [4,8,16,32], ISSUE_WIDTH [1,2,4], PREFETCH [on,off]. Run: make sim. Objective: max throughput/area. Timeout: 2h"

Real-world: PDAG-SFA formal verification tuning

/dse-loop "Run python run_bmc.py. Tune: BMC_DEPTH, ENGINE, TIMEOUT_PER_PROP. Objective: maximize properties proved. Timeout: 2h"