dse-loop

DSE Loop: Autonomous Design Space Exploration

Autonomously explore a design space: run → analyze → pick next parameters → repeat, until the objective is met or timeout is reached. Designed for computer architecture and EDA problems.

Context: $ARGUMENTS

Safety Rules — READ FIRST

NEVER do any of the following:

- `sudo` anything
- `rm -rf`, `rm -r`, or any recursive deletion
- `rm` any file you did not create in this session
- Overwrite existing source files without reading them first
- `git push`, `git reset --hard`, or any destructive git operation
- Kill processes you did not start

If a step requires any of the above, STOP and report to the user.

Constants (override via $ARGUMENTS)

| Constant | Default | Description |
|----------|---------|-------------|
| `TIMEOUT` | 2h | Total wall-clock budget. Stop exploring after this. |
| `MAX_ITERATIONS` | 50 | Hard cap on number of design points evaluated. |
| `PATIENCE` | 10 | Stop early if no improvement for this many consecutive iterations. |
| `OBJECTIVE` | minimize | `minimize` or `maximize` the target metric. |

Override inline:

/dse-loop "task desc — timeout: 4h, max_iterations: 100, patience: 15"
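
The inline overrides could be parsed along these lines. This is a minimal sketch, not the skill's actual parser: the names `parse_overrides` and `duration_to_seconds`, the extra `timeout_seconds` key, and the assumption that overrides appear as `key: value` pairs with `s`/`m`/`h` duration suffixes are all illustrative.

```python
import re

# Defaults taken from the constants table above.
DEFAULTS = {"timeout": "2h", "max_iterations": 50, "patience": 10, "objective": "minimize"}
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def duration_to_seconds(text):
    """Convert a duration such as '4h', '90m' or '30s' to seconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([smh])\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def parse_overrides(arguments):
    """Return the constants with any 'key: value' overrides applied."""
    config = dict(DEFAULTS)
    for key, value in re.findall(r"(\w+)\s*:\s*([\w.]+)", arguments):
        key = key.lower()
        if key not in config:
            continue  # not a known constant; left for the task parser
        if key in ("max_iterations", "patience"):
            config[key] = int(value)
        else:
            config[key] = value.lower()
    config["timeout_seconds"] = duration_to_seconds(config["timeout"])
    return config
```

Unknown keys are ignored rather than rejected, so the free-text task description can contain colons without breaking the parse.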

Typical Use Cases

| Problem | Program | Parameters | Objective |
|---------|---------|------------|-----------|
| Microarch DSE | gem5 simulation | cache size, assoc, pipeline width, ROB size, branch predictor | maximize IPC or minimize area×delay |
| Synthesis tuning | yosys/DC script | optimization passes, target freq, effort level | minimize area at timing closure |
| RTL parameterization | verilator sim | data width, FIFO depth, pipeline stages, buffer sizes | meet throughput target at min area |
| Compiler flags | gcc/llvm build + benchmark | -O levels, unroll factor, vectorization, scheduling | minimize runtime or code size |
| Placement/routing | openroad/innovus | utilization, aspect ratio, layer config | minimize wirelength / timing |
| Formal verification | abc/sby | bound depth, engine, timeout per property | maximize coverage in time budget |
| Memory subsystem | cacti / ramulator | bank count, row buffer policy, scheduling | optimize bandwidth/energy |

Workflow

Phase 0: Parse Task & Setup

Parse $ARGUMENTS to extract:
- Program: what to run (command, script, or Makefile target)
- Parameter space: which knobs to tune and their ranges/options (may be incomplete; see "Infer missing parameter ranges" below)
- Objective metric: what to optimize (and how to extract it from output)
- Constraints: hard limits that must not be violated (e.g., timing must close)
- Timeout: wall-clock budget
- Success criteria: when is the result "good enough" to stop early?

Infer missing parameter ranges — If the user provides parameter names but NOT ranges/options, you MUST infer them before exploring:

a. Read the source code — search for the parameter names in the codebase:

Look for argparse/click definitions, config files, Makefile variables, module parameters, `#define`, `parameter` (SystemVerilog), `localparam`, etc.
Extract defaults, types, and any comments hinting at valid values

b. Apply domain knowledge to set reasonable ranges:

| Parameter type | Inference strategy |
|----------------|--------------------|
| Cache/memory sizes | Powers of 2, typically 1KB–16MB |
| Associativity | Powers of 2: 1, 2, 4, 8, 16 |
| Pipeline width / issue width | Small integers: 1, 2, 4, 8 |
| Buffer/queue/FIFO depth | Powers of 2: 4, 8, 16, 32, 64 |
| Clock period / frequency | Based on technology node; try ±50% from default |
| Bound depth (BMC/formal) | Geometric: 5, 10, 20, 50, 100 |
| Timeout values | Geometric: 10s, 30s, 60s, 120s, 300s |
| Boolean/enum flags | Enumerate all options found in source |
| Continuous (learning rate, threshold) | Log-scale sweep: 5 points spanning 2 orders of magnitude around default |
| Integer counts (threads, cores) | Linear: from 1 to hardware max |

c. Start conservative — begin with 3-5 values per parameter. Expand range later if the best result is at a boundary.

d. Log inferred ranges: write the inferred parameter space to `dse_results/inferred_params.md` so the user can review:

# Inferred Parameter Space

| Parameter | Source | Default | Inferred Range | Reasoning |
|-----------|--------|---------|---------------|-----------|
| CACHE_SIZE | config.py:42 | 32768 | [8192, 16384, 32768, 65536, 131072] | powers of 2, two doublings either side of default |
| ASSOC | config.py:43 | 4 | [1, 2, 4, 8] | standard associativities |
| BMC_DEPTH | run_bmc.py:15 | 10 | [5, 10, 20, 50] | geometric, common BMC depths |

e. Boundary expansion — during the search, if the best result is at the min or max of a range, automatically extend that range by one step in that direction (but log the extension).
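
The range-inference and boundary-expansion steps above can be sketched as follows. The helper names are hypothetical. Treating "one step" as repeating the ratio between the two outermost values is an assumption that suits geometric ranges; a linear range would repeat the difference instead.

```python
import math


def powers_of_two_around(default, steps=2):
    """Powers of two from `steps` halvings to `steps` doublings of default."""
    exponent = max(default.bit_length() - 1, 0)  # floor(log2(default)) for default >= 1
    low = max(exponent - steps, 0)
    return [2 ** e for e in range(low, exponent + steps + 1)]


def log_sweep(default, points=5, decades=2.0):
    """`points` values evenly spaced in log10, spanning `decades` centred on default."""
    if points < 2:
        return [default]
    low = math.log10(default) - decades / 2
    step = decades / (points - 1)
    return [10 ** (low + i * step) for i in range(points)]


def extend_if_at_boundary(values, best):
    """Add one geometric step beyond the range if `best` sits on its min or max."""
    values = sorted(values)
    if len(values) < 2:
        return values
    if best == values[-1]:
        new = values[-1] * values[-1] // values[-2]
        return values + [new] if new not in values else values
    if best == values[0]:
        new = values[0] * values[0] // values[1]
        # Integer ranges cannot shrink below 1, so drop a zero candidate.
        return [new] + values if new > 0 and new not in values else values
    return values
```

With the defaults, `powers_of_two_around(32768)` reproduces the `CACHE_SIZE` row of the example table. Each extension should still be written to the log, as the step above requires.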

Read the project to understand:
- How to run the program
- Where results are produced (stdout, log files, reports)
- How to parse the objective metric from output
- Current/baseline configuration (if any)
Create working directory: `dse_results/` in project root
- `dse_results/dse_log.csv`: one row per design point
- `dse_results/DSE_REPORT.md`: final report
- `dse_results/DSE_STATE.json`: state for recovery
- `dse_results/inferred_params.md`: inferred parameter space (if ranges were not provided)
- `dse_results/configs/`: config files for each run
- `dse_results/outputs/`: raw output for each run
Write a metric extraction script (`dse_results/parse_result.py` or similar) that takes a run's output and returns the objective metric as a number. Test it on a baseline run first.
Run baseline (iteration 0): run the program with default/current parameters. Record the baseline metric. This is the point to beat.
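
A metric extraction script might look roughly like this. The `IPC = <number>` output format is purely an assumption for illustration; the pattern has to be adapted to whatever the real program prints.

```python
import re

# Hypothetical pattern: assumes the program prints lines such as "IPC = 1.23".
METRIC_PATTERN = re.compile(
    r"^\s*IPC\s*[=:]\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)",
    re.MULTILINE,
)


def parse_metric(text):
    """Return the last reported metric as a float, or None if none is found."""
    matches = METRIC_PATTERN.findall(text)
    return float(matches[-1]) if matches else None
```

Returning `None` rather than raising lets the loop log a failed run and move on. Taking the last match is a guess that later lines supersede warm-up figures.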

Phase 1: Initial Exploration

Goal: Quickly survey the space to understand which parameters matter most.

Strategy: Latin Hypercube Sampling or structured sweep of key parameters.

Pick 5-10 diverse design points that span the parameter ranges
Run them (in parallel via background processes if they are independent, otherwise sequentially)

Record all results in `dse_log.csv`:

iteration,param1,param2,...,metric,constraint_met,timestamp,notes
0,default,default,...,baseline_val,yes,2026-03-13T10:00:00,baseline
1,val1a,val2a,...,result1,yes,2026-03-13T10:05:00,initial sweep
...

Analyze: which parameters have the most impact on the objective?
Narrow the search to the most sensitive parameters
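
For discrete option lists, the Latin Hypercube strategy can be approximated with the standard library alone. In this sketch each parameter's option list is cut into `n_points` strata and each stratum is used exactly once per parameter. The function name and the fixed seed are illustrative choices.

```python
import random


def latin_hypercube(space, n_points, seed=0):
    """Draw n_points configurations from {name: [options]} with stratified coverage."""
    rng = random.Random(seed)
    columns = {}
    for name, options in space.items():
        # Stratum s covers the slice [s/n, (s+1)/n) of the option list.
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns[name] = [options[(s * len(options)) // n_points] for s in strata]
    return [{name: columns[name][i] for name in space} for i in range(n_points)]
```

When `n_points` is at least the length of an option list, every option of that parameter appears at least once. Two drawn points can still coincide, so they should be deduplicated against the log before running.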

Phase 2: Directed Search

Goal: Converge toward the optimum by making informed choices.

Strategy: Adaptive — pick the approach that fits the problem:

Few parameters (≤3): Fine-grained grid search around the best region from Phase 1
Many parameters (>3): Coordinate descent — optimize one parameter at a time, holding others at current best
Binary/categorical params: Enumerate promising combinations
Continuous params: Binary search or golden section between best neighbors
Multi-objective: Track Pareto frontier, explore along the front
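
As one concrete instance of the strategies listed above, here is a sketch of coordinate descent over discrete option lists. `evaluate` is a hypothetical callable that runs one design point and returns its metric. The memoization cache is an added detail, so that no configuration is evaluated twice.

```python
def coordinate_descent(space, start, evaluate, objective="minimize", max_sweeps=3):
    """Optimize one parameter at a time, holding the others at the current best."""
    sign = 1 if objective == "minimize" else -1
    cache = {}

    def score(config):
        key = tuple(sorted(config.items()))
        if key not in cache:
            cache[key] = evaluate(config)  # each distinct config runs once
        return cache[key]

    best = dict(start)
    for _ in range(max_sweeps):
        improved = False
        for name, options in space.items():
            for option in options:
                candidate = {**best, name: option}
                if sign * score(candidate) < sign * score(best):
                    best, improved = candidate, True
        if not improved:
            break  # a full sweep without improvement: converged
    return best, score(best), len(cache)
```

The third return value is the number of distinct evaluations, which is the count that goes against `MAX_ITERATIONS`. Coordinate descent can stall when parameters interact strongly, which is one reason the local perturbation pass in Phase 3 exists.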

For each iteration:

Select next design point based on results so far:
- Look at the trend: which direction improves the metric?
- Avoid re-running configurations already evaluated
- Balance exploration (untested regions) vs exploitation (near current best)
Modify parameters: edit config file, command-line args, or source constants
Run the program: execute and capture output
Parse results: extract the objective metric and check constraints
Log to `dse_log.csv`: append the new row
Check stopping conditions:
- Timeout reached? → stop
- Max iterations reached? → stop
- Patience exhausted (no improvement in N iterations)? → stop
- Success criteria met (metric is "good enough")? → stop
- Constraint violation pattern detected? → adjust search bounds

Update `DSE_STATE.json`:

{
  "iteration": 15,
  "status": "in_progress",
  "best_metric": 1.23,
  "best_params": {"cache_size": 32768, "assoc": 4, "pipeline_width": 2},
  "total_iterations": 15,
  "start_time": "2026-03-13T10:00:00",
  "timeout": "2h",
  "patience_counter": 3
}

Decide next step → back to "Select next design point"
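
The bookkeeping in this loop can be sketched as two small functions over the state dictionary shown above. The field names follow the `DSE_STATE.json` example. The function names, the `target` argument for the success criterion, and passing `timeout` as a `timedelta` are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def record_result(state, metric, params, constraint_met, objective="minimize"):
    """Advance the iteration count and update best/patience. Returns True on improvement."""
    state["iteration"] += 1
    best = state.get("best_metric")
    better = bool(constraint_met) and metric is not None and (
        best is None or (metric < best if objective == "minimize" else metric > best)
    )
    if better:
        state["best_metric"] = metric
        state["best_params"] = params
        state["patience_counter"] = 0
    else:
        state["patience_counter"] += 1  # failed or infeasible runs also use up patience
    return better


def stop_reason(state, timeout, max_iterations, patience,
                target=None, objective="minimize", now=None):
    """Return the first stopping condition that holds, or None to keep going."""
    now = now or datetime.now()
    if now - datetime.fromisoformat(state["start_time"]) >= timeout:
        return "timeout"
    if state["iteration"] >= max_iterations:
        return "max_iterations"
    if state["patience_counter"] >= patience:
        return "patience"
    best = state.get("best_metric")
    if target is not None and best is not None:
        if (best <= target) if objective == "minimize" else (best >= target):
            return "success_criteria_met"
    return None
```

The returned strings match the stopping reasons used in the final report, so the value can be written there directly.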

Phase 3: Refinement (if time allows)

If the search converged and there's still time budget:

Local perturbation: try ±1 step on each parameter from the best point
Sensitivity analysis: which parameters can be relaxed without hurting the metric?
Constraint boundary: if a constraint is nearly binding, explore near-feasible points
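
The local perturbation step amounts to enumerating the one-step neighbours of the best point. A minimal sketch, assuming each parameter's options are kept as an ordered list and that the best value is one of them (the function name is hypothetical):

```python
def one_step_neighbours(space, best):
    """Configs that differ from `best` by one step in exactly one parameter."""
    result = []
    for name, options in space.items():
        index = options.index(best[name])  # raises ValueError if best is off-grid
        for shifted in (index - 1, index + 1):
            if 0 <= shifted < len(options):
                result.append({**best, name: options[shifted]})
    return result
```

Neighbours that already appear in `dse_log.csv` should be skipped before running, in line with the rule against re-running identical configurations.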

Phase 4: Report

Write `dse_results/DSE_REPORT.md` using the template below:

Design Space Exploration Report

Task: [description]
Date: [start] → [end]
Total iterations: N
Wall-clock time: X hours Y minutes

Objective

Metric: [what was optimized]
Direction: minimize / maximize
Baseline: [value]
Best found: [value] ([improvement]% better than baseline)

Best Configuration

| Parameter | Baseline | Best |
|-----------|----------|------|
| param1 | default | best_val |
| param2 | default | best_val |
| ... | ... | ... |

Search Trajectory

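| Iteration | param1 | param2 | ... | Metric | Notes |
|-----------|--------|--------|-----|--------|-------|
| 0 (baseline) | ... | ... | ... | ... | baseline |
| 1 | ... | ... | ... | ... | initial sweep |
| ... | ... | ... | ... | ... | ... |
| N (best) | ... | ... | ... | ... | ★ best |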

Parameter Sensitivity

param1: [high/medium/low impact] — [brief explanation]
param2: [high/medium/low impact] — [brief explanation]

Pareto Frontier (if multi-objective)

[Table or description of non-dominated points]

Stopping Reason

[timeout / max_iterations / patience / success_criteria_met]

Recommendations

[actionable insights from the exploration]
[which parameters matter most]
[suggested follow-up explorations]


Also generate a summary plot if matplotlib is available:
- Convergence curve (metric vs iteration)
- Parameter sensitivity bar chart
- Pareto frontier scatter (if multi-objective)
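
The convergence curve is the running best of the logged metrics. The sketch below computes it in pure Python and only attempts the plot when matplotlib can be imported. The function names and the choice of the non-interactive `Agg` backend are illustrative.

```python
def best_so_far(metrics, objective="minimize"):
    """Running best of a metric series; None entries (failed runs) are skipped."""
    pick = min if objective == "minimize" else max
    curve, best = [], None
    for value in metrics:
        if value is not None:
            best = value if best is None else pick(best, value)
        curve.append(best)
    return curve


def plot_convergence(metrics, path, objective="minimize"):
    """Save a convergence plot to `path`; return False if nothing could be plotted."""
    try:
        import matplotlib
        matplotlib.use("Agg")  # render to file, no display needed
        import matplotlib.pyplot as plt
    except ImportError:
        return False
    points = [(i, m) for i, m in enumerate(metrics) if m is not None]
    curve = [(i, b) for i, b in enumerate(best_so_far(metrics, objective)) if b is not None]
    if not points:
        return False
    fig, ax = plt.subplots()
    ax.plot(*zip(*points), "o", label="evaluated point")
    ax.step(*zip(*curve), where="post", label="best so far")
    ax.set_xlabel("iteration")
    ax.set_ylabel("metric")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return True
```

Only feasible runs should be fed in as numeric values. Passing `None` for constraint-violating or crashed runs keeps them out of the running best.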

State Recovery

If the context window compacts mid-run, the loop recovers from `DSE_STATE.json` and `dse_log.csv`:

- Read `DSE_STATE.json` for the current iteration, best parameters, and patience counter
- Read `dse_log.csv` for the full history
- Resume from the next iteration
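
A sketch of the save and resume halves of this recovery path. The function names are hypothetical. Writing through a temporary file and `os.replace` is an added precaution, so that an interrupted write cannot leave a truncated state file behind.

```python
import csv
import json
import os
import tempfile


def save_state(state, path):
    """Write the state atomically: temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_path, path)


def resume(state_path, log_path):
    """Return (state, history rows, next iteration number) from the saved files."""
    with open(state_path) as handle:
        state = json.load(handle)
    with open(log_path, newline="") as handle:
        history = list(csv.DictReader(handle))
    return state, history, state["iteration"] + 1
```

`csv.DictReader` returns every field as a string, so metrics read from the history need converting with `float` before they are compared.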

Key Rules

Work AUTONOMOUSLY — do not ask the user for permission at each iteration
Every run must be logged — even failed runs, constraint violations, errors. The log is the ground truth.
Never re-run an identical configuration: check `dse_log.csv` before each run
Respect the timeout — check elapsed time before starting a new iteration. If the next run is likely to exceed the timeout, stop and report.
Parse metrics programmatically — write a parsing script, don't eyeball logs
Keep raw outputs: save each run's full output in `dse_results/outputs/iter_N/`
Constraint violations are not improvements — a design point that violates constraints is never "best", regardless of the metric
If a run crashes, log the error, skip that point, and continue with the next
If the same crash repeats 3 times with different configs, stop and report the issue
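
The logging and no-repeat rules can be enforced with two small helpers around `dse_log.csv`. This is a sketch under assumptions: the helper names are hypothetical, parameter values are compared as the strings stored in the CSV, and a crashed run is logged with an empty `metric` field.

```python
import csv
import os


def append_row(log_path, fieldnames, row):
    """Append one run to the log, writing the header first if the file is new."""
    is_new = not os.path.exists(log_path)
    with open(log_path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerow(row)  # keys missing from `row` are written as empty fields


def already_evaluated(log_path, params):
    """True if the log already has a row with exactly these parameter values."""
    wanted = {name: str(value) for name, value in params.items()}
    try:
        with open(log_path, newline="") as handle:
            return any(
                all(row.get(name) == value for name, value in wanted.items())
                for row in csv.DictReader(handle)
            )
    except FileNotFoundError:
        return False  # nothing logged yet
```

Because crashed runs are logged too, `already_evaluated` also stops the loop from retrying a configuration that is known to fail.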

Example Invocations

Minimal — just name the parameters, let the agent figure out ranges

/dse-loop "Run gem5 mcf benchmark. Tune: L1D_SIZE, L2_SIZE, ROB_ENTRIES. Objective: maximize IPC. Timeout: 3h"

Partial — some ranges given, some not

/dse-loop "Run make synth. Tune: CLOCK_PERIOD [5ns, 4ns, 3ns, 2ns], FLATTEN, ABC_SCRIPT. Objective: minimize area at timing closure. Timeout: 1h"

Fully specified — explicit ranges for everything

/dse-loop "Simulate processor with FIFO_DEPTH [4,8,16,32], ISSUE_WIDTH [1,2,4], PREFETCH [on,off]. Run: make sim. Objective: max throughput/area. Timeout: 2h"

Real-world: PDAG-SFA formal verification tuning

/dse-loop "Run python run_bmc.py. Tune: BMC_DEPTH, ENGINE, TIMEOUT_PER_PROP. Objective: maximize properties proved. Timeout: 2h"
