organize-ml-workspace

Organize ML Workspace


Where things live, when to create a new file, what each file is allowed to contain.

Next-step pointers — where you go after this skill


| You came here for… | → next |
| --- | --- |
| Bootstrap a fresh workspace | `python-env-manager` § Bootstrap; then `iterate-ml-experiment` § 0 |
| First experiment script | `iterate-ml-experiment` § 0 for the design note |
| Add a new experiment iteration | `iterate-ml-experiment` § 1 (new vs edit decision) |
| Pipeline / evaluate / smoke-test content | `build-ml-pipeline` / `evaluate-ml-pipeline` / `smoke-test-ml-pipeline` |

Always re-emit the Pre-flight checklist with evidence before declaring the turn done.

Sibling skills — open just-in-time


Don't pre-read all nine at session start (paralysis). Open each sibling SKILL.md when a step calls for it (e.g. open `python-env-manager` before G-ENV-MGR; open `iterate-ml-experiment` before handing off the design-note write). Emit this tracker once per turn:
Sibling skills (just-in-time):
  - data-science-python-stack, python-env-manager, python-api,
    python-code-style, iterate-ml-experiment, build-ml-pipeline,
    evaluate-ml-pipeline, test-ml-pipeline, smoke-test-ml-pipeline

Stop conditions — read before anything else


  • Missing dependency. If `import skore` raises, STOP. Invoke `python-env-manager` for the install command. Do not drop `skore.Project` in favor of `mlflow` / pickles / "print metrics": the workspace contract assumes a Project on disk.
  • Symbol from memory is forbidden. Any `skore.Project` / `project.put` / `skore.evaluate` signature must come from a `python-api` call this turn.
  • Existing layout wins; detect first. Run the Detection table before scaffolding. Don't rename, relocate, or "tidy up" existing folders.
  • Notebooks are not silent. Existing `.ipynb` files in the experiment folder → surface the convention shift and ask. Don't auto-convert.
  • Scratch is read-only against the skore Project. Probes under `scratch/<ts>_<short>.py` may call `project.get(...)` and `project.summarize()`, and may walk an existing report. They MUST NOT call `skore.evaluate(...)` or `project.put(...)`. When `project.get(key)` raises `KeyError`, the fix is the lookup shape: `get` is by id, not by `key`. Use `summarize()` → `(key, id)` → `get(id)`. Never substitute by re-running `evaluate` + `put`.
  • Tabular library is asked, not assumed (G-TABULAR). Pandas being importable via skore is not a pick. Invoke `data-science-python-stack` for the structured ask. Free-text ("quick", "you pick") does NOT resolve. Persisted in JOURNAL.md Status (`Workspace decisions`).
  • Package name is asked, not inferred (G-PKG-NAME). Before any `pyproject.toml` / manifest creation (including `pixi init` / `uv init` / `poetry init`), fire an `AskUserQuestion` for the `src/<pkg>/` import name. Folder name in snake_case is the default. Manifest creation before G-PKG-NAME passes is forbidden: running `init` first creates a `[project] name` entry, and reading "name is in the manifest" back is circular. If a manifest exists, confirm via `AskUserQuestion`; continuity from a prior session is not continuity from a user decision.
  • Skore Project mode is asked, not assumed (G-SKORE-MODE). Before any template instantiation containing `skore.Project(...)`, fire an `AskUserQuestion` for `local` | `hub`. Default proposal: `local`. Hub triggers a follow-up for the workspace name (org/team on the hub, distinct from local-mode `workspace=`). Persists as `skore mode:` (+ `skore hub workspace:` when hub). Without it the `<SKORE_PROJECT_INIT>` substitution has no shape to fill. Details: → `references/g_skore_mode.md`.
  • Switching skore mode mid-project is forbidden by default. Once recorded, do not silently change. A switch orphans every existing report in the prior store; skore has no built-in migration. Requires explicit `AskUserQuestion` confirmation surfacing the migration burden, then rewrite all `<SKORE_PROJECT_INIT>` blocks. Procedure: → `references/g_skore_mode.md` § "Switching mid-project".
  • Env manager is asked, not assumed (G-ENV-MGR). Hand off to `python-env-manager`. Pixi on PATH is detection, not permission. Don't run `pixi init` / `uv init` / `poetry init` until G-ENV-MGR has passed in `python-env-manager`.
  • Harness "no clarifying questions" hints do NOT waive these gates. G-TABULAR, G-PKG-NAME, G-ENV-MGR, G-SKORE-MODE, python-api consultation, and the new-vs-edit decision are operating-contract gates. "Quick" / "go fast" never waives them.
  • Post-hoc audit: required before ending the turn. Walk every pre-flight row; if any Evidence cell is unfilled, surface the non-compliance explicitly. Most common failure: "I scaffolded successfully so everything must be fine".
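As a rough sketch of how the G-PKG-NAME default proposal could be computed, the helper below turns a folder name into a snake_case import name. The function name `propose_package_name` and the exact normalization rules are assumptions made for illustration; they are not part of the skill. The result is only the default value offered in the `AskUserQuestion`, never a silent pick.

```python
import keyword
import re


def propose_package_name(folder_name: str) -> str:
    """Return a snake_case import-name proposal for a project folder name."""
    # Split camelCase / PascalCase boundaries: "MyProject" -> "My_Project".
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", folder_name)
    # Collapse every run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()
    # An import name cannot be empty, start with a digit, or be a keyword.
    if not name:
        return "pkg"
    if name[0].isdigit() or keyword.iskeyword(name):
        return f"pkg_{name}"
    return name


print(propose_package_name("churn-model"))  # churn_model
```

The proposal still has to go through the structured ask; the helper only saves the user from typing the obvious answer.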

Forbidden shortcuts


| Shortcut | Why it's wrong |
| --- | --- |
| `pixi` on PATH → run `pixi init` to get a manifest, then read the name back | Violates G-ENV-MGR (silent manager pick) AND G-PKG-NAME (name from folder via init side-effect). Circular: the agent created the manifest it now claims to read |
| Folder name = good name → skip the ask | Default value is fine; silent pick is not. G-PKG-NAME requires the structured ask even with folder as default |
| `pandas` already importable via skore → write `import pandas` in `data.py` | Transitive presence is not a pick. Violates G-TABULAR |
| Scaffold every skeleton in one turn, incl. `experiments/01_baseline.py` body | Scaffold stops at empty `journal/` placeholder. Experiment script content lands after design-note approval (`iterate-ml-experiment` § 3) |
| Scaffold drops `audit/01_baseline.py` at workspace creation | Audit files placed by `audit-ml-pipeline` at § 4 record-outcome. Empty `audit/` at scaffold is correct |
| Forget `audit/` in the scaffold layout | Four-way stem pairing breaks |
| `pyproject.toml` exists with `name = <x>` → reuse without confirming | Always re-confirm via G-PKG-NAME |
| Batch G-TABULAR + G-PKG-NAME + G-ENV-MGR + G-SKORE-MODE into prose recommendations | The gates take structured `AskUserQuestion`. Prose followed by "let me know" does NOT resolve them |
| Skip G-SKORE-MODE because templates use `mode="local"` | Templates carry the `<SKORE_PROJECT_INIT>` marker, not a literal. The gate must fire |
| Pick `mode="hub"` without checking the workspace exists / user has access | Project init fails at first `put()` with an authorization error. Confirm during G-SKORE-MODE, not at execution time |
| Substitute `pip install "skore[hub]"` based on agent guess | Install variant comes from G-SKORE-MODE's recorded answer. `python-env-manager` reads that row, not agent intuition |
| Silently change `skore mode:` mid-project to "fix" a broken init | Switching orphans existing reports. Always explicit `AskUserQuestion` first |
| Hub substitution but leaving `workspace=` kwarg | `workspace=` is local-only; hub raises `TypeError`. Substitute the whole block, not just the mode literal |
| Local `workspace="reports"` (relative) instead of `str(PROJECT_ROOT / "reports")` (absolute) | Relative resolves against CWD; runs from other dirs write the store somewhere unexpected. Always absolute via `PROJECT_ROOT` |
| Putting `skore.login(mode="hub")` after `skore.Project(...)` | `Project(...)` requires authenticated session in hub mode. `login` first |
| Substituting `<SKORE_PROJECT_INIT>` in `audit/<stem>.py` independently of `experiments/<stem>.py` | Audit must open the same Project. Byte-identical copy from the experiment file is the rule |
| Hub workspace name contains `/` (e.g. `acme/datasci`) | The `/` is reserved as separator. Reject at G-SKORE-MODE follow-up |
| `project.get(key)` raised `KeyError` → re-run `evaluate` + `put` to "recover" | Lookup shape wrong (`get` is by id). Use `summarize()` → `get(id)` |
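The relative-versus-absolute `workspace=` row can be illustrated without touching skore at all. The sketch below only uses `pathlib`; `PROJECT_ROOT` here is a temporary-directory stand-in for the constant exposed by `src/<pkg>/__init__.py`, and `resolved_store` is a hypothetical helper that shows where a path string would point from the current working directory.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical stand-in for the constant exposed by src/<pkg>/__init__.py.
PROJECT_ROOT = Path(tempfile.mkdtemp()).resolve()


def resolved_store(workspace: str) -> Path:
    """Return the location a workspace string points to from the current CWD."""
    return Path(workspace).resolve()


relative = "reports"
absolute = str(PROJECT_ROOT / "reports")

original_cwd = Path.cwd()
seen_relative, seen_absolute = set(), set()
try:
    # Simulate running the same script from two different directories.
    for cwd in (PROJECT_ROOT, Path(tempfile.mkdtemp())):
        os.chdir(cwd)
        seen_relative.add(resolved_store(relative))
        seen_absolute.add(resolved_store(absolute))
finally:
    os.chdir(original_cwd)

# The relative form lands in two places; the absolute form in exactly one.
print(len(seen_relative), len(seen_absolute))  # 2 1
```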

Pre-flight — emit before any code


Each ticked box needs an Evidence line (format spec in `iterate-ml-experiment` § "Pre-flight evidence requirements"; see also `python-env-manager/references/preflight_evidence.md`).
Pre-flight (organize-ml-workspace):
- [ ] `Workspace decisions` in `journal/JOURNAL.md` Status checked
      for pre-recorded gates (tabular, env_manager, package, skore mode)
      Evidence: lists each <gate>: <value | not recorded>
                | "n/a — JOURNAL.md does not exist yet (truly fresh)"
- [ ] Tier 1 mandatory libs importable: sklearn, skrub, skore
      Evidence: Write scratch/<ts>_check_tier1.py + `pixi run python …` output.
                **Inline `python -c` is NOT evidence**.
- [ ] Layout detection done: <existing | fresh>
      Evidence: ls/Glob on project root + matched signal from Detection
- [ ] G-TABULAR resolved: pandas | polars
      Evidence: AskUserQuestion id=<id> via data-science-python-stack |
                JOURNAL.md Status (Workspace decisions) | user quote turn N
- [ ] G-ENV-MGR resolved
      Evidence: AskUserQuestion id=<id> via python-env-manager |
                JOURNAL.md Status (Workspace decisions)
- [ ] G-PKG-NAME resolved: <name>
      Evidence: AskUserQuestion id=<id>, answer=<name> |
                JOURNAL.md Status (Workspace decisions) |
                existing manifest's [project].name **confirmed via AskUserQuestion**
                (reading the manifest alone is NOT sufficient)
- [ ] G-SKORE-MODE resolved: local | hub
      Evidence: AskUserQuestion id=<id>, answer=<local|hub> |
                JOURNAL.md Status (Workspace decisions) `skore mode:` row
      If hub: also captures `skore hub workspace:` row.
- [ ] `pyproject.toml` present at root declaring `src/<pkg>/`;
      editable install wired via `python-env-manager` § Editable workspace
      Evidence: Read pyproject.toml (this turn) + manager's editable-install call
- [ ] python-api consulted for: Project, put, evaluate
      Evidence: Read scratch/api/skore/<v>/{project_local,evaluate}.md
                | Write of the same files (this turn)
                | "n/a — symbols already in workspace cache"
- [ ] Decision: new experiment file vs edit existing
      Evidence: AskUserQuestion id=<id> | user quote turn N |
                "n/a — first experiment in a fresh workspace"
- [ ] `journal/` scaffolded with empty placeholder JOURNAL.md
      Evidence: Write journal/JOURNAL.md (this turn) | "already exists"
- [ ] Pre-flight re-emitted with evidence before final message.
      Evidence: this checklist appears in the end-of-turn summary.
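The post-hoc audit (walk every row, flag any unfilled Evidence cell) could be mechanized along the following lines. This is a minimal sketch under stated assumptions: it assumes a filled-in checklist keeps the `- [x]` / `Evidence:` shape shown above, the function name `rows_missing_evidence` is made up, and the id `q3` in the sample is invented. The authoritative evidence format remains the one in `iterate-ml-experiment`.

```python
import re

ROW = re.compile(r"^- \[(?P<tick>[ xX])\] (?P<title>.+)$")
EVIDENCE = re.compile(r"^\s+Evidence:\s*(?P<text>.*)$")


def rows_missing_evidence(checklist: str) -> list[str]:
    """Return titles of rows that are unticked or whose Evidence line is empty."""
    rows: list[dict] = []
    for line in checklist.splitlines():
        if (m := ROW.match(line)):
            rows.append({"title": m["title"], "ticked": m["tick"] != " ", "evidence": ""})
        elif rows and (m := EVIDENCE.match(line)):
            # Attach the evidence text to the most recent checklist row.
            rows[-1]["evidence"] = m["text"].strip()
    return [r["title"] for r in rows if not (r["ticked"] and r["evidence"])]


sample = """\
- [x] G-TABULAR resolved: polars
      Evidence: AskUserQuestion id=q3 via data-science-python-stack
- [x] G-ENV-MGR resolved
      Evidence:
- [ ] G-PKG-NAME resolved: <name>
      Evidence: AskUserQuestion id=<id>, answer=<name>
"""
print(rows_missing_evidence(sample))
# ['G-ENV-MGR resolved', 'G-PKG-NAME resolved: <name>']
```

A non-empty result is exactly the non-compliance that has to be surfaced before the turn ends.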

Detection — existing workspace first


| Signal | Meaning |
| --- | --- |
| `pyproject.toml` with `[project] name` + `[tool.setuptools.packages.find]` (or poetry / hatch equivalents) | Package declared installable; use this name; verify editable install via `python-env-manager` |
| `pixi.toml` / `[tool.poetry]` / `[tool.uv]` with name but no `[project]` table | Manager knows the project but package isn't installable; flag, offer to add `pyproject.toml` |
| `src/<pkg>/__init__.py` or `<pkg>/__init__.py` at root | Package dir already chosen; keep it |
| `<pkg>.egg-info/` at root or under `src/` | Stale out-of-band `pip install -e .`; flag drift, offer to wire via manager |
| `experiments/`, `notebooks/`, `scripts/`, `analyses/` | Experiment location chosen; keep it |
| `audit/` with `# %%` files | Audit location chosen; keep it; body owned by `audit-ml-pipeline` |
| `journal/`, `plans/`, `proposals/` | Journal location chosen; keep it |
| `reports/`, `results/`, `runs/` | Report location chosen; keep it |
| `tests/` | Test location chosen; keep it |
| `mlflow.db` / `mlruns/` at root | Prior tracker artifacts; leave alone; skore is canonical |
| `.ipynb` files in experiment folder | User is on notebooks; surface the shift and ask; don't auto-switch |

Any signal present → glue to existing convention. No renames, no relocates. None present → fresh scaffold (below).
→ next: G-PKG-NAME, then `python-env-manager` for G-ENV-MGR.
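A read-only probe for part of the Detection table might look like the sketch below. It covers only the directory-name and root-file signals; the `.egg-info`, `.ipynb`, `src/<pkg>/__init__.py`, and `audit/` with `# %%` checks are deliberately left out to keep it short. The names `DIR_SIGNALS`, `detect_layout`, and `is_fresh` are hypothetical.

```python
import tempfile
from pathlib import Path

# Directory-name signals copied from the Detection table (subset).
DIR_SIGNALS = {
    "experiment": ("experiments", "notebooks", "scripts", "analyses"),
    "journal": ("journal", "plans", "proposals"),
    "report": ("reports", "results", "runs"),
    "test": ("tests",),
}
ROOT_FILES = ("pyproject.toml", "pixi.toml", "mlflow.db", "mlruns")


def detect_layout(root: Path) -> dict[str, list[str]]:
    """Map each role to the existing entries that signal it. Never writes."""
    found: dict[str, list[str]] = {}
    for role, names in DIR_SIGNALS.items():
        hits = [name for name in names if (root / name).is_dir()]
        if hits:
            found[role] = hits
    root_hits = [name for name in ROOT_FILES if (root / name).exists()]
    if root_hits:
        found["root_files"] = root_hits
    return found


def is_fresh(root: Path) -> bool:
    """True only when none of the probed signals is present."""
    return not detect_layout(root)


root = Path(tempfile.mkdtemp())
print(is_fresh(root))       # True
(root / "notebooks").mkdir()
print(detect_layout(root))  # {'experiment': ['notebooks']}
```

Any hit means the existing convention is kept as found; the probe reports, it does not rename or move anything.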

Default layout (fresh workspace)


project/
├── pyproject.toml          # declares src/<pkg>/ as installable
├── <manager manifest>      # pixi.toml / poetry / uv / hatch / environment.yml
├── src/<pkg>/
│   ├── __init__.py         # exposes PROJECT_ROOT
│   ├── data.py             # data loading, splits, split_kwargs
│   ├── features.py         # transformers, encoders, feature fns
│   ├── pipeline.py         # the learner declaration (skrub DataOps)
│   └── evaluate.py         # ONLY: CV strategy + optional metric overrides
├── journal/
│   ├── JOURNAL.md          # session-start log; index of experiments
│   └── 01_baseline.md      # one `.md` per planned experiment
├── experiments/
│   └── 01_baseline.py      # one `# %%` script per experiment
├── audit/
│   └── 01_baseline.py      # body owned by audit-ml-pipeline (read-only)
├── tests/
│   └── smoke/              # body owned by smoke-test-ml-pipeline
├── overview/
│   └── summary.md          # agent-authored narrative (iterate-ml-experiment § 4)
├── scratch/                # agent-only (gitignored entirely)
└── reports/                # skore Project lives here
The package is installable. `pyproject.toml` declares `src/<pkg>/`; the manager installs in editable mode so `from <pkg>.pipeline import build_learner` works from any CWD. Wiring per-manager: `python-env-manager` § Editable workspace.
Runtime deps (sklearn, skrub, skore, tabular) live in the manager's manifest, not in `[project.dependencies]`.
Deliberately absent: no `data/` (user-owned), no `models/` (out of scope). Add later only on user request; don't pre-empt.
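The directory part of the fresh layout can be sketched as an idempotent helper. This is an illustration only: `scaffold_directories` is a hypothetical name, the placeholder text written to `JOURNAL.md` is made up, and all template-based files (`pyproject.toml`, the `src/<pkg>/` skeletons, the experiment script) are out of scope here. Note that it creates neither `data/` nor `models/`.

```python
import tempfile
from pathlib import Path

# Directories from the default layout; template files are handled elsewhere.
LAYOUT_DIRS = ("journal", "experiments", "audit", "tests/smoke",
               "overview", "scratch", "reports")


def scaffold_directories(root: Path, pkg: str) -> list[Path]:
    """Create missing layout directories and the journal placeholder."""
    created: list[Path] = []
    for rel in (f"src/{pkg}", *LAYOUT_DIRS):
        path = root / rel
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    placeholder = root / "journal" / "JOURNAL.md"
    if not placeholder.exists():
        # One-line placeholder; the wording is an invented example.
        placeholder.write_text("# JOURNAL (placeholder)\n", encoding="utf-8")
        created.append(placeholder)
    return created


root = Path(tempfile.mkdtemp())
first = scaffold_directories(root, "mypkg")
second = scaffold_directories(root, "mypkg")  # nothing left to create
print(len(first), len(second))  # 9 0
```

Because existing paths are skipped, running it twice never overwrites anything, which matches the "existing layout wins" rule.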

File-creation rules


Design note first, then code


Before creating `experiments/NN_<short_name>.py`, the matching `journal/NN_<short_name>.md` must exist and have been validated by the user. Design-note content is owned by `iterate-ml-experiment`; this skill only enforces the pairing.

Four-way stem pairing


Every experiment is identified by `NN_<short_name>` in four places:
journal/NN_<short_name>.md            (design note)
experiments/NN_<short_name>.py        (script)
tests/smoke/test_NN_<short_name>.py   (smoke test)
audit/NN_<short_name>.py              (audit file — read-only)
By the time the experiment flips to `done` in JOURNAL.md AND its summary is refreshed in `overview/summary.md`, all four exist. The design note is written first; the script lands on approval; the smoke test body is filled by `smoke-test-ml-pipeline`; the audit file is placed and executed by `audit-ml-pipeline` at § 4 record-outcome.
The audit file is read-only against the workspace's skore Project and data; see `audit-ml-pipeline` § Read-only contract.
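The four-way pairing is easy to check mechanically. The sketch below builds the four paths for a stem and reports which ones are still missing; `paired_paths` and `missing_pairs` are hypothetical helper names, and the directory names assume the default layout.

```python
import tempfile
from pathlib import Path


def paired_paths(root: Path, stem: str) -> dict[str, Path]:
    """Return the four files that share one experiment stem."""
    return {
        "design_note": root / "journal" / f"{stem}.md",
        "script": root / "experiments" / f"{stem}.py",
        "smoke_test": root / "tests" / "smoke" / f"test_{stem}.py",
        "audit": root / "audit" / f"{stem}.py",
    }


def missing_pairs(root: Path, stem: str) -> list[str]:
    """List the roles whose file does not exist yet."""
    return [role for role, path in paired_paths(root, stem).items()
            if not path.is_file()]


# Demo in a throwaway directory: only the design note exists so far.
root = Path(tempfile.mkdtemp())
(root / "journal").mkdir()
(root / "journal" / "01_baseline.md").write_text("design note\n", encoding="utf-8")
print(missing_pairs(root, "01_baseline"))  # ['script', 'smoke_test', 'audit']
```

An experiment should only flip to `done` once `missing_pairs` returns an empty list.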

New experiment → new file. Iterating → ask first.


Default: new file. `02_text_encoder.py`, `03_grouped_cv.py`. The numeric prefix preserves iteration order under `ls`.
When the user says "let's tweak experiment 02", do not assume. Fire `AskUserQuestion`:
> Should this be a new experiment file (e.g. `04_text_encoder_v2.py`) or an in-place edit of `02_text_encoder.py`?

In-place edits overwrite the prior result in the skore Project if the same key is reused; flag this. In-place also requires revisiting the matching smoke test (→ `smoke-test-ml-pipeline`).
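When the answer is "new file", the next numeric prefix can be derived from what is already on disk. A minimal sketch, assuming two-digit prefixes and lowercase snake_case short names (both assumptions, as is the helper name `next_experiment_stem`):

```python
import re
import tempfile
from pathlib import Path

# Assumed stem shape: two digits, underscore, lowercase snake_case name.
STEM = re.compile(r"^(\d{2})_[a-z0-9_]+$")


def next_experiment_stem(experiments_dir: Path, short_name: str) -> str:
    """Return the next NN_<short_name> stem after the highest existing prefix."""
    numbers = [
        int(m.group(1))
        for p in experiments_dir.glob("*.py")
        if (m := STEM.match(p.stem))
    ]
    return f"{max(numbers, default=0) + 1:02d}_{short_name}"


exp_dir = Path(tempfile.mkdtemp())
for name in ("01_baseline.py", "02_text_encoder.py", "03_grouped_cv.py"):
    (exp_dir / name).touch()
print(next_experiment_stem(exp_dir, "text_encoder_v2"))  # 04_text_encoder_v2
```

The helper only proposes a name; whether to create a new file or edit in place is still the user's call.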

Decision flow (13 steps; full version in `references/scaffold_steps.md`)


| # | Step | Owner |
| --- | --- | --- |
| 1 | Read project root; Detection table matches → glue (stop). No match → continue | this skill |
| 2 | G-PKG-NAME structured ask. Record in `Workspace decisions`. No manager `init` until this passes | this skill |
| 2a | G-SKORE-MODE ask: `local` \| `hub` (+ hub workspace name if hub). Determines `<SKORE_PROJECT_INIT>` form + skore install variant. → `references/g_skore_mode.md` | |
| 3 | Drop `pyproject.toml` from `templates/pyproject.toml` (substitute `<pkg>`). Hand off to `python-env-manager` for editable install | this skill → env-manager |
| 4 | Create `src/<pkg>/` with skeletons from `templates/src_*.py` | this skill |
| 5 | Create `experiments/01_baseline.py` from `templates/experiment.py` (substitute `<pkg>`, `<SKORE_PROJECT_INIT>` per G-SKORE-MODE, `<project-name>`) | this skill |
| 6 | Create empty `tests/smoke/`. Verify pytest on manifest | this skill |
| 6a | Create empty `audit/` | this skill |
| 7 | Create `journal/JOURNAL.md` one-line placeholder; `iterate-ml-experiment` rewrites it | this skill |
| 8 | Create `overview/summary.md` from `templates/summary.md` | this skill |
| 9 | Create empty `scratch/` (no README; owned by `python-api`) | this skill |
| 10 | Create empty `reports/` | this skill |
| 11 | Touch `.gitignore`: drop template if none; else suggest patch (always ask about `reports/`) | this skill |
| 12 | Hand off to `python-code-style` § Initial setup for `ruff.toml` + first pass; invoking the skill teaches NumPyDoc | this skill → python-code-style |
| 13 | Hand back to the relevant sibling (`iterate-ml-experiment` for design note, etc.) | this skill → next caller |

→ next: `iterate-ml-experiment` § 0 (bootstrap) for the first design note.

Files in src/<pkg>/


Each has a narrow contract:
  • `__init__.py`: exposes `PROJECT_ROOT` (absolute, derived from `__file__`, not CWD). Modules needing project-relative paths import this constant. Requires editable install.
  • `data.py`: loaders, materialization of `X`, `y`, any `split_kwargs` (groups, time, …) at the X marker. Pipeline mechanics in `build-ml-pipeline`.
  • `features.py`: feature functions and transformers.
  • `pipeline.py`: the learner declaration (a `SkrubLearner`). `build_learner` exposes `data_dir_preview=None` so the experiment script can pass an absolute path from `PROJECT_ROOT`.
  • `evaluate.py`: only the inputs to `skore.evaluate`: the cross-validator (`splitter = ...`), optional metric overrides. Does NOT call `skore.evaluate`, does NOT open a Project, does NOT persist.
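One way `PROJECT_ROOT` could be derived from `__file__` is sketched below. The actual `templates/src___init__.py` is not shown in this document, so the helper `project_root_from` and the choice of `parents[2]` are assumptions that follow from the `src/<pkg>/__init__.py` layout.

```python
from pathlib import Path


def project_root_from(init_file: str) -> Path:
    """Walk up from src/<pkg>/__init__.py to the project root."""
    # parents[0] is src/<pkg>, parents[1] is src, parents[2] is the root.
    return Path(init_file).resolve().parents[2]


# Inside src/<pkg>/__init__.py this would read:
#     PROJECT_ROOT = project_root_from(__file__)
example = project_root_from("/work/proj/src/mypkg/__init__.py")
print(example.name)  # proj
```

This also shows why the editable install matters: with a regular install, `__file__` points into `site-packages`, and walking up two levels no longer reaches the project.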

Experiment scripts: experiments/NN_*.py


`# %%` cell markers, not `.ipynb`. Template: `templates/experiment.py`. What the script does:
  1. Open / attach to the `skore.Project` at `reports/` (or hub).
  2. Import the learner from `<pkg>.pipeline` and CV from `<pkg>.evaluate`.
  3. Call `skore.evaluate(...)`.
  4. Call `project.put("<experiment-key>", report)`.
Confirm signatures via `python-api`. Cross-validator choice is owned by `evaluate-ml-pipeline`.
Project init substitution: the `<SKORE_PROJECT_INIT>` marker in `templates/experiment.py` is replaced at scaffold time per the recorded `skore mode:` decision. Two forms (local vs hub), side-by-side anatomy, audit-file copy rule: → `references/g_skore_mode.md`.
Experiment scripts stay clean of agent-only `print(...)`. Inspection lives in `scratch/`. One exception: a bare `report` expression; that's a notebook-display side effect.
Experiment key convention: the file's stem (e.g. `01_baseline.py` → `"01_baseline"`). One file → one key → one report.
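The key convention maps directly onto `pathlib`. The helper name `experiment_key` below is hypothetical, and nothing here touches skore; it only shows how the string passed as the report key could be derived from the script path.

```python
from pathlib import Path


def experiment_key(script_path: str) -> str:
    """Derive the report key from the experiment script's file name."""
    return Path(script_path).stem


print(experiment_key("experiments/01_baseline.py"))  # 01_baseline
```

Inside a script, `experiment_key(__file__)` would give the same result when the file is run as a whole; in an interactive `# %%` session `__file__` may be undefined, so hard-coding the stem is the safer fallback.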

Companion skills


| Skill | Relationship |
| --- | --- |
| `iterate-ml-experiment` | Owns `journal/JOURNAL.md` and per-experiment design notes. This skill places empty `journal/`; that skill fills it |
| `build-ml-pipeline` | Body of `pipeline.py`, `features.py`, `data.py` |
| `evaluate-ml-pipeline` | Body of `evaluate.py`; CV strategy |
| `test-ml-pipeline` | Layout of `tests/<category>/` + stem-pairing rule |
| `smoke-test-ml-pipeline` | Body of the smoke test once design note is approved |
| `audit-ml-pipeline` | Body of `audit/`. Read-only against the workspace |
| `python-api` | skore / skrub / sklearn signatures |
| `python-env-manager` | Detection + install commands + bootstrap |
| `data-science-python-stack` | What to install (Tier 1/2/3) |
| `python-code-style` | `ruff.toml` drop + NumPyDoc convention (step 12) |

Templates


  • `templates/experiment.py`: copied per new experiment
  • `templates/summary.md`: placeholder at scaffold; rewritten by `iterate-ml-experiment` § 4
  • `templates/pyproject.toml`: declares `src/<pkg>/` as installable
  • `templates/src___init__.py`: package init with `PROJECT_ROOT`
  • `templates/src_data.py` / `src_features.py` / `src_pipeline.py` / `src_evaluate.py`: one-time skeletons
  • `templates/.gitignore`: dropped at scaffold if none exists
Copy, don't rewrite. Section names encode contracts.

References (load on demand)


  • `references/scaffold_steps.md`: full prose elaboration of the 13-step Decision flow with examples and rationale.
  • `references/g_skore_mode.md`: the G-SKORE-MODE gate in detail: project init forms side-by-side, anatomy of the `<SKORE_PROJECT_INIT>` substitution, switching mid-project, out-of-scope notes (MLflow mode, Skore Hub account creation).