converting-cutile-to-julia
cuTile Python → cuTile.jl (Julia) Conversion
Convert Python `@ct.kernel` kernels to Julia cuTile.jl `function ... end` kernels.
Workflow Selection
- Standard conversion → Full workflow: `translations/workflow.md`
- Errors (`IRError`, `MethodError`, numerical mismatch) → `references/debugging.md`
- Quick reference → `references/api-mapping.md` + `references/critical-rules.md`
- Test patterns → `references/testing.md`
Architecture
Julia kernels are standalone: no Python bridge, no pytest integration. The Julia sub-project lives in `julia/` at the repo root with its own `Project.toml` for dependency management.

```
julia/                 # Self-contained Julia sub-project
├── Project.toml       # Dependencies: CUDA.jl, cuTile.jl, NNlib.jl, Test
├── kernels/           # cuTile.jl kernel implementations
│   ├── add.jl         # ← Ground-truth: 1D element-wise with alpha scaling (tensor+tensor, tensor+scalar)
│   ├── matmul.jl      # ← Ground-truth: 2D tiled MMA, standard Julia layout (M,K)×(K,N)→(M,N)
│   └── softmax.jl     # ← Ground-truth: 3 strategies (TMA, online, chunked) using ct.load/ct.store
└── test/              # Julia-native tests (using Test stdlib)
    ├── runtests.jl    # Test runner entry point
    ├── test_add.jl
    ├── test_matmul.jl
    └── test_softmax.jl
```

Ground-truth reference: Always consult `julia/kernels/*.jl` and `julia/test/*.jl` for patterns that compile and pass tests. These are the canonical examples of working cuTile.jl code.
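The tree lists an "online" strategy among the softmax variants. As a rough orientation before reading `softmax.jl`, here is a minimal pure-Python sketch of the online-softmax recurrence as it is commonly described: a running maximum plus a running sum that is rescaled whenever the maximum grows. The function names and the `chunk` parameter are illustrative assumptions, and this is not taken from the ground-truth kernel, which may be organised differently.

```python
import math


def naive_softmax(xs):
    """Two-pass reference: find the max, then exponentiate and normalize."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def online_softmax(xs, chunk=4):
    """Sweep over chunks, keeping a running max and a rescaled running sum."""
    running_max = -math.inf
    running_sum = 0.0
    for start in range(0, len(xs), chunk):
        block = xs[start:start + chunk]
        new_max = max(running_max, max(block))
        # Rescale the sum accumulated so far to the new max,
        # then add this block's contribution.
        running_sum *= math.exp(running_max - new_max)
        running_sum += sum(math.exp(x - new_max) for x in block)
        running_max = new_max
    # Final normalization pass using the global max and sum.
    return [math.exp(x - running_max) / running_sum for x in xs]
```

A CPU reference like `naive_softmax` is also the kind of oracle a converted kernel's output can be compared against in tests.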
Instructions
- Analyze the Python kernel: identify patterns, shapes, dtypes, operations
- Write Julia kernel: `julia/kernels/<op>.jl` with cuTile.jl kernel + bridge function(s)
- Convert kernel signature (see `translations/workflow.md` Phase 2)
- Convert kernel body (apply `references/api-mapping.md` + `references/critical-rules.md`)
- Write Julia test: `julia/test/test_<op>.jl` using `Test` stdlib + `NNlib.jl` for reference
- Register test: add `include(...)` in `julia/test/runtests.jl`
- Validate: run the bundled validator: `python <skill-dir>/scripts/validate_cutile_jl.py <file.jl>`
- Test: run `julia --project=julia/ julia/test/runtests.jl`

Full conversion checklist with post-conversion verification → `translations/workflow.md`
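The "Register test" step is easy to forget, and a test file that is never included simply does not run. The following is a small sketch of a check for that, assuming tests are registered with literal `include("test_<op>.jl")` calls in `runtests.jl`. The helper name `unregistered_tests` is hypothetical and is not part of the bundled validator.

```python
import re
from pathlib import Path

# Matches include("some_file.jl") with a literal string argument.
INCLUDE_RE = re.compile(r'include\(\s*"([^"]+)"\s*\)')


def unregistered_tests(test_dir):
    """Return test_*.jl files in test_dir that runtests.jl never include()s."""
    test_dir = Path(test_dir)
    runner = (test_dir / "runtests.jl").read_text(encoding="utf-8")
    # Drop Julia line comments so a commented-out include does not count.
    # (Naive: a '#' inside a string literal would also be cut here.)
    code = "\n".join(line.split("#", 1)[0] for line in runner.splitlines())
    included = {Path(name).name for name in INCLUDE_RE.findall(code)}
    present = {p.name for p in test_dir.glob("test_*.jl")}
    return sorted(present - included)
```

Calling `unregistered_tests("julia/test")` from the repo root would list any test files the runner skips. An empty list means every `test_*.jl` file is registered.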
⚠️ Top Pitfalls
The most dangerous translation errors. Full rules (17 total) in `references/critical-rules.md`.

| # | Pitfall | One-line fix |
|---|---|---|
| 1 | | Use |
| 2 | | Use |
| 3 | | Compiler bug — file upstream with minimal reproducer |
| 4 | | Args are positional — match kernel signature exactly |
| 5 | | |
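Many translation slips are leftover Python syntax, which a line-by-line scan can flag before the Julia compiler is involved. The sketch below shows the general idea with a few generic Python-isms that are not valid Julia. The rule list is hypothetical and chosen for illustration. It does not reproduce the rules in `references/critical-rules.md` or the checks in `validate_cutile_jl.py`.

```python
import re

# Hypothetical rules for illustration; not the rules of validate_cutile_jl.py.
PYTHONISM_RULES = [
    (re.compile(r"^\s*def\s+\w+\s*\("),
     "Python `def`; Julia uses `function ... end`"),
    (re.compile(r"\*\*"),
     "Python `**`; Julia exponentiation is `^`"),
    (re.compile(r"\b(True|False|None)\b"),
     "Python literal; Julia uses `true`/`false`/`nothing`"),
    (re.compile(r"^\s*(for|if|while|elif|else)\b.*:\s*$"),
     "Python block colon; Julia blocks close with `end`"),
]


def scan_julia_source(text):
    """Return (line_number, message) pairs for suspected Python leftovers."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore Julia line comments (naively)
        for pattern, message in PYTHONISM_RULES:
            if pattern.search(code):
                findings.append((lineno, message))
    return findings
```

A regex scan like this can produce false positives, for example on a user-defined identifier named `True`, so treat its findings as hints to inspect, not verdicts.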
Worked Examples
Side-by-side Python → Julia conversions matching the released Julia kernels in `julia/kernels/`. Each directory contains `cutile_python.py` (before) and `cutile_julia.jl` (after).

| # | Example | Key Patterns | When to Reference |
|---|---|---|---|
| 01 | | 1D | Starting point; basic TMA + element-wise patterns |
| 02 | | | MMA / tensor core operations |
| 03 | | Persistent scheduling, | Large-tensor reduction patterns |
These match the released kernels in `julia/kernels/` (`add.jl`, `matmul.jl`, `softmax.jl`). The examples are simplified teaching versions; always consult `julia/kernels/*.jl` for the canonical, tested implementations.
Reference Documents
| Category | Document | Content |
|---|---|---|
| Workflows | | Full conversion workflow with todo list, validation loop, checklist |
| Rules | | 17 Critical Rules for cuTile Python → Julia conversion |
| API | | Python↔Julia bidirectional API mapping + kernel patterns |
| Testing | | Julia-native test patterns, tolerances, failure diagnosis |
| Debugging | | Julia-specific error diagnosis + IR debug commands |
| Scripts | | Static validation for Julia anti-patterns (run it) |
| Ground Truth | | Actual working implementations in the codebase |
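Tolerances are one place where a Python test and its Julia counterpart can silently disagree. `numpy.allclose` applies an element-wise rule, while Julia's `isapprox` on arrays compares norms. The pure-Python sketch below restates both documented formulas so the difference can be seen on a small input. The function names are illustrative, and the tolerances actually used by this project are whatever `references/testing.md` specifies.

```python
import math


def numpy_style_allclose(actual, expected, rtol=1e-5, atol=1e-8):
    """Element-wise rule of numpy.allclose: |a - b| <= atol + rtol * |b|."""
    return all(abs(a - b) <= atol + rtol * abs(b)
               for a, b in zip(actual, expected))


def julia_style_isapprox(x, y, rtol, atol=0.0):
    """Norm-based rule of Julia's isapprox on arrays (2-norm):
    norm(x - y) <= max(atol, rtol * max(norm(x), norm(y)))."""
    def norm(v):
        return math.sqrt(sum(e * e for e in v))

    diff = norm([a - b for a, b in zip(x, y)])
    return diff <= max(atol, rtol * max(norm(x), norm(y)))


expected = [1000.0, 1e-3]
actual = [1000.0, 2e-3]  # second element is off by 100% of its value

elementwise_ok = numpy_style_allclose(actual, expected, rtol=1e-3, atol=0.0)
normwise_ok = julia_style_isapprox(actual, expected, rtol=1e-3)
```

Here the element-wise check fails and the norm-based check passes, because the large first element dominates the norm. If element-wise behaviour is wanted on the Julia side, a broadcast form such as `all(isapprox.(a, b; rtol=...))` is one way to approximate it.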
Environment Setup
Prerequisite (Julia): this skill requires the Julia version declared in `julia/Project.toml` under `[compat] julia`. If `julia --version` is missing or older than that, install from the official Julia site at https://julialang.org/install/ following the verified installer instructions for your OS. Resume below once `julia --version` is compatible.

Then, from the repo root:

```bash
# Install Julia dependencies declared in julia/Project.toml
julia --project=julia/ -e 'using Pkg; Pkg.instantiate()'

# Run tests
julia --project=julia/ julia/test/runtests.jl
```

Requirements:
- Julia (minimum version declared in `julia/Project.toml` under `[compat] julia`)
- CUDA 13.1+ driver
- Blackwell GPU (compute capability 10+)
- Dependencies managed via `julia/Project.toml`: CUDA.jl, cuTile.jl, NNlib.jl, Test
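The Julia prerequisite check can be scripted. The sketch below compares the text printed by `julia --version` (which has the form `julia version X.Y.Z`) against a compat entry, assuming the entry is a plain `major.minor` or `major.minor.patch` string. The helper names and the version strings in the usage note are hypothetical and are not read from this project's `Project.toml`. Only the lower bound is checked; any upper bound implied by the compat entry is ignored.

```python
import re


def parse_version(text):
    """Pull the first major.minor[.patch] version out of a string."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def meets_minimum(version_output, compat_entry):
    """True if the installed Julia is at least the compat lower bound."""
    return parse_version(version_output) >= parse_version(compat_entry)
```

For example, `meets_minimum("julia version 1.11.2", "1.10")` is true, while `meets_minimum("julia version 1.9.4", "1.10")` is false, because versions are compared as integer tuples and not as strings.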