Found 2 Skills
CuTe Python DSL kernel workflow, CuteKernel runtime wrapper, suitability gate, tiling guidance, and CuTe-specific pitfalls. Use when: (1) planning or implementing a kernel in the CuTe Python DSL, (2) the optimization needs more explicit control than cuTile exposes but should remain in a Python-driven workflow, (3) defining package naming for cute-dsl kernels, (4) documenting CuTe Python DSL design choices, (5) recording language-specific knowledge for CuTe Python DSL.
Generate Triton operator requirement documents for Ascend NPU. Use when users need to design new Triton operators, write operator requirement documents, or design operator performance optimizations.