Found 43 Skills
Optimize code performance through iterative improvements (max 2 rounds). Benchmark execution time and memory usage, compare against baseline implementations, and generate detailed optimization reports. Supports C++, Python, Java, Rust, and other languages.
Conserve memory by reusing existing instances when working with large numbers of similar objects.
Validate and use CPU offloading in Megatron Bridge, including layer-level activation offloading and fractional optimizer state offloading with HybridDeviceOptimizer.
Techniques for reducing peak GPU memory in Megatron Bridge — expandable segments, parallelism resizing, activation recompute, CPU offloading constraints, and common OOM fixes.
Validate and use selective and full activation recompute in Megatron Bridge to reduce GPU memory usage at the cost of extra compute.
Analyze code performance, detect bottlenecks, suggest optimizations for algorithms, queries, and resource usage. Use when improving application performance or investigating slow code.
Troubleshoot and optimize the performance of Ascend C operators. Applies when users develop, review, or optimize Ascend C kernel operators, or when they mention keywords such as Ascend C performance optimization, operator optimization, tiling, pipeline, data copy, memory optimization, NPU/Ascend.
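High-performance analysis engine for Excel datasets of 10,000+ rows. Provides openpyxl read_only streaming reads (iter_rows supports 100,000+ rows), Parquet conversion for faster access, memory optimization, chunked processing, and large-file write modes. **Use this skill proactively in any of the following cases**: ① the data has ≥ 10k rows (triggered by the row-count assessment step of sn-da-excel-workflow); ② the user uses a trigger word: large file / large data volume / big data / performance optimization / insufficient memory / OOM / million rows / hundred thousand rows / streaming read / Parquet / chunked processing; ③ calling pd.read_excel() directly causes a timeout or memory overflow; ④ the user explicitly asks for high-performance processing of a large-scale dataset. Not for: routine Excel analysis under 10k rows (use sn-da-excel-workflow instead).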
Use when the request mentions "training LLM", "finetuning", "RLHF", "distributed training", "DeepSpeed", "Accelerate", "PyTorch Lightning", "Ray Train", "TRL", "Unsloth", "LoRA training", "flash attention", or "gradient checkpointing".
Expert knowledge of Godot performance optimization, profiling, bottleneck identification, and optimization techniques. Use when helping improve game performance or analyzing performance issues.
One-stop entry point for session management. Handles initialization, memory, and state in one place. Use when managing Claude Code sessions, /session command. Do NOT load for: app user sessions, login state, authentication features.
Evidence-based memory optimization from real usage patterns. Analyzes recall performance, identifies bottlenecks, suggests consolidation/pruning/enrichment, and tracks improvement over time via checkpoint Q&A.