Loading...
Loading...
Found 300 Skills
Profile CPU performance of tests and browser tests in elements package. Use when investigating performance issues, optimizing test execution, or when the user mentions profiling, performance analysis, hotspots, or slow tests.
Nsight Systems (nsys) CLI for system-level timeline profiling. Use when the user wants to run nsys profile, analyze .nsys-rep reports, use nsys stats/analyze/recipe commands, diagnose GPU idle time from timeline traces, or profile distributed training with NCCL overlap analysis. NOT for kernel-level metrics like SOL%, occupancy, or roofline (use perf-nsight-compute-analysis for ncu). NOT for writing or generating kernels. NOT for applying optimizations like CUDA Graphs.
Performance analysis coordination workflow. Guides profiling delegation, bottleneck classification (compute/memory/launch/communication/sync), and structured report generation. Use when the user asks to analyze performance, profile a workload, check MFU/SOL, or diagnose bottlenecks.
Expert blueprint for performance profiling and optimization (frame drops, memory leaks, draw calls) using Godot Profiler, object pooling, visibility culling, and bottleneck identification. Use when diagnosing lag, optimizing for target FPS, or reducing memory usage. Keywords profiling, Godot Profiler, bottleneck, object pooling, VisibleOnScreenNotifier, draw calls, MultiMesh.
Optimizes ClickHouse queries for speed and efficiency. Helps with primary key design, sparse indexes, data skipping indexes (minmax, set, bloom filter, ngrambf_v1), partitioning strategies, projections, PREWHERE optimization, approximate functions, and query profiling with EXPLAIN. Use when writing ClickHouse queries, designing table schemas, analyzing slow queries, or implementing analytical aggregations. Works with columnar OLAP workloads.
Full Sentry SDK setup for React Native and Expo. Use when asked to "add Sentry to React Native", "install @sentry/react-native", "setup Sentry in Expo", or configure error monitoring, tracing, profiling, session replay, or logging for React Native applications. Supports Expo managed, Expo bare, and vanilla React Native.
Full Sentry SDK setup for Apple platforms (iOS, macOS, tvOS, watchOS, visionOS). Use when asked to "add Sentry to iOS", "add Sentry to Swift", "install sentry-cocoa", or configure error monitoring, tracing, profiling, session replay, or logging for Apple applications. Supports SwiftUI and UIKit.
Flutter DevTools, Profiling, Logging & Memory Management
Exploratory Data Analysis skill for CSV and parquet datasets with deterministic profiling, drift/anomaly scans, contract generation and validation, and optional memory writeback into skill-system-memory. The implementation is Polars-first (lazy scan for large files and early `--sample` head), includes high-cardinality guards for profile/importance/contract flows, and supports categorical correlation with Cramer's V. Use when building or reviewing tabular fraud/risk/data-quality workflows, profiling new datasets, checking leakage or drift, or saving/validating data contracts.
Deep SERP analysis and backlink profiling for high-priority keywords using DataForSEO
Route Unity and Unreal frame-time complaints into one bottleneck-first profiling brief. Use when the main job is interpreting profiler screenshots, `stat unit` / `stat gpu` output, benchmark-route complaints, or Steam Deck / target-device review packets; choosing the smallest useful next capture; naming one primary bottleneck family; and deciding whether to stay with quick packets, move to an engine-native profiler, or escalate further. Route generic app/service tuning to `performance-optimization`, build/editor/package failures to `game-build-log-triage`, broader game-production coordination to `bmad-gds`, and mixed demo/community feedback to `game-demo-feedback-triage`.
Iteratively optimize cuTile kernel performance through systematic profiling, bottleneck analysis, IR comparison, and targeted tuning. Covers tile sizes, occupancy, autotune configs, TMA, latency hints, persistent scheduling, num_ctas, flush_to_zero, and IR-level debugging. Use when asked to "optimize cutile kernel", "improve kernel perf", "tune cutile performance", "make kernel faster", or iteratively benchmark and refine a cuTile GPU kernel in the TileGym project.