Loading...
Loading...
Found 1,323 Skills
Iteratively optimize cuTile kernel performance through systematic profiling, bottleneck analysis, IR comparison, and targeted tuning. Covers tile sizes, occupancy, autotune configs, TMA, latency hints, persistent scheduling, num_ctas, flush_to_zero, and IR-level debugging. Use when asked to "optimize cutile kernel", "improve kernel perf", "tune cutile performance", "make kernel faster", or iteratively benchmark and refine a cuTile GPU kernel in the TileGym project.
Run, monitor, analyze, and debug LLM evaluations via nemo-evaluator-launcher. Covers running evaluations, checking status and live progress, debugging failed runs, exporting artifacts and logs, and analyzing results. ALWAYS triggers on mentions of running evaluations, checking progress, debugging failed evals, analyzing or analysing runs or results, run directories or artifact paths on clusters, Slurm job issues, invocation IDs, or inspecting logs (client logs, server logs, SSH to cluster, tail logs, grep logs). Do NOT use for creating or modifying evaluation configs.
Use when adding, modifying, optimizing, or debugging CuTile autotuning code. Trigger signals: `exhaustive_search` / `replace_hints` / `hints_fn` / `cuda.tile.tune` in code, `autotune` in filenames, or correctness/performance issues in autotuned CuTile kernels. Covers: tune-once/cache/launch pattern, per-architecture configs (sm80–sm120), parameter space design (tile sizes, occupancy, num_ctas), and 7 common pitfalls with solutions.
Use this skill when the user asks about Goldsky Compose — the offchain-to-onchain TypeScript framework for onchain oracles, keepers, circuit breakers, and cross-chain automation. Triggers on: 'goldsky compose', 'compose.yaml', 'compose deploy/init/dev', 'compose task', 'cron task onchain', 'sponsored gas', 'writeContract from TypeScript', 'build a price oracle', 'resolve prediction market', 'onchain event listener', 'HTTP-triggered task', 'smart wallet'. Also use when the user wants to run TypeScript against EVM chains with managed gas, schedule onchain writes via cron, react to onchain events, or deploy a serverless task with secrets and a smart wallet. For debugging a broken app, use /compose-doctor. For manifest/CLI/API lookups, use /compose-reference. Do NOT trigger on Goldsky Turbo, Mirror, Subgraphs, Edge, or Datasets — those belong to their respective skills.
Server-side tracking pipeline audit covering server-side Google Tag Manager (sGTM), Meta CAPI Gateway, Conversions API health, event deduplication via event_id, server-side hit ratio targets, pixel debugging, and PII hashing discipline. Use when user says server-side tracking, sGTM, server-side GTM, server-side tagging, CAPI, Conversions API, CAPI Gateway, Meta Conversions API, event deduplication, event_id, pixel debug, pixel health, Pixel/CAPI audit, first-party tracking, iOS 14.5 recovery, or server-side hit ratio.
Master control flow for complex Web/JS website restoration. Applicable to reverse engineering of sign/token/cookie/header/body/websocket fields, heavy obfuscation, junk code, control flow flattening, JSVMP, worker/wasm, browser vs. Node.js difference analysis, environment patching, local reproduction and regression. It also covers case clues such as Akamai/Kasada/PX/reese84/TongDun/a_bogus/Tencent slider/Alibaba slider/JSVMP/227/226/wasm/protobuf/rid/fuid/fs/bx-pp/run_js/storage.estimate/animationend. By default, it adopts three MCP collaborative debugging and analysis: jshook + js-reverse + chrome-devtools-mcp, and switches specialized skills in locate, recover, runtime, env-patch, replay stages.
Diagnoses and fixes flaky tests by identifying root causes (timing issues, shared state, randomness, network dependencies) and provides stabilization strategies. Use for "flaky tests", "test stability", "intermittent failures", or "test debugging".
Converts cuTile GPU kernels (@ct.kernel) to Triton (@triton.jit). Handles standard in-repo conversion, debugging (cudaErrorIllegalAddress, shape mismatch, numerical mismatch), and mapping cuTile idioms (ct.load/ct.store, ct.Constant, ct.launch) to Triton equivalents. Covers dual-kernel layout flags (e.g. transpose=True/False + autotune grid via META) per translations/advanced-patterns.md. Use when converting, porting, or translating cuTile kernels to Triton, or debugging existing Triton translations.
Expert Django backend development guidance. Use when creating Django models, views, serializers, or APIs; debugging ORM queries or migrations; optimizing database performance; implementing authentication; writing tests; or working with Django REST Framework. Follows Django best practices and modern patterns.
Playwright testing best practices for Next.js applications (formerly test-playwright). This skill should be used when writing, reviewing, or debugging E2E tests with Playwright. Triggers on tasks involving test selectors, flaky tests, authentication state, API mocking, hydration testing, parallel execution, CI configuration, or debugging test failures.
Implement distributed tracing with correlation IDs, trace propagation, and span tracking across microservices. Use when debugging distributed systems, monitoring request flows, or implementing observability.
Use when building SpriteKit games, implementing physics, actions, scene management, or debugging game performance. Covers scene graph, physics engine, actions system, game loop, rendering optimization.