Loading...
Loading...
Found 107 Skills
Envoy Gateway production deployment — deployment modes, performance tuning, observability, operational guidance
Diagnose, compare, and optimize Apache Spark applications and SQL queries using Spark History Server data. Use this skill whenever the user wants to understand why a Spark app is slow, compare two benchmark runs or TPC-DS results, find performance bottlenecks (skew, GC pressure, shuffle spill, straggler tasks), get tuning recommendations, or optimize Spark/Gluten configurations. Also trigger when the user mentions 'diagnose', 'compare runs', 'why is this query slow', 'tune my Spark job', 'benchmark comparison', 'performance regression', or asks about executor skew, shuffle overhead, AQE effectiveness, or Gluten offloading issues.
End-to-end SGLang SOTA performance workflow. Use when a user names an LLM model and wants SGLang to match or beat the best observed vLLM and TensorRT-LLM serving performance by searching each framework's best deployment command, benchmarking them fairly, profiling SGLang if it is slower, identifying kernel/overlap/fusion bottlenecks, patching SGLang code, and revalidating with real model runs.
Oracle Database skills for administration, SQL and PL/SQL development, performance tuning, security, ORDS, SQLcl, migrations, frameworks, Oracle Container Registry guidance, and agent-safe database workflows.
React-specific component, hook, and rendering patterns. Use when writing React components, hooks, JSX, or optimizing React performance.
Execute use when you need to work with query optimization. This skill provides query performance analysis with comprehensive guidance and automation. Trigger with phrases like "optimize queries", "analyze performance", or "improve query speed".
Universal Runtime best practices for PyTorch inference, Transformers models, and FastAPI serving. Covers device management, model loading, memory optimization, and performance tuning.
Expert blueprint for low-level server access (RenderingServer, PhysicsServer2D/3D, NavigationServer) using RIDs for maximum performance. Bypasses scene tree overhead for procedural generation, particle systems, and voxel engines. Use when nodes are too slow OR managing thousands of objects. Keywords RenderingServer, PhysicsServer, NavigationServer, RID, canvas_item, body_create, low-level, performance.
Expert knowledge for Azure Managed Lustre development including troubleshooting, best practices, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when mounting AML, integrating with Blob auto-import/export, AKS CSI, quotas, or performance tuning, and other Azure Managed Lustre related development tasks. Not for Azure HPC Cache (use azure-hpc-cache), Azure NetApp Files (use azure-netapp-files), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Network (use azure-virtual-network).
Use when you need to set up Java application profiling to detect and measure performance issues — including automated async-profiler v4.0 setup, problem-driven profiling (CPU, memory, threading, GC, I/O), interactive profiling scripts, JFR integration with Java 25 (JEP 518, JEP 520), or collecting profiling data with flamegraphs and JFR recordings. Part of the skills-for-java project
Use when the user wants to push past conventional workflow limits with advanced performance techniques like parallel orchestration, streaming pipelines, or adaptive routing.
Generate Triton kernel code for Ascend NPU based on operator design documents. Used when users need to implement Triton operator kernels and convert requirement documents into executable code. Core capabilities: (1) Parse requirement documents to confirm computing logic (2) Design tiling partitioning strategy (3) Generate high-performance kernel code (4) Generate test code to verify correctness.