Found 1,066 Skills
Maintain a reviewable LLM Wiki from immutable raw notes, including ingest planning, querying, linting, and guarded raw Graphify maps that help agents generate better wiki pages.
AI/LLM application security testing — prompt injection, jailbreaking, data exfiltration, and insecure output handling per OWASP LLM Top 10.
Evaluates ML models for performance, fairness, and reliability. Use for metric selection, cross-validation strategies, overfitting/underfitting diagnosis, hyperparameter tuning, LLM evaluation, A/B testing, and production monitoring for model drift.
Fetch any X/Twitter post as clean LLM-friendly JSON. Converts x.com, twitter.com, or adhx.com links into structured data with full article content, author info, and engagement metrics. No scraping or browser required.
Extract text from PDFs as structured, semantic Markdown. Use when converting a PDF to Markdown, extracting text from a PDF, processing one or more PDFs into Markdown output, reading PDF contents for analysis, ingesting documents for RAG pipelines, preparing PDFs for LLM context, or any task where PDF text needs to be in a machine-readable format. ALWAYS use this skill when the user has a PDF and needs its content as text or Markdown — even if they don't explicitly say "convert to markdown".
Read every docs/benchmarks/runs/*.json and surface drift in win rate, latency, escalation rate, and LLM-baseline cost over time.
Design real technical solution architectures for scalable, secure, cost-aware systems by selecting patterns, components, integrations, data flows, and tradeoffs; use when asked for senior solution architecture, system architecture, SaaS architecture, LLM architecture, or architecture decisions after a spec.
Optimize and structure context for agents and LLMs by reducing noise, prioritizing relevance, organizing memory, defining constraints, and managing token budgets.
Cross-model benchmark for gstack skills. Runs the same prompt through Claude, GPT (via Codex CLI), and Gemini side-by-side — compares latency, tokens, cost, and optionally quality via LLM judge. Answers "which model is actually best for this skill?" with data instead of vibes. Separate from /benchmark, which measures web page performance. Use when: "benchmark models", "compare models", "which model is best for X", "cross-model comparison", "model shootout". (gstack) Voice triggers (speech-to-text aliases): "compare models", "model shootout", "which model is best".
Supabase Edge Function observability style: tiny provider-neutral OTel-shaped shim, OTLP export config, traces/logs/metrics, and LLM cost metrics.
FastAPI OpenTelemetry style: native FastAPIInstrumentor, centralized observability init, Python decorators, OTLP logs, and LLM cost metrics.
General OpenTelemetry onboarding style for Superlog managed agents: native APIs, signal quality, env vars, LLM metrics, and smoke checks.