Loading...
Loading...
Found 23 Skills
Optimize token usage when delegating to Gemini CLI. Covers token caching, batch queries, model selection (Flash vs Pro), and cost tracking. Use when planning bulk Gemini operations.
Intelligent project management dashboard - view all projects status, priorities, and todos from a CEO perspective
Configures and customizes Claude Code statuslines with multi-line layouts, cost tracking via ccusage, git status indicators, and customizable colors. Activates for statusline setup, installation, configuration, customization, color changes, cost display, git status integration, or troubleshooting statusline issues.
Add PostHog LLM analytics to trace AI model usage. Use after implementing LLM features or reviewing PRs to ensure all generations are captured with token counts, latency, and costs. Also handles initial PostHog SDK setup if not yet installed.
Add PostHog LLM analytics to trace AI model usage. Use after implementing LLM features or reviewing PRs to ensure all generations are captured with token counts, latency, and costs. Also handles initial PostHog SDK setup if not yet installed.
Consumer-side wiring for ADR-097 Phase 3 federation_spend events — per-peer rolling windows + suspension-threshold check
AI generation provenance and audit trail tracking. Records decision factors, data lineage, reasoning chains, confidence scoring, and cost tracking for AI-generated content.
Interactive TUI dashboard for visualizing Claude Code token usage, costs, and task breakdowns by project, model, and activity type.
Read every docs/benchmarks/runs/*.json and surface drift in win rate, latency, escalation rate, and LLM-baseline cost over time
LLM observability platform for tracing, evaluation, prompt management, and cost tracking. Use when setting up Langfuse, monitoring LLM costs, tracking token usage, or implementing prompt versioning.
Iterate on RAG systems with structured evals instead of eyeballing. This skill should be used when the user is tuning a RAG pipeline — changing retrieval prompts, swapping models, adjusting chunking, or debugging poor answers — and wants a cheap, ranked set of experiments with cost tracking and structured feedback on the stack. Also use when the user asks "how do I know if my RAG is working?", "this RAG eval is burning money", or "what should I try next on retrieval?".