Loading...
Loading...
Found 1,745 Skills
Autonomously optimize any Claude Code skill by running it repeatedly, scoring outputs against binary evals, mutating the prompt, and keeping improvements. Based on Karpathy's autoresearch methodology. Use when: optimize this skill, improve this skill, run autoresearch on, make this skill better, self-improve skill, benchmark skill, eval my skill, run evals on. Outputs: an improved SKILL.md, a results log, and a changelog of every mutation tried.
End-to-end GECX/CXAS/CES conversational agent lifecycle -- build agents from requirements (PRD-to-agent), create and run evals (goldens, simulations, tool tests, callback tests), debug failures, and iterate to production quality. Use this skill whenever the user mentions GECX, CXAS, CES, SCRAPI, conversational agents, voice agents, audio agents, agent evals, pushing/pulling/linting agents, or agent instructions/callbacks/tools on the Google Customer Engagement Suite platform.
Cross-model benchmark for gstack skills. Runs the same prompt through Claude, GPT (via Codex CLI), and Gemini side-by-side — compares latency, tokens, cost, and optionally quality via LLM judge. Answers "which model is actually best for this skill?" with data instead of vibes. Separate from /benchmark, which measures web page performance. Use when: "benchmark models", "compare models", "which model is best for X", "cross-model comparison", "model shootout". (gstack) Voice triggers (speech-to-text aliases): "compare models", "model shootout", "which model is best".
Value investing screen via Longbridge — scan A-share / HK / US stocks for fundamentally strong but undervalued companies based on PE, PB, dividend yield, ROE, and margin of safety. Suitable for value investing strategy. Triggers: "低估值", "价值投资", "低PE", "低PB", "便宜股票", "安全边际", "高股息低估值", "被低估", "低估值", "價值投資", "低PE", "低PB", "便宜股票", "安全邊際", "高股息低估值", "value investing", "undervalued stocks", "low PE", "low PB", "margin of safety", "value screen", "cheap stocks", "bargain stocks".
Graham cigar-butt (NCAV / net-net) single-stock diagnostic. Combines a 100-point static cheapness score (NCAV, PE, PB, dividend yield, debt coverage, earnings stability) with a dynamic adjustment layer (industry cycle, earnings trend, insider activity, NCAV trajectory) to separate real bargains from value traps. Pulls data from Longbridge CLI/MCP first, falls back to WebSearch only for gaps, runs cross-statement reconciliation (勾稽校验) before scoring, and footnotes every figure to its source. Triggers: "格雷厄姆", "捡烟蒂", "烟蒂股", "烟蒂投资", "NCAV", "净流动资产", "清算价值", "安全边际", "价值陷阱", "深度价值", "撿煙蒂", "煙蒂股", "煙蒂投資", "淨流動資產", "清算價值", "安全邊際", "價值陷阱", "深度價值", "Graham", "cigar butt", "net-net", "liquidation value", "value trap", "margin of safety", "deep value", "Benjamin Graham".
Simulate a panel of hackathon judges to generate adversarial questions, objections, and predicted scores for pitch hardening.
黑底 + CRT 网格扫描线 + $ 命令行标题 + 薄荷绿大字 + 三档 tag
Master context engineering principles for building production-grade AI agent systems with effective context management, multi-agent architectures, and memory systems.
Adversarial robustness engineering for ML/AI—evasion, poisoning, extraction, membership-inference threat models; robust training, sanitization, detectors; ASR/certified evals; lab model attacks; data-pipeline integrity; production I/O guardrails (classical ML and LLM/multimodal). Use for adversarial examples, robustness suites, poison audits, deploy guardrails—not LLM app red team (ai-redteam), governance (ai-risk-governance), safety classifier R&D (ml-research-engineer-safeguards), safeguard serving (ml-infrastructure-engineer-safeguards), privacy research (privacy-research-engineer-safeguards), AppSec pentest (penetration-tester).
Import datasets from HuggingFace and convert them to Coval test sets. Use when the user wants to create test cases from HuggingFace dataset or repository.
Calculate agreement between human ground truth and machine labels for a text LLM judge metric, then analyze transcripts and reviewer notes to propose an improved metric prompt. One metric at a time.
Create comprehensive industry and sector landscape reports covering market dynamics, competitive positioning, key players, and thematic trends. Use for client requests, sector initiations, thematic research pieces, or internal knowledge building. Triggers on "sector overview", "industry report", "market landscape", "sector analysis", "industry deep dive", or "thematic research".