Found 52 Skills
Test system behavior under extreme load conditions to identify breaking points, capacity limits, and failure modes. Use for stress testing, capacity testing, breaking-point analysis, spike testing, and system-limit validation.
Stress-test plans, proposals, and strategies. Use for pre-mortems, assumption audits, risk registers, evaluating business ideas, identifying failure modes, or when you need devil's advocate analysis before committing resources.
Expert guidance for systematic backtesting of trading strategies. Use when developing, testing, stress-testing, or validating quantitative trading strategies. Covers the "beating ideas to death" methodology, parameter robustness testing, slippage modeling, bias prevention, and interpreting backtest results. Applicable when the user asks about backtesting, strategy validation, robustness testing, avoiding overfitting, or systematic trading development.
Devil's Advocate stress-testing for code, architecture, PRs, and decisions. Surfaces hidden flaws through structured adversarial analysis with metacognitive depth. Use for high-stakes review, stress-testing choices, or when the user wants problems found deliberately. NOT for routine code review (use engineering:code-review). Triggers on "스트레스 테스트", "stress test", "devil's advocate", "반론", "이거 괜찮아", "문제 없을까", "깊은 리뷰", "critical review", "adversarial".
Get a second opinion via Codex MCP. Use for stress-testing ideas, getting fresh perspective, steelmanning arguments, or iteratively refining work through expert back-and-forth. Invoke for ANY request involving external review, feedback, or consultation.
Brainstorm product ideas, explore problem spaces, and challenge assumptions as a thinking partner. Use when exploring a new opportunity, generating solutions to a product problem, stress-testing an idea, or when a PM needs to think out loud with a sharp sparring partner before converging on a direction.
Philip Tetlock's Superforecasting framework applied to a business decision, investment thesis, or strategic question. Spawns a team of specialist agents — Calibrator, Decomposer, Updater, Devil's Advocate, Scorekeeper — who each apply a different piece of the superforecasting methodology. The lead synthesizes into a calibrated probability estimate with Brier-scoreable predictions, explicit base rates, and an accountability structure for keeping score over time. Use when the user says "tetlock this", "what's the probability", "how confident should I be", "forecast this", "calibrate this", proposes a business thesis and wants probabilistic stress-testing, or wants to apply superforecasting to a decision. Works standalone or after /munger.
Guides self-review of YOUR OWN academic paper before submission with adversarial stress-testing. Core method: 5-aspect checklist (contribution sufficiency, writing clarity, results quality, testing completeness, method design), counterintuitive protocol (reject-first simulation, delete unsupported claims, score trust, promote limitations, attack novelty), reverse-outlining, and figure/table quality checks. Use when the user wants to self-review or self-check their own paper draft before submission, stress-test their claims, or prepare for reviewer criticism, or mentions 'self-review', 'check my draft', or 'is my paper ready'. Do NOT use for writing a peer review of someone else's paper, and do NOT use after receiving actual reviews (use paper-rebuttal instead).
Calibrated grilling session for stress-testing a plan, design, idea, or decision. First assesses the user's topic knowledge, confidence, and desired pressure level, then asks one question at a time with recommended answers. Use when the user says "grill me", "stress-test this", "challenge my plan", or "interview me", or wants a plan probed without being overwhelmed.
Adversarial thinking partner for founders and executives. Stress-tests plans, prepares for board meetings, dissects decisions with no good options, forces honest post-mortems, and identifies blind spots before competitors or board members do. Use when you need plan validation, board preparation, hard decision frameworks, assumption stress-testing, or failure analysis, or when the user mentions stress test, challenge, board prep, hard decision, pre-mortem, post-mortem, devil's advocate, plan review, or executive coaching.
Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.