Loading...
Loading...
Found 12 Skills
LLM guardrails with NeMo, Guardrails AI, and OpenAI. Input/output rails, hallucination prevention, fact-checking, toxicity detection, red-teaming patterns. Use when building LLM guardrails, safety checks, or red-team workflows.
Senior Code Architect & Quality Assurance Engineer for 2026. Specialized in context-aware AI code reviews, automated PR auditing, and technical debt mitigation. Expert in neutralizing "AI-Smells," identifying performance bottlenecks, and enforcing architectural integrity through multi-job red-teaming and surgical remediation suggestions.
Find every way users can break your AI before they do. Use when you need to red-team your AI, test for jailbreaks, find prompt injection vulnerabilities, run adversarial testing, do a safety audit before launch, prove your AI is safe for compliance, stress-test guardrails, or verify your AI holds up against adversarial users. Covers automated attack generation, iterative red-teaming with DSPy, and MIPROv2-optimized adversarial testing.
Provides calibrated decision analysis using Charlie Munger-style multiple mental models, inversion, incentive mapping, circle-of-competence checks, misjudgment audits, second-order effects, and forecast updates. Use when the user asks for an oracle take, a hard call, a decision memo, a premortem, an outside view, a red-team, a sanity-check, what am I missing, think this through, or wants a strategy, hire, investment, plan, product, partnership, or major life choice analysed. Avoid for simple factual lookups or time-sensitive legal, medical, or market questions without fresh evidence.
Use when challenging ideas, plans, decisions, or proposals using structured critical reasoning. Invoke to play devil's advocate, run a pre-mortem, red team, or audit evidence and assumptions.
Red-team security review for code changes. Use when reviewing pending git changes, branch diffs, or new features for security vulnerabilities, permission gaps, injection risks, and attack vectors. Acts as a pen-tester analyzing code.
(Industry standard: Loop Agent / Single Agent) Primary Use Case: Self-contained research, content generation, and exploration where no inner delegation is required. Self-directed research and knowledge capture loop. Use when: starting a session (Orientation), performing research (Synthesis), or closing a session (Seal, Persist, Retrospective). Ensures knowledge survives across isolated agent sessions.
Comprehensively evaluate the overall security of an application from two perspectives: attackers (Red Team) and defenders (Blue Team). Run two agents in parallel → output an integrated report via review-aggregator. Use this when you want to "understand the overall security status of the application", "identify vulnerabilities from an attacker's perspective", or "verify that there are no gaps in the defense system". Use security-hardening for addressing specific vulnerabilities, and security-audit-quick for fast detection of known patterns.
LLM prompt testing, evaluation, and CI/CD quality gates using Promptfoo. Invoke when: - Setting up prompt evaluation or regression testing - Integrating LLM testing into CI/CD pipelines - Configuring security testing (red teaming, jailbreaks) - Comparing prompt or model performance - Building evaluation suites for RAG, factuality, or safety Keywords: promptfoo, llm evaluation, prompt testing, red team, CI/CD, regression testing
Analyze arguments, detect biases, evaluate claims, and improve reasoning. Use when asked to fact-check, identify logical fallacies, evaluate arguments, analyze predictions, find root causes, or think adversarially about plans. Triggers include "evaluate this argument", "logical fallacies", "fact check", "analyze the claims", "identify biases", "devil's advocate", "red team this", "root cause".
Search and retrieve pentesting, red teaming, and security research information from the HackTricks wiki (book.hacktricks.wiki). Use for payloads, methodologies, bypasses, and edge-case behaviors across web, network, cloud, and application security topics.
Use when conducting authorized penetration tests, performing security assessments, running red team exercises, testing security controls, identifying attack paths, or validating hardening measures