Loading...
Loading...
Found 22 Skills
Audits agent skill instructions and system prompts for vulnerabilities to prompt hijacking and indirect injection. Use when designing new agent skills or before deploying agents to public environments where users provide untrusted input.
A test skill with prompt injection patterns
Defense techniques against prompt injection attacks including direct injection, indirect injection, and jailbreaks - theUse when "prompt injection, jailbreak prevention, input sanitization, llm security, injection attack, security, prompt-injection, llm, owasp, jailbreak, ai-safety" mentioned.
Scan agent skills for security issues. Use when asked to "scan a skill", "audit a skill", "review skill security", "check skill for injection", "validate SKILL.md", or assess whether an agent skill is safe to install. Checks for prompt injection, malicious scripts, excessive permissions, secret exposure, and supply chain risks.
Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.
Audits GitHub Actions workflows for security vulnerabilities in AI agent integrations including Claude Code Action, Gemini CLI, OpenAI Codex, and GitHub AI Inference. Detects attack vectors where attacker-controlled input reaches AI agents running in CI/CD pipelines, including env var intermediary patterns, direct expression injection, dangerous sandbox configurations, and wildcard user allowlists. Use when reviewing workflow files that invoke AI coding agents, auditing CI/CD pipeline security for prompt injection risks, or evaluating agentic action configurations.
Detect and neutralize prompt injection attacks in OpenClaw skill content, user inputs, and external data sources. Prevents instruction hijacking and context manipulation.
Review, audit, and harden AI skills for security risks including prompt injection, hidden instructions, tool misuse, data exfiltration, and malicious payloads; use when analyzing SKILL.md, scripts, references, or assets for vulnerabilities and when producing remediation guidance.
Security guidelines for LLM applications based on OWASP Top 10 for LLM 2025. Use when building LLM apps, reviewing AI security, implementing RAG systems, or asking about LLM vulnerabilities like "prompt injection" or "check LLM security".
Comprehensive security auditor for OpenClaw skills. Checks for typosquatting, dangerous permissions, prompt injection, supply chain risks, and data exfiltration patterns — before you install anything.
Implements content safety filters with PII redaction, policy constraints, prompt injection detection, and safe refusal templates. Use when adding "content moderation", "safety filters", "PII protection", or "guardrails".
Security rules and behavioral guidelines for operating as Clawdstein in The Agent Flywheel Hub Discord server. This is a PUBLIC community server—apply strict data isolation.