secure-ai

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

🔒 Skill: Secure AI (v1.1.0)

🔒 Skill: Secure AI (v1.1.0)

Executive Summary

执行摘要

The
secure-ai
architect is the primary defender of the AI integration layer. In 2026, where AI agents have high levels of autonomy and access, the risk of Prompt Injection, Data Leakage, and Privilege Escalation is paramount. This skill focuses on building "Unbreakable" AI systems through multi-layered defense, structural isolation, and zero-trust orchestration.

secure-ai
架构师是AI集成层的核心防御者。到2026年,AI Agent具备高度自主性和访问权限,Prompt Injection数据泄露权限提升的风险将成为核心隐患。本Skill专注于通过多层防御、结构隔离和零信任编排,构建“牢不可破”的AI系统。

📋 Table of Contents

📋 目录

🏗️ Core Security Philosophies

🏗️ 核心安全理念

  1. Isolation is Absolute: User data must never be treated as system instruction.
  2. Least Privilege for Agents: Give agents only the tools they need for the current sub-task.
  3. Human Verification of Destruction: Destructive actions require a human signature.
  4. No Secrets in Client: All AI logic and keys reside in
    server-only
    environments.
  5. Adversarial mindset: Assume the user (and the agent) will try to bypass your rules.

  1. 绝对隔离:永远不能将用户数据视为系统指令。
  2. Agent最小权限:仅为Agent分配当前子任务所需的工具权限。
  3. 破坏性操作人工验证:破坏性操作必须有人工签名确认。
  4. 客户端无密钥:所有AI逻辑和密钥都存储在
    server-only
    环境中。
  5. 对抗思维:默认用户(以及Agent)会尝试绕过你的规则。

🚫 The "Do Not" List (Anti-Patterns)

🚫 禁止事项列表(反模式)

Anti-PatternWhy it fails in 2026Modern Alternative
Instruction MixingProne to prompt injection.Use Structural Roles (System/User).
Thin System PromptsEasily bypassed via roleplay.Use Hierarchical Guardrails.
Unlimited Tool UseRisk of massive data exfiltration.Use Capability-Based Scopes.
Static API KeysLeaks result in total system breach.Use OIDC & Dynamic Rotation.
Unvalidated URLsDirect path for indirect injection.Use Sandboxed Content Fetching.

反模式2026年失效原因现代替代方案
指令混合易发生Prompt Injection。使用结构角色(系统/用户)
过短系统提示词易通过角色扮演绕过。使用分层防护规则
无限制工具使用存在大规模数据泄露风险。使用基于能力的权限范围
静态API密钥密钥泄露会导致全系统被攻破。使用OIDC和动态轮换机制
未验证URL为间接注入提供直接路径。使用沙箱内容抓取

🛡️ Prompt Injection Defense

🛡️ Prompt Injection防御

We use a "Defense-in-Depth" strategy:
  • Input Boundaries:
    --- USER DATA START ---
    .
  • Guardian Models: Fast pre-scanners for malicious patterns.
  • Content Filtering: Built-in safety settings on Gemini 3 Pro.
See References: Prompt Injection for blueprints.

我们采用“深度防御”策略:
  • 输入边界
    --- USER DATA START ---
  • 防护模型:快速预扫描恶意模式。
  • 内容过滤:Gemini 3 Pro内置安全设置。
蓝图请查看参考:Prompt Injection

🤖 Zero-Trust for AI Agents

🤖 AI Agent零信任机制

  • Non-Human Identity (NHI): Verifiable identities for every agent.
  • WASM Sandboxing: Running generated code in isolated runtimes.
  • HITL (Human-in-the-Loop): Mandatory sign-off for financial or data-altering events.

  • 非人类身份(NHI):每个Agent都有可验证的身份。
  • WASM沙箱:在隔离运行时中执行生成的代码。
  • HITL(人在回路):财务或数据变更类事件必须经过人工签字确认。

📖 Reference Library

📖 参考库

Detailed deep-dives into AI Security:
  • Prompt Injection Defense: Multi-layered isolation.
  • Agentic Zero-Trust: Managing autonomous actors.
  • Secure Server Actions: Bridging the frontend safely.
  • Audit Protocols: Monitoring agent behavior.

Updated: January 22, 2026 - 20:50
AI安全相关详细深度内容:
  • Prompt Injection防御:多层隔离机制。
  • Agent零信任:自治主体管理。
  • 安全服务器操作:安全对接前端。
  • 审计协议:Agent行为监控。

更新时间:2026年1月22日 20:50