owasp-llm-top10
OWASP Top 10 for LLM Applications Security Audit
This skill enables AI agents to perform a comprehensive security assessment of Large Language Model (LLM) and Generative AI applications using the OWASP Top 10 for LLM Applications 2025, published by the OWASP GenAI Security Project.
The OWASP Top 10 for LLM Applications identifies the most critical security risks in systems that integrate large language models, covering vulnerabilities from prompt injection to unbounded resource consumption. This is the authoritative industry standard for LLM application security.
Use this skill to identify security vulnerabilities, assess risk exposure, prioritize remediation, and establish secure development practices for AI-powered applications.
Combine with "NIST AI RMF" for comprehensive risk management or "ISO 42001 AI Governance" for governance compliance.
When to Use This Skill
Invoke this skill when:
- Auditing security of LLM-powered applications before deployment
- Reviewing GenAI integrations for security vulnerabilities
- Assessing RAG (Retrieval-Augmented Generation) systems
- Evaluating chatbot or AI assistant security
- Conducting penetration testing of AI features
- Building secure AI application architectures
- Reviewing third-party AI API integrations
- Preparing for security compliance reviews
- Responding to AI-related security incidents
Inputs Required
When executing this audit, gather:
- application_description: Description of the AI application (purpose, LLM used, architecture, features, user base) [REQUIRED]
- architecture_details: System architecture (APIs, databases, vector stores, plugins, integrations) [OPTIONAL but recommended]
- llm_provider: LLM provider and model (OpenAI GPT-4, Anthropic Claude, self-hosted, etc.) [OPTIONAL]
- deployment_context: Deployment environment (cloud, on-premise, hybrid, edge) [OPTIONAL]
- data_sensitivity: Types of data processed (PII, financial, health, proprietary) [OPTIONAL]
- existing_controls: Current security measures (auth, rate limiting, content filtering) [OPTIONAL]
- specific_concerns: Known vulnerabilities or areas of focus [OPTIONAL]
The OWASP Top 10 for LLM Applications (2025)
LLM01: Prompt Injection
Severity: Critical
Description: Attackers manipulate LLM operations through crafted inputs, either directly or indirectly, to bypass intended functionality, access unauthorized data, or trigger unintended actions.
Attack Vectors:
- Direct injection: Malicious user prompts containing override commands
- Indirect injection: Hidden instructions in external content (web pages, documents, emails) processed by the LLM
- Jailbreaks: Techniques to bypass safety constraints and content policies
Impact:
- Unauthorized data access and exfiltration
- Bypass of content safety filters
- Manipulation of downstream system actions
- Social engineering of users through manipulated outputs
Assessment Checklist:
- Input sanitization and validation implemented
- System prompts separated from user inputs with clear delimiters
- Least privilege applied to LLM backend access
- Output validation before downstream actions
- Human-in-the-loop for critical operations
- Adversarial testing conducted with known injection techniques
- Content filtering layers applied pre- and post-LLM
Mitigation Strategies:
- Enforce privilege controls on LLM backend access
- Segregate external content from user prompts
- Maintain human oversight for critical functions
- Implement input/output validation pipelines
- Conduct regular adversarial testing
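The input-segregation and validation controls above can be sketched as a thin pre-LLM screening layer. This is a minimal sketch under stated assumptions: the regex list and function names are illustrative, and pattern matching alone is never a complete defense against injection.

```python
import re

# Illustrative heuristics only; real deployments layer classifier-based
# detectors and output validation on top of simple pattern checks.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]

def screen_user_input(user_input: str) -> bool:
    """Return True if the input passes the injection heuristics."""
    lowered = user_input.lower()
    return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def build_messages(system_prompt: str, user_input: str) -> list[dict]:
    """Keep the system prompt in its own role so user text is never
    concatenated into it (clear delimiter between instruction and data)."""
    if not screen_user_input(user_input):
        raise ValueError("Input rejected by injection screen")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]
```

Keeping system and user content in separate message roles, rather than one concatenated string, is the structural half of the defense; the screen is only a best-effort filter.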
LLM02: Sensitive Information Disclosure
Severity: Critical
Description: LLMs inadvertently expose confidential data including PII, proprietary algorithms, credentials, intellectual property, or internal system information through their outputs.
Attack Vectors:
- Crafted prompts designed to extract training data
- Legitimate queries that trigger memorized sensitive content
- Model outputs revealing internal system architecture
- Embedding leakage from vector databases
Impact:
- Privacy violations and regulatory non-compliance (GDPR, CCPA)
- Intellectual property theft
- Credential exposure enabling further attacks
- Reputational damage
Assessment Checklist:
- PII and sensitive data removed from training/fine-tuning data
- Data masking and tokenization in logs and outputs
- System instructions forbidding sensitive disclosures
- Output filtering for known sensitive patterns (SSN, credit cards, API keys)
- Model access restricted to necessary information via middleware
- User education against pasting confidential content
- Output monitoring for anomalous data exposure
Mitigation Strategies:
- Sanitize training data to remove sensitive information
- Implement data loss prevention (DLP) on outputs
- Apply access controls limiting model's data reach
- Monitor outputs for sensitive data patterns
- Use differential privacy techniques in training
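A minimal output-side DLP filter for the patterns named in the checklist might look like the sketch below. The regexes are simplified illustrations, not production-grade detectors.

```python
import re

# Hypothetical pattern set for demonstration; real DLP uses vetted
# detectors (e.g., provider-specific API key formats, Luhn checks).
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9]{16,}\b"),
}

def redact_output(text: str) -> str:
    """Mask sensitive matches before model output reaches the user or logs."""
    for label, pattern in SENSITIVE_PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text
```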
LLM03: Supply Chain Vulnerabilities
Severity: High
Description: Compromised third-party components (models, datasets, libraries, plugins) introduce security risks including malware, backdoors, or biased behavior.
Attack Vectors:
- Malicious pre-trained models from public repositories
- Poisoned datasets with embedded triggers
- Vulnerable ML libraries and dependencies
- Compromised plugins with unauthorized access
- Trojanized fine-tuning adapters
Impact:
- System compromise and data theft
- Backdoor access to production systems
- Model corruption affecting all users
- Legal liability from unlicensed content
Assessment Checklist:
- Models sourced from verified, reputable providers
- Digital signatures and checksums verified
- Model files scanned for suspicious code (picklescan, etc.)
- Third-party models deployed in sandboxed environments
- Dependencies regularly updated and audited
- Plugin permissions restricted with allowlists
- Complete inventory of all models and components maintained
- SBOM (Software Bill of Materials) maintained for AI components
Mitigation Strategies:
- Source models only from trusted, verified providers
- Scan model files for malicious code before deployment
- Sandbox third-party models with restricted permissions
- Maintain updated dependency inventory
- Implement model signing and integrity verification
LLM04: Data and Model Poisoning
Severity: High
Description: Attackers manipulate training or fine-tuning data to introduce vulnerabilities, backdoors, or biases that compromise model security and reliability.
Attack Vectors:
- Crafted training examples with hidden trigger phrases
- Poisoned web-scraped content absorbed during training
- Direct tampering with model weights or parameters
- Malicious fine-tuning data
- Subtle label manipulation or data anomalies
Impact:
- Biased or degraded model outputs
- Trigger-activated backdoors in production
- Erosion of model trustworthiness
- Long-term hidden threats difficult to detect
Assessment Checklist:
- Training data validated, cleaned, and audited
- Data provenance tracked and documented
- Rate limiting and moderation for crowdsourced data
- Differential privacy techniques applied
- Models tested with known trigger phrases before deployment
- Deployed models monitored for behavioral drift
- Model file checksums verified against known-good states
Mitigation Strategies:
- Validate and clean all training data sources
- Implement data provenance tracking
- Apply differential privacy to limit individual data influence
- Test with adversarial inputs before deployment
- Monitor production models for unexpected behavior
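Pre-deployment trigger testing can be sketched as a small probe harness. The `generate` callable, trigger phrases, and marker strings here are auditor-supplied assumptions, not a standard interface:

```python
def scan_for_triggers(generate, trigger_phrases, banned_markers):
    """Probe a generation function with suspected trigger phrases and
    flag any response containing markers of backdoor activation.

    `generate` is any callable mapping a prompt string to output text.
    """
    findings = []
    for phrase in trigger_phrases:
        output = generate(phrase)
        hits = [m for m in banned_markers if m.lower() in output.lower()]
        if hits:
            findings.append({"trigger": phrase, "markers": hits})
    return findings
```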
LLM05: Improper Output Handling
Severity: High
Description: Applications blindly execute or render LLM outputs without validation, enabling code injection, XSS, SQL injection, SSRF, and other attacks.
Attack Vectors:
- Unescaped HTML/JavaScript in outputs (XSS)
- Model-generated shell commands executed without sanitization
- SQL queries constructed from model output
- Unsanitized API calls based on AI suggestions
- Direct execution via eval() or exec()
Impact:
- Remote code execution
- Session hijacking
- Database manipulation
- Privilege escalation
- Full system compromise
Assessment Checklist:
- All LLM output treated as untrusted input
- Strict output schema validation enforced (JSON, formats)
- Output sanitized and escaped based on context (HTML, SQL, shell)
- Parameterized queries used instead of raw SQL
- Allowlists for acceptable output patterns
- Generated code executed in sandboxed environments
- Human approval required for high-impact actions
- Rendering libraries with built-in escaping used
Mitigation Strategies:
- Never trust LLM output; validate and sanitize everything
- Enforce strict output schemas
- Use parameterized queries and safe ORM methods
- Sandbox all code execution
- Require human approval for privileged operations
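Two of the mitigations above, context-aware escaping and parameterized queries, can be sketched in a few lines (the table and column names are illustrative):

```python
import html
import sqlite3

def render_safely(model_output: str) -> str:
    """Escape model output before inserting it into an HTML page (prevents XSS)."""
    return html.escape(model_output)

def lookup_user(conn: sqlite3.Connection, model_suggested_name: str):
    """Use a parameterized query so model output can never alter SQL structure."""
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = ?",  # placeholder, not string concatenation
        (model_suggested_name,),
    )
    return cur.fetchall()
```

With the `?` placeholder, an injection payload in the model's output is bound as a literal string value and simply matches nothing.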
LLM06: Excessive Agency
Severity: High
Description: AI agents possess excessive permissions and autonomous capabilities, enabling significant harm through compromised prompts, hallucinations, or malicious manipulation.
Attack Vectors:
- Prompt injection exploiting overly permissioned agents
- Hallucinations triggering unintended high-impact actions
- Confused deputy attacks using AI's elevated privileges
- Malicious plugins with excessive access
- Unrestricted system control (email, API, database)
Impact:
- Unauthorized data transmission
- Destructive actions (deletion, modification)
- Financial loss through unauthorized transactions
- Service disruptions
- Automated attack amplification
Assessment Checklist:
- Principle of least privilege applied to all AI capabilities
- Granular permissions with limited-scope OAuth tokens
- Functionality compartmentalized across narrow-scope agents
- High-risk actions restricted (deletion, transfers, device control)
- Explicit user approval for significant operations
- Rate limiting on AI actions and API calls
- Comprehensive audit logs of all agent activities
- Monitoring with alerts for anomalous behavior
Mitigation Strategies:
- Grant only essential capabilities (least privilege)
- Compartmentalize agent functionality
- Require human approval for high-impact operations
- Implement comprehensive audit logging
- Set up real-time monitoring and anomaly detection
LLM07: System Prompt Leakage
Severity: Medium
Description: System instructions intended to guide AI behavior are exposed to users or attackers, revealing internal logic, security controls, or sensitive configurations.
Attack Vectors:
- Prompt injection requesting instruction disclosure
- Sophisticated probing asking to repeat conversation context
- Tokenization quirks causing unintended disclosure
- Reverse-engineering through behavioral observation
- Model unintentionally echoing system prompts
Impact:
- Security logic exposure enabling bypass attacks
- Credential compromise if secrets embedded in prompts
- Internal system knowledge revelation
- Facilitation of more targeted attacks
Assessment Checklist:
- No passwords, API keys, or secrets in system prompts
- Prompts treated as public information
- Models configured to refuse revealing system messages
- Clear message role delimiters (system/user/assistant)
- Security policies enforced at application level, not prompt level
- Output monitoring for prompt leakage patterns
- Regular testing with known extraction techniques
Mitigation Strategies:
- Never embed sensitive data in system prompts
- Implement application-level security enforcement
- Configure models to refuse prompt disclosure
- Monitor outputs for leakage patterns
- Use structured message formats with role delimiters
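Output monitoring for leakage can be approximated by checking whether a response reproduces a long contiguous slice of the system prompt. A rough sketch; the 0.6 threshold is an assumption to tune per application:

```python
import difflib

def leaks_system_prompt(output: str, system_prompt: str, threshold: float = 0.6) -> bool:
    """Flag an output that reproduces a large contiguous chunk of the
    system prompt (longest common substring relative to prompt length)."""
    if not system_prompt:
        return False
    a, b = output.lower(), system_prompt.lower()
    match = difflib.SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return match.size / len(b) >= threshold
```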
LLM08: Vector and Embedding Weaknesses
Severity: Medium
Description: Vulnerabilities in vector databases and embedding-based retrieval systems (RAG) allow poisoning, injection, or unauthorized access to stored data.
Attack Vectors:
- Poisoned embeddings retrieved during RAG operations
- Direct injection of malicious vectors into stores
- Retrieval of sensitive data from improperly secured databases
- Metadata-based attacks exploiting insufficient filtering
- Similarity-based retrieval returning harmful content
Impact:
- Output manipulation through poisoned context
- Sensitive data leakage from vector stores
- Misinformation injection
- Compromised RAG system integrity
Assessment Checklist:
- Data validated and sanitized before vectorization
- Access controls on vector store insertion and modification
- Metadata filtering restricts retrieval to appropriate categories
- Monitoring for suspicious bulk insertions
- Similarity thresholds ensuring relevant retrieval
- Sensitive and public vector stores separated
- Embedding source provenance tracked
- Anomaly detection for unusual retrieval patterns
Mitigation Strategies:
- Validate data before storing in vector databases
- Implement strict access controls on vector operations
- Use metadata filtering and similarity thresholds
- Separate sensitive and public data stores
- Monitor for anomalous patterns
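Similarity thresholds and metadata filtering can be combined in a post-retrieval filter; the result schema, score range, and category names below are assumptions about your vector store:

```python
def filter_retrieval(results, min_score: float = 0.75, allowed_categories=("public",)):
    """Drop retrieved chunks that fall below the similarity threshold
    or whose metadata category is outside the allowed set."""
    return [
        r for r in results
        if r["score"] >= min_score and r.get("category") in allowed_categories
    ]
```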
LLM09: Misinformation
Severity: Medium
Description: LLMs generate plausible but false information (hallucinations/confabulations) that users may trust and act upon, causing harm.
Attack Vectors:
- Fabricated facts presented authoritatively
- Fake citations or references that don't exist
- Invented case law, medical advice, or technical solutions
- Adversarial prompts designed to trigger hallucinations
- Confident incorrect reasoning
Impact:
- Harmful decisions based on false information
- Legal liability from incorrect advice
- Erosion of trust in AI systems
- Regulatory violations in compliance contexts
- Reputational damage
Assessment Checklist:
- Confidence scores or uncertainty indicators provided
- Fact-checking against reliable databases implemented
- Citations with verifiable sources required for sensitive domains
- RAG grounding responses in validated data
- System instructions encourage admitting uncertainty
- Human review for critical outputs
- Model limitations clearly communicated to users
Mitigation Strategies:
- Implement retrieval-augmented generation (RAG) for grounding
- Provide confidence indicators to users
- Require verifiable citations for critical domains
- Add human review for high-stakes outputs
- Clearly communicate model limitations
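Requiring verifiable citations can be enforced mechanically by validating cited source IDs against the grounded corpus. A sketch, assuming a hypothetical `[S:id]` citation syntax:

```python
import re

def has_verifiable_citation(answer: str, known_sources: set[str]) -> bool:
    """Accept an answer for a sensitive domain only if it cites at least
    one source and every cited ID exists in the validated corpus.
    The [S:id] syntax is an illustrative convention, not a standard."""
    cited = set(re.findall(r"\[S:([\w-]+)\]", answer))
    return bool(cited) and cited <= known_sources
```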
LLM10: Unbounded Consumption
Severity: Medium
Description: Uncontrolled LLM usage causes denial-of-service, system crashes, or excessive operational costs through resource exhaustion.
Attack Vectors:
- Flood of queries overwhelming API endpoints
- Extremely long or recursive prompts consuming resources
- Infinite loops through recursive prompt injection
- Distributed attacks with massive input volumes
- Cascading failures through connected systems
Impact:
- Service unavailability for legitimate users
- Financial loss from excessive token usage
- System crashes and performance degradation
- Infrastructure damage
Assessment Checklist:
- Rate limiting per user, IP, and API key
- Maximum token limits for requests and daily usage
- Resource consumption monitoring with alerting
- Request timeouts preventing hung operations
- Per-user quotas with cost implications
- Cost monitoring with automated budget alerts
- Load balancing across infrastructure
- Auto-scaling with cost-aware limits
Mitigation Strategies:
- Implement rate limiting at multiple levels
- Set token and cost limits per user/session
- Monitor resource consumption with alerts
- Use request timeouts and queue management
- Design auto-scaling with cost guardrails
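Per-user rate limiting is often implemented as a token bucket. A minimal sketch, with illustrative capacity and refill rate:

```python
import time

class TokenBucket:
    """Simple per-user rate limiter: each request spends tokens,
    which refill continuously up to a fixed capacity."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self, cost: int = 1) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In an LLM gateway, `cost` can be the request's token count, so the same bucket caps both request rate and token spend.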
Audit Procedure
Step 1: Application Understanding (15 minutes)
- System inventory:
  - Document LLM provider, model version, and configuration
  - Map application architecture (APIs, databases, vector stores)
  - Identify all data flows to and from the LLM
  - List plugins, tools, and integrations
- Threat modeling:
  - Identify attack surfaces (user inputs, APIs, data sources)
  - Determine data sensitivity classification
  - Map trust boundaries between components
  - Identify privileged operations the LLM can trigger
Step 2: Vulnerability Assessment (40-60 minutes)
For each of the 10 vulnerabilities, assess:
Prompt Injection (LLM01) - 10 min
- Test with known injection techniques (direct and indirect)
- Attempt to override system instructions
- Test with malicious content in external data sources
- Verify input validation and sanitization
Sensitive Information Disclosure (LLM02) - 5 min
- Attempt to extract training data
- Test for PII leakage in outputs
- Check for credential or API key exposure
- Verify output filtering effectiveness
Supply Chain (LLM03) - 5 min
- Review model provenance and source
- Check dependency versions and vulnerability status
- Verify plugin and integration security
- Review SBOM completeness
Data/Model Poisoning (LLM04) - 5 min
- Review data pipeline security
- Check fine-tuning data validation
- Verify model integrity monitoring
- Test with known trigger patterns
Improper Output Handling (LLM05) - 10 min
- Test for XSS through LLM outputs
- Attempt SQL injection via model responses
- Check command injection possibilities
- Verify output sanitization and encoding
Excessive Agency (LLM06) - 5 min
- Review agent permissions and capabilities
- Test privilege boundaries
- Verify human-in-the-loop for critical actions
- Check audit logging completeness
System Prompt Leakage (LLM07) - 5 min
- Attempt to extract system prompts
- Check for secrets in prompts
- Verify prompt protection mechanisms
Vector/Embedding Weaknesses (LLM08) - 5 min
- Review vector store access controls
- Test RAG retrieval for injection
- Verify data separation and filtering
Misinformation (LLM09) - 5 min
- Test for hallucination in critical domains
- Verify grounding and citation mechanisms
- Check confidence indicators
Unbounded Consumption (LLM10) - 5 min
- Test rate limiting effectiveness
- Verify token and cost limits
- Check timeout configurations
Step 3: Risk Scoring (15 minutes)
For each vulnerability found, score using:
Likelihood: How likely is exploitation?
- High: Known attack vectors, easy to exploit, publicly accessible
- Medium: Requires some skill or specific conditions
- Low: Difficult to exploit, limited attack surface
Impact: What is the potential damage?
- Critical: System compromise, major data breach, significant financial loss
- High: Unauthorized access, data exposure, service disruption
- Medium: Limited data exposure, partial service impact
- Low: Minor information disclosure, minimal impact
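The likelihood/impact rubric above can be turned into a deterministic priority lookup for consistent scoring across findings. The specific mapping below is an illustrative assumption, not part of the OWASP standard:

```python
# Illustrative likelihood x impact -> priority mapping for audit findings.
RISK_MATRIX = {
    ("High", "Critical"): "P0", ("High", "High"): "P1",
    ("High", "Medium"): "P2", ("High", "Low"): "P3",
    ("Medium", "Critical"): "P1", ("Medium", "High"): "P2",
    ("Medium", "Medium"): "P3", ("Medium", "Low"): "P4",
    ("Low", "Critical"): "P2", ("Low", "High"): "P3",
    ("Low", "Medium"): "P4", ("Low", "Low"): "P4",
}

def score_risk(likelihood: str, impact: str) -> str:
    """Map a likelihood/impact pair to a remediation priority."""
    return RISK_MATRIX[(likelihood, impact)]
```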
Step 4: Report Generation (20 minutes)
Compile comprehensive security assessment.
Output Format
Generate a comprehensive OWASP LLM security audit report:
OWASP LLM Top 10 Security Audit Report
Application: [Name]
LLM Provider/Model: [Provider - Model]
Date: [Date]
Evaluator: [AI Agent or Human]
OWASP LLM Top 10 Version: 2025
Executive Summary
Overall Security Posture: [Critical / High Risk / Medium Risk / Low Risk / Secure]
Application Type: [Chatbot / Agent / RAG System / Content Generator / Code Assistant / Other]
Data Sensitivity: [Public / Internal / Confidential / Restricted]
User Base: [Internal / B2B / B2C / Public]
Critical Findings
| # | Vulnerability | Severity | Status |
|---|---|---|---|
| LLM01 | Prompt Injection | Critical | [Vulnerable / Mitigated / N/A] |
| LLM02 | Sensitive Info Disclosure | Critical | [Vulnerable / Mitigated / N/A] |
| LLM03 | Supply Chain | High | [Vulnerable / Mitigated / N/A] |
| LLM04 | Data/Model Poisoning | High | [Vulnerable / Mitigated / N/A] |
| LLM05 | Improper Output Handling | High | [Vulnerable / Mitigated / N/A] |
| LLM06 | Excessive Agency | High | [Vulnerable / Mitigated / N/A] |
| LLM07 | System Prompt Leakage | Medium | [Vulnerable / Mitigated / N/A] |
| LLM08 | Vector/Embedding Weaknesses | Medium | [Vulnerable / Mitigated / N/A] |
| LLM09 | Misinformation | Medium | [Vulnerable / Mitigated / N/A] |
| LLM10 | Unbounded Consumption | Medium | [Vulnerable / Mitigated / N/A] |
Top 3 Critical Issues
- [Issue] - [Impact description]
- [Issue] - [Impact description]
- [Issue] - [Impact description]
Detailed Findings
LLM01: Prompt Injection
Status: [Vulnerable / Partially Mitigated / Mitigated]
Severity: [Critical / High / Medium / Low]
Likelihood: [High / Medium / Low]
Findings:
- [Finding with evidence]
- [Finding with evidence]
Attack Scenario:
[Description of how this could be exploited]
Recommendations:
- [Specific remediation step]
- [Specific remediation step]
Effort: [Low / Medium / High]
[Continue for LLM02 through LLM10...]
Architecture Security Review
Data Flow Analysis
[Diagram or description of data flows with trust boundaries marked]
Attack Surface Summary
| Surface | Risk Level | Controls |
|---|---|---|
| User Input | [Level] | [Controls] |
| API Endpoints | [Level] | [Controls] |
| Vector Store | [Level] | [Controls] |
| Plugins/Tools | [Level] | [Controls] |
| Output Rendering | [Level] | [Controls] |
Remediation Roadmap
Phase 1: Critical (0-7 days)
- [Action item with owner]
- [Action item with owner]
Phase 2: High Priority (7-30 days)
- [Action item with owner]
Phase 3: Medium Priority (30-90 days)
- [Action item with owner]
Phase 4: Hardening (Ongoing)
- [Continuous improvement practices]
Security Controls Matrix
| Control | Implemented | Effective | Recommendation |
|---|---|---|---|
| Input validation | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Output sanitization | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Rate limiting | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Authentication | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Authorization | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Logging/Monitoring | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Content filtering | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Human-in-the-loop | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
Next Steps
- Prioritize and assign critical findings
- Implement quick wins (input validation, rate limiting)
- Schedule penetration testing for high-risk areas
- Establish continuous monitoring
- Plan follow-up audit after remediation
Resources
- OWASP Top 10 for LLM Applications 2025
- OWASP GenAI Security Project
- OWASP LLM AI Security & Governance Checklist
- OWASP GitHub Repository
Audit Version: 1.0
Date: [Date]
---

Quick Reference: Vulnerability Priority
| Priority | Vulnerabilities | Rationale |
|---|---|---|
| P0 | LLM01 (Prompt Injection), LLM02 (Data Disclosure) | Direct exploitation, high impact |
| P1 | LLM05 (Output Handling), LLM06 (Excessive Agency) | System compromise potential |
| P2 | LLM03 (Supply Chain), LLM04 (Poisoning) | Harder to exploit but severe impact |
| P3 | LLM07 (Prompt Leakage), LLM08 (Vector Weaknesses) | Enables further attacks |
| P4 | LLM09 (Misinformation), LLM10 (Unbounded Consumption) | Operational risk |
Best Practices
- Defense in depth: Never rely on a single security control
- Zero trust for LLM output: Treat all model output as untrusted
- Least privilege: Minimize AI agent permissions and capabilities
- Monitor continuously: Log and alert on anomalous AI behavior
- Test adversarially: Regular red-team exercises against AI features
- Secure the pipeline: Protect training data, models, and embeddings
- Human oversight: Maintain human-in-the-loop for critical operations
- Update regularly: Stay current with evolving attack techniques
- Educate users: Train users on safe AI interaction practices
- Plan for incidents: Have AI-specific incident response procedures
Version
1.0 - Initial release (OWASP Top 10 for LLM Applications 2025)
Remember: LLM security is an evolving field. New attack vectors emerge regularly. This audit provides a baseline assessment; continuous monitoring and periodic re-assessment are essential for maintaining security posture.