owasp-llm-top10
OWASP Top 10 for LLM Applications Security Audit
This skill enables AI agents to perform a comprehensive security assessment of Large Language Model (LLM) and Generative AI applications using the OWASP Top 10 for LLM Applications 2025, published by the OWASP GenAI Security Project.
The OWASP Top 10 for LLM Applications identifies the most critical security risks in systems that integrate large language models, covering vulnerabilities from prompt injection to unbounded resource consumption. This is the authoritative industry standard for LLM application security.
Use this skill to identify security vulnerabilities, assess risk exposure, prioritize remediation, and establish secure development practices for AI-powered applications.
Combine with "NIST AI RMF" for comprehensive risk management or "ISO 42001 AI Governance" for governance compliance.
When to Use This Skill
Invoke this skill when:
- Auditing security of LLM-powered applications before deployment
- Reviewing GenAI integrations for security vulnerabilities
- Assessing RAG (Retrieval-Augmented Generation) systems
- Evaluating chatbot or AI assistant security
- Conducting penetration testing of AI features
- Building secure AI application architectures
- Reviewing third-party AI API integrations
- Preparing for security compliance reviews
- Responding to AI-related security incidents
Inputs Required
When executing this audit, gather:
- application_description: Description of the AI application (purpose, LLM used, architecture, features, user base) [REQUIRED]
- architecture_details: System architecture (APIs, databases, vector stores, plugins, integrations) [OPTIONAL but recommended]
- llm_provider: LLM provider and model (OpenAI GPT-4, Anthropic Claude, self-hosted, etc.) [OPTIONAL]
- deployment_context: Deployment environment (cloud, on-premise, hybrid, edge) [OPTIONAL]
- data_sensitivity: Types of data processed (PII, financial, health, proprietary) [OPTIONAL]
- existing_controls: Current security measures (auth, rate limiting, content filtering) [OPTIONAL]
- specific_concerns: Known vulnerabilities or areas of focus [OPTIONAL]
The OWASP Top 10 for LLM Applications (2025)
LLM01: Prompt Injection
Severity: Critical
Description: Attackers manipulate LLM operations through crafted inputs, either directly or indirectly, to bypass intended functionality, access unauthorized data, or trigger unintended actions.
Attack Vectors:
- Direct injection: Malicious user prompts containing override commands
- Indirect injection: Hidden instructions in external content (web pages, documents, emails) processed by the LLM
- Jailbreaks: Techniques to bypass safety constraints and content policies
Impact:
- Unauthorized data access and exfiltration
- Bypass of content safety filters
- Manipulation of downstream system actions
- Social engineering of users through manipulated outputs
Assessment Checklist:
- Input sanitization and validation implemented
- System prompts separated from user inputs with clear delimiters
- Least privilege applied to LLM backend access
- Output validation before downstream actions
- Human-in-the-loop for critical operations
- Adversarial testing conducted with known injection techniques
- Content filtering layers applied pre- and post-LLM
Mitigation Strategies:
- Enforce privilege controls on LLM backend access
- Segregate external content from user prompts
- Maintain human oversight for critical functions
- Implement input/output validation pipelines
- Conduct regular adversarial testing
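The input-segregation and validation controls above can be sketched as a thin pre-LLM screening layer. This is a minimal sketch under stated assumptions: the regex list and function names are illustrative, and pattern matching alone is never a complete defense against injection.

```python
import re

# Illustrative heuristics only; real deployments layer classifier-based
# detectors and output validation on top of simple pattern checks.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]

def screen_user_input(user_input: str) -> bool:
    """Return True if the input passes the injection heuristics."""
    lowered = user_input.lower()
    return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def build_messages(system_prompt: str, user_input: str) -> list[dict]:
    """Keep the system prompt in its own role so user text is never
    concatenated into it (clear delimiter between instruction and data)."""
    if not screen_user_input(user_input):
        raise ValueError("Input rejected by injection screen")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]
```

Keeping system and user content in separate message roles, rather than one concatenated string, is the structural half of the defense; the screen is only a best-effort filter.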
LLM02: Sensitive Information Disclosure
Severity: Critical
Description: LLMs inadvertently expose confidential data including PII, proprietary algorithms, credentials, intellectual property, or internal system information through their outputs.
Attack Vectors:
- Crafted prompts designed to extract training data
- Legitimate queries that trigger memorized sensitive content
- Model outputs revealing internal system architecture
- Embedding leakage from vector databases
Impact:
- Privacy violations and regulatory non-compliance (GDPR, CCPA)
- Intellectual property theft
- Credential exposure enabling further attacks
- Reputational damage
Assessment Checklist:
- PII and sensitive data removed from training/fine-tuning data
- Data masking and tokenization in logs and outputs
- System instructions forbidding sensitive disclosures
- Output filtering for known sensitive patterns (SSN, credit cards, API keys)
- Model access restricted to necessary information via middleware
- User education against pasting confidential content
- Output monitoring for anomalous data exposure
Mitigation Strategies:
- Sanitize training data to remove sensitive information
- Implement data loss prevention (DLP) on outputs
- Apply access controls limiting model's data reach
- Monitor outputs for sensitive data patterns
- Use differential privacy techniques in training
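A minimal output-side DLP filter for the patterns named in the checklist might look like the sketch below. The regexes are simplified illustrations, not production-grade detectors.

```python
import re

# Hypothetical pattern set for demonstration; real DLP uses vetted
# detectors (e.g., provider-specific API key formats, Luhn checks).
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9]{16,}\b"),
}

def redact_output(text: str) -> str:
    """Mask sensitive matches before model output reaches the user or logs."""
    for label, pattern in SENSITIVE_PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text
```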
LLM03: Supply Chain Vulnerabilities
Severity: High
Description: Compromised third-party components (models, datasets, libraries, plugins) introduce security risks including malware, backdoors, or biased behavior.
Attack Vectors:
- Malicious pre-trained models from public repositories
- Poisoned datasets with embedded triggers
- Vulnerable ML libraries and dependencies
- Compromised plugins with unauthorized access
- Trojanized fine-tuning adapters
Impact:
- System compromise and data theft
- Backdoor access to production systems
- Model corruption affecting all users
- Legal liability from unlicensed content
Assessment Checklist:
- Models sourced from verified, reputable providers
- Digital signatures and checksums verified
- Model files scanned for suspicious code (picklescan, etc.)
- Third-party models deployed in sandboxed environments
- Dependencies regularly updated and audited
- Plugin permissions restricted with allowlists
- Complete inventory of all models and components maintained
- SBOM (Software Bill of Materials) maintained for AI components
Mitigation Strategies:
- Source models only from trusted, verified providers
- Scan model files for malicious code before deployment
- Sandbox third-party models with restricted permissions
- Maintain updated dependency inventory
- Implement model signing and integrity verification
LLM04: Data and Model Poisoning
Severity: High
Description: Attackers manipulate training or fine-tuning data to introduce vulnerabilities, backdoors, or biases that compromise model security and reliability.
Attack Vectors:
- Crafted training examples with hidden trigger phrases
- Poisoned web-scraped content absorbed during training
- Direct tampering with model weights or parameters
- Malicious fine-tuning data
- Subtle label manipulation or data anomalies
Impact:
- Biased or degraded model outputs
- Trigger-activated backdoors in production
- Erosion of model trustworthiness
- Long-term hidden threats difficult to detect
Assessment Checklist:
- Training data validated, cleaned, and audited
- Data provenance tracked and documented
- Rate limiting and moderation for crowdsourced data
- Differential privacy techniques applied
- Models tested with known trigger phrases before deployment
- Deployed models monitored for behavioral drift
- Model file checksums verified against known-good states
Mitigation Strategies:
- Validate and clean all training data sources
- Implement data provenance tracking
- Apply differential privacy to limit individual data influence
- Test with adversarial inputs before deployment
- Monitor production models for unexpected behavior
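Pre-deployment trigger testing can be sketched as a small probe harness. The `generate` callable, trigger phrases, and marker strings here are auditor-supplied assumptions, not a standard interface:

```python
def scan_for_triggers(generate, trigger_phrases, banned_markers):
    """Probe a generation function with suspected trigger phrases and
    flag any response containing markers of backdoor activation.

    `generate` is any callable mapping a prompt string to output text.
    """
    findings = []
    for phrase in trigger_phrases:
        output = generate(phrase)
        hits = [m for m in banned_markers if m.lower() in output.lower()]
        if hits:
            findings.append({"trigger": phrase, "markers": hits})
    return findings
```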
LLM05: Improper Output Handling
Severity: High
Description: Applications blindly execute or render LLM outputs without validation, enabling code injection, XSS, SQL injection, SSRF, and other attacks.
Attack Vectors:
- Unescaped HTML/JavaScript in outputs (XSS)
- Model-generated shell commands executed without sanitization
- SQL queries constructed from model output
- Unsanitized API calls based on AI suggestions
- Direct execution via eval() or exec()
Impact:
- Remote code execution
- Session hijacking
- Database manipulation
- Privilege escalation
- Full system compromise
Assessment Checklist:
- All LLM output treated as untrusted input
- Strict output schema validation enforced (JSON, formats)
- Output sanitized and escaped based on context (HTML, SQL, shell)
- Parameterized queries used instead of raw SQL
- Allowlists for acceptable output patterns
- Generated code executed in sandboxed environments
- Human approval required for high-impact actions
- Rendering libraries with built-in escaping used
Mitigation Strategies:
- Never trust LLM output; validate and sanitize everything
- Enforce strict output schemas
- Use parameterized queries and safe ORM methods
- Sandbox all code execution
- Require human approval for privileged operations
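Two of the mitigations above, context-aware escaping and parameterized queries, can be sketched in a few lines (the table and column names are illustrative):

```python
import html
import sqlite3

def render_safely(model_output: str) -> str:
    """Escape model output before inserting it into an HTML page (prevents XSS)."""
    return html.escape(model_output)

def lookup_user(conn: sqlite3.Connection, model_suggested_name: str):
    """Use a parameterized query so model output can never alter SQL structure."""
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = ?",  # placeholder, not string concatenation
        (model_suggested_name,),
    )
    return cur.fetchall()
```

With the `?` placeholder, an injection payload in the model's output is bound as a literal string value and simply matches nothing.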
LLM06: Excessive Agency
Severity: High
Description: AI agents possess excessive permissions and autonomous capabilities, enabling significant harm through compromised prompts, hallucinations, or malicious manipulation.
Attack Vectors:
- Prompt injection exploiting overly permissioned agents
- Hallucinations triggering unintended high-impact actions
- Confused deputy attacks using AI's elevated privileges
- Malicious plugins with excessive access
- Unrestricted system control (email, API, database)
Impact:
- Unauthorized data transmission
- Destructive actions (deletion, modification)
- Financial loss through unauthorized transactions
- Service disruptions
- Automated attack amplification
Assessment Checklist:
- Principle of least privilege applied to all AI capabilities
- Granular permissions with limited-scope OAuth tokens
- Functionality compartmentalized across narrow-scope agents
- High-risk actions restricted (deletion, transfers, device control)
- Explicit user approval for significant operations
- Rate limiting on AI actions and API calls
- Comprehensive audit logs of all agent activities
- Monitoring with alerts for anomalous behavior
Mitigation Strategies:
- Grant only essential capabilities (least privilege)
- Compartmentalize agent functionality
- Require human approval for high-impact operations
- Implement comprehensive audit logging
- Set up real-time monitoring and anomaly detection
LLM07: System Prompt Leakage
Severity: Medium
Description: System instructions intended to guide AI behavior are exposed to users or attackers, revealing internal logic, security controls, or sensitive configurations.
Attack Vectors:
- Prompt injection requesting instruction disclosure
- Sophisticated probing asking to repeat conversation context
- Tokenization quirks causing unintended disclosure
- Reverse-engineering through behavioral observation
- Model unintentionally echoing system prompts
Impact:
- Security logic exposure enabling bypass attacks
- Credential compromise if secrets embedded in prompts
- Internal system knowledge revelation
- Facilitation of more targeted attacks
Assessment Checklist:
- No passwords, API keys, or secrets in system prompts
- Prompts treated as public information
- Models configured to refuse revealing system messages
- Clear message role delimiters (system/user/assistant)
- Security policies enforced at application level, not prompt level
- Output monitoring for prompt leakage patterns
- Regular testing with known extraction techniques
Mitigation Strategies:
- Never embed sensitive data in system prompts
- Implement application-level security enforcement
- Configure models to refuse prompt disclosure
- Monitor outputs for leakage patterns
- Use structured message formats with role delimiters
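Output monitoring for leakage can be approximated by checking whether a response reproduces a long contiguous slice of the system prompt. A rough sketch; the 0.6 threshold is an assumption to tune per application:

```python
import difflib

def leaks_system_prompt(output: str, system_prompt: str, threshold: float = 0.6) -> bool:
    """Flag an output that reproduces a large contiguous chunk of the
    system prompt (longest common substring relative to prompt length)."""
    if not system_prompt:
        return False
    a, b = output.lower(), system_prompt.lower()
    match = difflib.SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return match.size / len(b) >= threshold
```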
LLM08: Vector and Embedding Weaknesses
Severity: Medium
Description: Vulnerabilities in vector databases and embedding-based retrieval systems (RAG) allow poisoning, injection, or unauthorized access to stored data.
Attack Vectors:
- Poisoned embeddings retrieved during RAG operations
- Direct injection of malicious vectors into stores
- Retrieval of sensitive data from improperly secured databases
- Metadata-based attacks exploiting insufficient filtering
- Similarity-based retrieval returning harmful content
Impact:
- Output manipulation through poisoned context
- Sensitive data leakage from vector stores
- Misinformation injection
- Compromised RAG system integrity
Assessment Checklist:
- Data validated and sanitized before vectorization
- Access controls on vector store insertion and modification
- Metadata filtering restricts retrieval to appropriate categories
- Monitoring for suspicious bulk insertions
- Similarity thresholds ensuring relevant retrieval
- Sensitive and public vector stores separated
- Embedding source provenance tracked
- Anomaly detection for unusual retrieval patterns
Mitigation Strategies:
- Validate data before storing in vector databases
- Implement strict access controls on vector operations
- Use metadata filtering and similarity thresholds
- Separate sensitive and public data stores
- Monitor for anomalous patterns
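Similarity thresholds and metadata filtering can be combined in a post-retrieval filter; the result schema, score range, and category names below are assumptions about your vector store:

```python
def filter_retrieval(results, min_score: float = 0.75, allowed_categories=("public",)):
    """Drop retrieved chunks that fall below the similarity threshold
    or whose metadata category is outside the allowed set."""
    return [
        r for r in results
        if r["score"] >= min_score and r.get("category") in allowed_categories
    ]
```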
LLM09: Misinformation
Severity: Medium
Description: LLMs generate plausible but false information (hallucinations/confabulations) that users may trust and act upon, causing harm.
Attack Vectors:
- Fabricated facts presented authoritatively
- Fake citations or references that don't exist
- Invented case law, medical advice, or technical solutions
- Adversarial prompts designed to trigger hallucinations
- Confident incorrect reasoning
Impact:
- Harmful decisions based on false information
- Legal liability from incorrect advice
- Erosion of trust in AI systems
- Regulatory violations in compliance contexts
- Reputational damage
Assessment Checklist:
- Confidence scores or uncertainty indicators provided
- Fact-checking against reliable databases implemented
- Citations with verifiable sources required for sensitive domains
- RAG grounding responses in validated data
- System instructions encourage admitting uncertainty
- Human review for critical outputs
- Model limitations clearly communicated to users
Mitigation Strategies:
- Implement retrieval-augmented generation (RAG) for grounding
- Provide confidence indicators to users
- Require verifiable citations for critical domains
- Add human review for high-stakes outputs
- Clearly communicate model limitations
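Requiring verifiable citations can be enforced mechanically by validating cited source IDs against the grounded corpus. A sketch, assuming a hypothetical `[S:id]` citation syntax:

```python
import re

def has_verifiable_citation(answer: str, known_sources: set[str]) -> bool:
    """Accept an answer for a sensitive domain only if it cites at least
    one source and every cited ID exists in the validated corpus.
    The [S:id] syntax is an illustrative convention, not a standard."""
    cited = set(re.findall(r"\[S:([\w-]+)\]", answer))
    return bool(cited) and cited <= known_sources
```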
LLM10: Unbounded Consumption
Severity: Medium
Description: Uncontrolled LLM usage causes denial-of-service, system crashes, or excessive operational costs through resource exhaustion.
Attack Vectors:
- Flood of queries overwhelming API endpoints
- Extremely long or recursive prompts consuming resources
- Infinite loops through recursive prompt injection
- Distributed attacks with massive input volumes
- Cascading failures through connected systems
Impact:
- Service unavailability for legitimate users
- Financial loss from excessive token usage
- System crashes and performance degradation
- Infrastructure damage
Assessment Checklist:
- Rate limiting per user, IP, and API key
- Maximum token limits for requests and daily usage
- Resource consumption monitoring with alerting
- Request timeouts preventing hung operations
- Per-user quotas with cost implications
- Cost monitoring with automated budget alerts
- Load balancing across infrastructure
- Auto-scaling with cost-aware limits
Mitigation Strategies:
- Implement rate limiting at multiple levels
- Set token and cost limits per user/session
- Monitor resource consumption with alerts
- Use request timeouts and queue management
- Design auto-scaling with cost guardrails
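Per-user rate limiting is often implemented as a token bucket. A minimal sketch, with illustrative capacity and refill rate:

```python
import time

class TokenBucket:
    """Simple per-user rate limiter: each request spends tokens,
    which refill continuously up to a fixed capacity."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self, cost: int = 1) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In an LLM gateway, `cost` can be the request's token count, so the same bucket caps both request rate and token spend.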
Audit Procedure
Step 1: Application Understanding (15 minutes)
- System inventory:
  - Document LLM provider, model version, and configuration
  - Map application architecture (APIs, databases, vector stores)
  - Identify all data flows to and from the LLM
  - List plugins, tools, and integrations
- Threat modeling:
  - Identify attack surfaces (user inputs, APIs, data sources)
  - Determine data sensitivity classification
  - Map trust boundaries between components
  - Identify privileged operations the LLM can trigger
Step 2: Vulnerability Assessment (40-60 minutes)
For each of the 10 vulnerabilities, assess:
Prompt Injection (LLM01) - 10 min
- Test with known injection techniques (direct and indirect)
- Attempt to override system instructions
- Test with malicious content in external data sources
- Verify input validation and sanitization
Sensitive Information Disclosure (LLM02) - 5 min
- Attempt to extract training data
- Test for PII leakage in outputs
- Check for credential or API key exposure
- Verify output filtering effectiveness
Supply Chain (LLM03) - 5 min
- Review model provenance and source
- Check dependency versions and vulnerability status
- Verify plugin and integration security
- Review SBOM completeness
Data/Model Poisoning (LLM04) - 5 min
- Review data pipeline security
- Check fine-tuning data validation
- Verify model integrity monitoring
- Test with known trigger patterns
Improper Output Handling (LLM05) - 10 min
- Test for XSS through LLM outputs
- Attempt SQL injection via model responses
- Check command injection possibilities
- Verify output sanitization and encoding
Excessive Agency (LLM06) - 5 min
- Review agent permissions and capabilities
- Test privilege boundaries
- Verify human-in-the-loop for critical actions
- Check audit logging completeness
System Prompt Leakage (LLM07) - 5 min
- Attempt to extract system prompts
- Check for secrets in prompts
- Verify prompt protection mechanisms
Vector/Embedding Weaknesses (LLM08) - 5 min
- Review vector store access controls
- Test RAG retrieval for injection
- Verify data separation and filtering
Misinformation (LLM09) - 5 min
- Test for hallucination in critical domains
- Verify grounding and citation mechanisms
- Check confidence indicators
Unbounded Consumption (LLM10) - 5 min
- Test rate limiting effectiveness
- Verify token and cost limits
- Check timeout configurations
Step 3: Risk Scoring (15 minutes)
For each vulnerability found, score using:
Likelihood: How likely is exploitation?
- High: Known attack vectors, easy to exploit, publicly accessible
- Medium: Requires some skill or specific conditions
- Low: Difficult to exploit, limited attack surface
Impact: What is the potential damage?
- Critical: System compromise, major data breach, significant financial loss
- High: Unauthorized access, data exposure, service disruption
- Medium: Limited data exposure, partial service impact
- Low: Minor information disclosure, minimal impact
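The likelihood/impact rubric above can be turned into a deterministic priority lookup for consistent scoring across findings. The specific mapping below is an illustrative assumption, not part of the OWASP standard:

```python
# Illustrative likelihood x impact -> priority mapping for audit findings.
RISK_MATRIX = {
    ("High", "Critical"): "P0", ("High", "High"): "P1",
    ("High", "Medium"): "P2", ("High", "Low"): "P3",
    ("Medium", "Critical"): "P1", ("Medium", "High"): "P2",
    ("Medium", "Medium"): "P3", ("Medium", "Low"): "P4",
    ("Low", "Critical"): "P2", ("Low", "High"): "P3",
    ("Low", "Medium"): "P4", ("Low", "Low"): "P4",
}

def score_risk(likelihood: str, impact: str) -> str:
    """Map a likelihood/impact pair to a remediation priority."""
    return RISK_MATRIX[(likelihood, impact)]
```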
Step 4: Report Generation (20 minutes)
Compile comprehensive security assessment.
Output Format
Generate a comprehensive OWASP LLM security audit report:
OWASP LLM Top 10 Security Audit Report
Application: [Name]
LLM Provider/Model: [Provider - Model]
Date: [Date]
Evaluator: [AI Agent or Human]
OWASP LLM Top 10 Version: 2025
Executive Summary
Overall Security Posture: [Critical / High Risk / Medium Risk / Low Risk / Secure]
Application Type: [Chatbot / Agent / RAG System / Content Generator / Code Assistant / Other]
Data Sensitivity: [Public / Internal / Confidential / Restricted]
User Base: [Internal / B2B / B2C / Public]
Critical Findings
| # | Vulnerability | Severity | Status |
|---|---|---|---|
| LLM01 | Prompt Injection | Critical | [Vulnerable / Mitigated / N/A] |
| LLM02 | Sensitive Info Disclosure | Critical | [Vulnerable / Mitigated / N/A] |
| LLM03 | Supply Chain | High | [Vulnerable / Mitigated / N/A] |
| LLM04 | Data/Model Poisoning | High | [Vulnerable / Mitigated / N/A] |
| LLM05 | Improper Output Handling | High | [Vulnerable / Mitigated / N/A] |
| LLM06 | Excessive Agency | High | [Vulnerable / Mitigated / N/A] |
| LLM07 | System Prompt Leakage | Medium | [Vulnerable / Mitigated / N/A] |
| LLM08 | Vector/Embedding Weaknesses | Medium | [Vulnerable / Mitigated / N/A] |
| LLM09 | Misinformation | Medium | [Vulnerable / Mitigated / N/A] |
| LLM10 | Unbounded Consumption | Medium | [Vulnerable / Mitigated / N/A] |
Top 3 Critical Issues
- [Issue] - [Impact description]
- [Issue] - [Impact description]
- [Issue] - [Impact description]
Detailed Findings
LLM01: Prompt Injection
Status: [Vulnerable / Partially Mitigated / Mitigated]
Severity: [Critical / High / Medium / Low]
Likelihood: [High / Medium / Low]
Findings:
- [Finding with evidence]
- [Finding with evidence]
Attack Scenario:
[Description of how this could be exploited]
Recommendations:
- [Specific remediation step]
- [Specific remediation step]
Effort: [Low / Medium / High]
[Continue for LLM02 through LLM10...]
Architecture Security Review
Data Flow Analysis
[Diagram or description of data flows with trust boundaries marked]
Attack Surface Summary
| Surface | Risk Level | Controls |
|---|---|---|
| User Input | [Level] | [Controls] |
| API Endpoints | [Level] | [Controls] |
| Vector Store | [Level] | [Controls] |
| Plugins/Tools | [Level] | [Controls] |
| Output Rendering | [Level] | [Controls] |
Remediation Roadmap
Phase 1: Critical (0-7 days)
- [Action item with owner]
- [Action item with owner]
Phase 2: High Priority (7-30 days)
- [Action item with owner]
Phase 3: Medium Priority (30-90 days)
- [Action item with owner]
Phase 4: Hardening (Ongoing)
- [Continuous improvement practices]
Security Controls Matrix
| Control | Implemented | Effective | Recommendation |
|---|---|---|---|
| Input validation | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Output sanitization | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Rate limiting | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Authentication | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Authorization | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Logging/Monitoring | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Content filtering | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
| Human-in-the-loop | [Yes/No/Partial] | [Yes/No] | [Recommendation] |
Next Steps
- Prioritize and assign critical findings
- Implement quick wins (input validation, rate limiting)
- Schedule penetration testing for high-risk areas
- Establish continuous monitoring
- Plan follow-up audit after remediation
Resources
- OWASP Top 10 for LLM Applications 2025
- OWASP GenAI Security Project
- OWASP LLM AI Security & Governance Checklist
- OWASP GitHub Repository
Audit Version: 1.0
Date: [Date]
---

Quick Reference: Vulnerability Priority
| Priority | Vulnerabilities | Rationale |
|---|---|---|
| P0 | LLM01 (Prompt Injection), LLM02 (Data Disclosure) | Direct exploitation, high impact |
| P1 | LLM05 (Output Handling), LLM06 (Excessive Agency) | System compromise potential |
| P2 | LLM03 (Supply Chain), LLM04 (Poisoning) | Harder to exploit but severe impact |
| P3 | LLM07 (Prompt Leakage), LLM08 (Vector Weaknesses) | Enables further attacks |
| P4 | LLM09 (Misinformation), LLM10 (Unbounded Consumption) | Operational risk |
Best Practices
- Defense in depth: Never rely on a single security control
- Zero trust for LLM output: Treat all model output as untrusted
- Least privilege: Minimize AI agent permissions and capabilities
- Monitor continuously: Log and alert on anomalous AI behavior
- Test adversarially: Regular red-team exercises against AI features
- Secure the pipeline: Protect training data, models, and embeddings
- Human oversight: Maintain human-in-the-loop for critical operations
- Update regularly: Stay current with evolving attack techniques
- Educate users: Train users on safe AI interaction practices
- Plan for incidents: Have AI-specific incident response procedures
Version
1.0 - Initial release (OWASP Top 10 for LLM Applications 2025)
Remember: LLM security is an evolving field. New attack vectors emerge regularly. This audit provides a baseline assessment; continuous monitoring and periodic re-assessment are essential for maintaining security posture.