k8s-yaml-validator

Kubernetes YAML Validator

Overview

This skill provides a comprehensive validation workflow for Kubernetes YAML resources, combining syntax linting, schema validation, cluster dry-run testing, and intelligent CRD documentation lookup. Validate any Kubernetes manifest with confidence before applying it to the cluster.
IMPORTANT: This is a REPORT-ONLY validation tool. Do NOT modify files, do NOT use Edit tool, do NOT use AskUserQuestion to offer fixes. Generate a comprehensive validation report with suggested fixes shown as before/after code blocks, then let the user decide what to do next.

When to Use This Skill

Invoke this skill when:
  • Validating Kubernetes YAML files before applying to a cluster
  • Debugging YAML syntax or formatting errors
  • Working with Custom Resource Definitions (CRDs) and needing their documentation
  • Performing dry-run tests to catch admission controller errors
  • Ensuring YAML follows Kubernetes best practices
  • Understanding what validation errors exist in manifests (report-only, user fixes manually)
  • The user asks to "validate", "lint", "check", or "test" Kubernetes YAML files

Validation Workflow

Follow this sequential validation workflow. Each stage catches different types of issues:

Stage 0: Pre-Validation Setup (Resource Count Check)

IMPORTANT: Before running any validation tools, check the file complexity:
  1. Count the number of resources in the file by counting --- document separators or by parsing the file
  2. If the file contains 3 or more resources, immediately load references/validation_workflow.md:
    Read references/validation_workflow.md
    This ensures you have the complete workflow context for handling complex multi-resource files.
  3. Note the resource count for the validation report summary
This pre-check ensures proper handling of complex files from the start of validation.
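The resource count described above can be approximated with a few lines of Python. This is a minimal sketch, not part of the skill's scripts: the helper names `count_resources` and `needs_workflow_reference` are hypothetical, and the function only looks at `---` separator lines, so it treats comment-only or empty documents as containing no resource.

```python
import re

# A document separator is a line that is exactly "---", optionally followed
# by whitespace and trailing content such as a comment.
SEPARATOR = re.compile(r"^---(\s.*)?$")


def count_resources(text: str) -> int:
    """Count YAML documents that contain at least one non-comment line."""
    count = 0
    has_content = False
    for line in text.splitlines():
        if SEPARATOR.match(line):
            if has_content:
                count += 1
            has_content = False
            continue
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            has_content = True
    if has_content:
        count += 1
    return count


def needs_workflow_reference(text: str, threshold: int = 3) -> bool:
    """True when the file reaches the 3-resource threshold from Stage 0."""
    return count_resources(text) >= threshold
```

Parsing the file with a YAML library gives a more precise count, but a separator-based count still works when some documents contain syntax errors.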

Stage 1: Tool Check

Before starting validation, verify required tools are installed:
bash
bash scripts/setup_tools.sh
Required tools:
  • yamllint: YAML syntax and style linting
  • kubeconform: Kubernetes schema validation with CRD support
  • kubectl: Cluster dry-run testing (optional but recommended)
If tools are missing, display the installation instructions from the script output and continue with available tools. Document which tools are missing in the validation report.
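The availability check can also be sketched in Python for illustration. This is not what scripts/setup_tools.sh does internally (its contents are not shown here); `check_tools` is a hypothetical helper that only looks the three tool names up on PATH.

```python
import shutil

REQUIRED = ("yamllint", "kubeconform")
OPTIONAL = ("kubectl",)  # dry-run testing is optional but recommended


def check_tools(which=shutil.which):
    """Report which validation tools are on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    status = {name: which(name) is not None for name in REQUIRED + OPTIONAL}
    return {
        "available": [name for name, ok in status.items() if ok],
        "missing_required": [n for n in REQUIRED if not status[n]],
        "missing_optional": [n for n in OPTIONAL if not status[n]],
    }
```

The two "missing" lists map directly onto the report: required tools that are absent limit the stages that can run, while a missing kubectl only means the dry-run stage is skipped.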

Stage 2: YAML Syntax Validation

Validate YAML syntax and formatting using yamllint:
bash
yamllint -c assets/.yamllint <file.yaml>
Common issues caught:
  • Indentation errors (tabs vs spaces)
  • Trailing whitespace
  • Line length violations
  • Syntax errors
  • Duplicate keys
Reporting approach:
  • Report all syntax issues with file:line references
  • For fixable issues, show suggested before/after code blocks
  • Continue to next validation stage to collect all issues before reporting
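To turn yamllint findings into file:line references, one option is to run yamllint with its `-f parsable` output format and parse each line. The sketch below assumes that format (`path:line:column: [level] message`); the `LintIssue` type and `parse_yamllint` function are hypothetical names, not part of the skill.

```python
import re
from typing import List, NamedTuple


class LintIssue(NamedTuple):
    path: str
    line: int
    column: int
    level: str
    message: str


# One finding per line, e.g.
# deploy.yaml:5:3: [error] wrong indentation: expected 4 but found 2 (indentation)
PARSABLE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): \[(?P<level>\w+)\] (?P<msg>.*)$"
)


def parse_yamllint(output: str) -> List[LintIssue]:
    """Parse `yamllint -f parsable` output; lines that do not match are skipped."""
    issues = []
    for raw in output.splitlines():
        match = PARSABLE.match(raw)
        if match:
            issues.append(
                LintIssue(
                    path=match["path"],
                    line=int(match["line"]),
                    column=int(match["col"]),
                    level=match["level"],
                    message=match["msg"],
                )
            )
    return issues
```

The line numbers yamllint reports are file-absolute, so they can be copied into the report's Location column unchanged.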

Stage 3: CRD Detection and Documentation Lookup

Before schema validation, detect whether the YAML contains custom resources, that is, resources whose kind is defined by a Custom Resource Definition (CRD):
bash
bash scripts/detect_crd_wrapper.sh <file.yaml>
The wrapper script automatically handles Python dependencies by creating a temporary virtual environment if PyYAML is not available.
Resilient Parsing: The script is resilient to syntax errors in individual documents. If a multi-document YAML file has some valid and some invalid documents, the script will:
  • Parse valid documents and detect their CRDs
  • Report errors for invalid documents but continue processing
  • This matches kubeconform's behavior of validating 2/3 resources even when 1/3 has syntax errors
The script outputs JSON with resource information and parse status:
json
{
  "resources": [
    {
      "kind": "Certificate",
      "apiVersion": "cert-manager.io/v1",
      "group": "cert-manager.io",
      "version": "v1",
      "isCRD": true,
      "name": "example-cert"
    }
  ],
  "parseErrors": [
    {
      "document": 1,
      "start_line": 2,
      "error_line": 6,
      "error": "mapping values are not allowed in this context"
    }
  ],
  "summary": {
    "totalDocuments": 3,
    "parsedSuccessfully": 2,
    "parseErrors": 1,
    "crdsDetected": 1
  }
}
For each detected CRD:
  1. Try context7 MCP first (preferred):
    Use mcp__context7__resolve-library-id with the CRD project name
    Example: "cert-manager" for cert-manager.io CRDs
    
    Then use mcp__context7__get-library-docs with:
    - context7CompatibleLibraryID from resolve step
    - topic: The CRD kind (e.g., "Certificate")
    - tokens: 5000 (adjust based on need)
  2. Fallback to WebSearch if context7 fails:
    Search query pattern:
    "<kind>" "<group>" kubernetes CRD "<version>" documentation spec
    
    Example:
    "Certificate" "cert-manager.io" kubernetes CRD "v1" documentation spec
  3. Extract key information:
    • Required fields in spec
    • Field types and validation rules
    • Examples from documentation
    • Version-specific changes or deprecations
Secondary CRD Detection via kubeconform: If the detect_crd_wrapper.sh script fails to detect CRDs (e.g., all documents have syntax errors), but kubeconform successfully validates a CRD resource, you should still look up documentation for that CRD. Parse the kubeconform output to identify validated CRDs and perform context7/WebSearch lookups for them.
Why this matters: CRDs have custom schemas not available in standard Kubernetes validation tools. Understanding the CRD's spec requirements prevents validation errors and ensures correct resource configuration.
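The WebSearch fallback query can be built mechanically from the detection script's JSON. The following sketch assumes the output shape shown earlier in this stage (a `resources` list whose entries carry `kind`, `group`, `version`, and `isCRD`); the function names are hypothetical.

```python
def build_search_query(resource: dict) -> str:
    """Fill the fallback search pattern from one detected resource."""
    return '"{kind}" "{group}" kubernetes CRD "{version}" documentation spec'.format(
        kind=resource["kind"],
        group=resource["group"],
        version=resource["version"],
    )


def crd_queries(detector_output: dict) -> list:
    """Return one query per resource flagged as a CRD; others are ignored."""
    return [
        build_search_query(resource)
        for resource in detector_output.get("resources", [])
        if resource.get("isCRD")
    ]
```

For the Certificate example above, this yields the same query string as the documented example.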

Stage 4: Schema Validation

Validate against Kubernetes schemas using kubeconform:
bash
kubeconform \
  -schema-location default \
  -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
  -strict \
  -ignore-missing-schemas \
  -summary \
  -verbose \
  <file.yaml>
Options explained:
  • -strict: Reject unknown fields (recommended for production - catches typos)
  • -ignore-missing-schemas: Skip validation for CRDs without available schemas
  • -kubernetes-version 1.30.0: Validate against a specific Kubernetes version (optional; not included in the command above)
Common issues caught:
  • Invalid apiVersion or kind
  • Missing required fields
  • Wrong field types
  • Invalid enum values
  • Unknown fields (with -strict)
For CRDs: If kubeconform reports "no schema found", this is expected. Use the documentation from Stage 3 to manually validate the spec fields.
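When the kubeconform command is launched from a script, passing the arguments as a list avoids shell-quoting problems with the `{{...}}` template in the CRD catalog URL. This is a minimal sketch using only the flags shown above; `build_kubeconform_cmd` is a hypothetical helper name.

```python
from typing import List, Optional

# Schema URL template from the command above; the {{...}} placeholders are
# expanded by kubeconform itself, not by Python.
CRD_CATALOG = (
    "https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/"
    "{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json"
)


def build_kubeconform_cmd(path: str, kubernetes_version: Optional[str] = None) -> List[str]:
    """Assemble the kubeconform argument list for one manifest file."""
    cmd = [
        "kubeconform",
        "-schema-location", "default",
        "-schema-location", CRD_CATALOG,
        "-strict",
        "-ignore-missing-schemas",
        "-summary",
        "-verbose",
    ]
    if kubernetes_version:
        cmd += ["-kubernetes-version", kubernetes_version]
    cmd.append(path)
    return cmd
```

The list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`; the non-zero exit code and captured output then feed the schema-validation rows of the report.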

Stage 5: Cluster Dry-Run (if available)

IMPORTANT: Always try server-side dry-run first. Server-side validation catches more issues than client-side because it runs through admission controllers and webhooks.
Decision Tree:
1. Try server-side dry-run first:
   kubectl apply --dry-run=server -f <file.yaml>

   ├─ If SUCCESS → Use results, continue to Stage 6
   │
   ├─ If FAILS with connection error (e.g., "connection refused",
   │  "unable to connect", "no configuration"):
   │  └─ 2. Fall back to client-side dry-run:
   │        kubectl apply --dry-run=client -f <file.yaml>
   │        Document in report: "Server-side validation skipped (no cluster access)"
   │
   ├─ If FAILS with validation error (e.g., "admission webhook denied",
   │  "resource quota exceeded", "invalid value"):
   │  └─ Record the error, continue to Stage 6
   │
   └─ If FAILS with parse error (e.g., "error converting YAML to JSON",
      "yaml: line X: mapping values are not allowed"):
      └─ Record the error, skip client-side dry-run (same error will occur)
         Document in report: "Dry-run blocked by YAML syntax errors - fix syntax first"
         Continue to Stage 6
Note: Parse errors from earlier stages (yamllint, kubeconform) will also cause dry-run to fail. Do NOT attempt client-side dry-run as a fallback for parse errors - it will produce the same error. Parse errors must be fixed before dry-run validation can proceed.
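The branching in the decision tree can be expressed as a small classifier over kubectl's exit code and stderr. This is a sketch under stated assumptions: the marker substrings are the examples quoted in the tree, real kubectl messages vary, and any unrecognized failure is treated as a validation error. The names `classify_dry_run` and `next_action` are hypothetical.

```python
# Substrings taken from the examples in the decision tree (lower-cased).
CONNECTION_MARKERS = ("connection refused", "unable to connect", "no configuration")
PARSE_MARKERS = ("error converting yaml to json", "mapping values are not allowed")

ACTIONS = {
    "success": "use results, continue to Stage 6",
    "connection_error": "fall back to kubectl apply --dry-run=client",
    "validation_error": "record the error, continue to Stage 6",
    "parse_error": "record the error, skip client-side dry-run, continue to Stage 6",
}


def classify_dry_run(returncode: int, stderr: str) -> str:
    """Map a server-side dry-run result onto one branch of the tree."""
    if returncode == 0:
        return "success"
    text = stderr.lower()
    # Parse errors are checked first: retrying client-side would repeat them.
    if any(marker in text for marker in PARSE_MARKERS):
        return "parse_error"
    if any(marker in text for marker in CONNECTION_MARKERS):
        return "connection_error"
    return "validation_error"


def next_action(kind: str) -> str:
    """Return the follow-up step for a classified result."""
    return ACTIONS[kind]
```

Only the connection-error branch triggers the client-side fallback, which keeps the report honest about which mode actually ran.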
Server-side dry-run catches:
  • Admission controller rejections
  • Policy violations (PSP, OPA, Kyverno, etc.)
  • Resource quota violations
  • Missing namespaces
  • Invalid ConfigMap/Secret references
  • Webhook validations
Client-side dry-run catches (fallback):
  • Basic schema validation
  • Required field checks
  • Type validation
  • Note: Does NOT catch admission controller or policy issues
Document in your report which mode was used:
  • If server-side: "Full cluster validation performed"
  • If client-side: "Limited validation (no cluster access) - admission policies not checked"
  • If skipped: "Dry-run skipped - kubectl not available"
For updates to existing resources:
bash
kubectl diff -f <file.yaml>
This shows what would change, helping catch unintended modifications.

Stage 6: Generate Detailed Validation Report (REPORT ONLY)

After completing all validation stages, generate a comprehensive report. This is a REPORT-ONLY stage.
NEVER do any of the following:
  • Do NOT use the Edit tool to modify files
  • Do NOT use AskUserQuestion to offer to fix issues
  • Do NOT prompt the user asking if they want fixes applied
  • Do NOT modify any YAML files
ALWAYS do the following:
  • Generate a comprehensive validation report
  • Show before/after code blocks as SUGGESTIONS only
  • Let the user decide what to do after reviewing the report
  • End with "Next Steps" for the user to take manually
  1. Summarize all issues found across all stages in a table format:
    | Severity | Stage | Location | Issue | Suggested Fix |
    |----------|-------|----------|-------|---------------|
    | Error | Syntax | file.yaml:5 | Indentation error | Use 2 spaces |
    | Error | Schema | file.yaml:21 | Wrong type | Change to integer |
    | Warning | Best Practice | file.yaml:30 | Missing labels | Add app label |
  2. Categorize by severity:
    • Errors (must fix): Syntax errors, missing required fields, dry-run failures
    • Warnings (should fix): Style issues, best practice violations
    • Info (optional): Suggestions for improvement
  3. Show before/after code blocks for each issue:
    For every issue, display explicit before/after YAML snippets showing the suggested fix:
    **Issue 1: deployment.yaml:21 - Wrong field type (Error)**
    
    Current:
    ```yaml
            - containerPort: "80"
    ```
    Suggested Fix:
    ```yaml
            - containerPort: 80
    ```
    Why: containerPort must be an integer, not a string. Kubernetes will reject string values.
    Reference: See k8s_best_practices.md "Invalid Values" section.
  4. Provide validation summary:
    ## Validation Report Summary
    
    File: deployment.yaml
    Resources Analyzed: 3 (Deployment, Service, Certificate)
    
    | Stage | Status | Issues Found |
    |-------|--------|--------------|
    | YAML Syntax | ❌ Failed | 2 errors |
    | CRD Detection | ✅ Passed | 1 CRD detected (Certificate) |
    | Schema Validation | ❌ Failed | 1 error |
    | Dry-Run | ❌ Failed | 1 error |
    
    Total Issues: 4 errors, 2 warnings
    
    ## Detailed Findings
    
    [List each issue with before/after code blocks as shown above]
    
    ## Next Steps
    
    1. Fix the 4 errors listed above (deployment will fail without these)
    2. Consider addressing the 2 warnings for best practices
    3. Re-run validation after fixes to confirm resolution
  5. Do NOT modify files - this is a reporting tool only
    • Present all findings clearly
    • Let the user decide which fixes to apply
    • User can request fixes after reviewing the report
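The issue summary table from step 1 can be generated from collected findings. The sketch below assumes each finding is a dict with `severity`, `stage`, `location`, `issue`, and `fix` keys; that shape and the function names are illustrative, not something the skill prescribes.

```python
SEVERITY_ORDER = {"Error": 0, "Warning": 1, "Info": 2}


def render_issue_table(issues: list) -> str:
    """Render findings as a Markdown table, errors first."""
    header = [
        "| Severity | Stage | Location | Issue | Suggested Fix |",
        "|----------|-------|----------|-------|---------------|",
    ]
    # sorted() is stable, so findings keep their stage order within a severity.
    ordered = sorted(
        issues,
        key=lambda item: SEVERITY_ORDER.get(item["severity"], len(SEVERITY_ORDER)),
    )
    rows = [
        "| {severity} | {stage} | {location} | {issue} | {fix} |".format(**item)
        for item in ordered
    ]
    return "\n".join(header + rows)


def count_by_severity(issues: list) -> dict:
    """Tally findings for the 'Total Issues' line of the summary."""
    counts = {}
    for item in issues:
        counts[item["severity"]] = counts.get(item["severity"], 0) + 1
    return counts
```

Because the output is plain text, generating it never touches the validated files, which fits the report-only constraint.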

Best Practices Reference

For detailed Kubernetes YAML best practices, load the reference:
Read references/k8s_best_practices.md
This reference includes:
  • Metadata and label conventions
  • Resource limits and requests
  • Security context guidelines
  • Probe configurations
  • Common validation issues and fixes
When to load (ALWAYS load in these cases):
  • Schema validation fails with type errors (e.g., string vs integer, invalid values)
  • Schema validation reports missing required fields
  • kubeconform reports invalid field values or unknown fields
  • Dry-run fails with validation errors related to resources, probes, or security
  • When explaining why a fix is needed (to provide context from best practices)

Detailed Validation Workflow Reference

For in-depth workflow details and error handling strategies, load the reference:
Read references/validation_workflow.md
This reference includes:
  • Detailed command options for each tool
  • Error handling strategies
  • Multi-resource file handling
  • Complete workflow diagram
  • Troubleshooting guide
When to load (ALWAYS load in these cases):
  • File contains 3 or more resources (multi-document YAML)
  • Validation produces errors you haven't seen before or can't immediately diagnose
  • Need to understand the complete workflow for debugging
  • Errors span multiple validation stages

Working with Multiple Resources

When a YAML file contains multiple resources (separated by ---):
  1. Validate the entire file first with yamllint and kubeconform
  2. If errors occur, identify which resource has issues by checking line numbers
  3. For dry-run, the file is tested as a unit (Kubernetes processes in order)
  4. Track issues per-resource when presenting findings to the user

Partial Parsing Behavior

When a multi-document YAML file has some valid and some invalid documents:
Expected behavior:
  • The CRD detection script (detect_crd.py) will parse valid documents and skip invalid ones
  • kubeconform will validate resources it can parse and report errors for unparseable ones
  • The validation report should clearly show which documents parsed and which failed
Example scenario: A file with 3 documents where document 1 has a syntax error:
  • Document 1 (Deployment): Syntax error at line 6
  • Document 2 (Service): Valid
  • Document 3 (Certificate CRD): Valid
Expected output:
  • CRD detection: Finds Certificate CRD from document 3
  • kubeconform: Reports error for document 1, validates documents 2 and 3
  • Report: Shows syntax error for document 1, validation results for documents 2 and 3
In your report:
| Document | Resource | Parsing | Validation |
|----------|----------|---------|------------|
| 1 | Deployment | ❌ Syntax error (line 6) | Skipped |
| 2 | Service | ✅ Parsed | ✅ Valid |
| 3 | Certificate | ✅ Parsed | ✅ Valid |
Line Number Reference Style:
  • Always use file-absolute line numbers (line numbers relative to the start of the entire file)
  • This matches what yamllint, kubeconform, and kubectl report
  • Example: If a file has 3 documents and the error is in document 2, whose --- separator is on line 35, report it as "line 42" (the absolute line in the file), not "line 7" (relative to the document's first line)
  • This consistency makes it easy for users to navigate directly to the error in their editor
This ensures users get maximum validation feedback even when some documents have issues.
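Converting a document-relative line number to a file-absolute one is a one-line calculation once the first line of each document is known. The sketch below is illustrative only; both function names are hypothetical, and it assumes separators are lines consisting solely of `---`.

```python
def document_first_lines(text: str) -> list:
    """Return the 1-based file line on which each YAML document begins."""
    starts = []
    expecting_start = True  # the next non-separator line opens a document
    for number, line in enumerate(text.splitlines(), start=1):
        if line.rstrip() == "---":
            expecting_start = True
        elif expecting_start:
            starts.append(number)
            expecting_start = False
    return starts


def absolute_line(doc_first_line: int, relative_line: int) -> int:
    """Convert a line number counted from a document's first line (1-based)
    into a line number counted from the start of the file."""
    if doc_first_line < 1 or relative_line < 1:
        raise ValueError("line numbers are 1-based")
    return doc_first_line + relative_line - 1
```

For a document whose first line is file line 36, relative line 7 maps to `absolute_line(36, 7)`, which is 42.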

Error Handling Strategies

Tool Not Available

  • Run scripts/setup_tools.sh to check availability
  • Provide installation instructions
  • Skip optional stages but document what was skipped
  • Continue with available tools

Cluster Access Issues

  • Fall back to client-side dry-run
  • Skip cluster validation if no kubectl config
  • Document limitations in validation report

CRD Documentation Not Found

  • Document that documentation lookup failed
  • Attempt validation with kubeconform CRD schemas
  • Suggest manual CRD inspection:
    bash
    kubectl get crd <plural>.<group> -o yaml
    kubectl explain <kind>

Validation Stage Failures

  • Continue to next stage even if one fails
  • Collect all errors before presenting to user
  • Prioritize fixing earlier stage errors first

Communication Guidelines

When presenting validation results:
  1. Be clear and concise about what was found
  2. Explain why issues matter (e.g., "This will cause pod creation to fail")
  3. Provide context from best practices when relevant
  4. Group related issues (e.g., all missing label issues together)
  5. Use file:line references for all issues
  6. Show fix complexity - Include a complexity indicator in the issue header:
    • [Simple]: Single-line fixes like indentation, typos, or value changes
    • [Medium]: Multi-line changes or adding missing fields/sections
    • [Complex]: Logic changes, restructuring, or changes affecting multiple resources
    Example format in issue header:
    **Issue 1: deployment.yaml:8 - Wrong indentation (Error) [Simple]**
    **Issue 2: deployment.yaml:15-25 - Missing security context (Warning) [Medium]**
    **Issue 3: deployment.yaml - Selector mismatch with Service (Error) [Complex]**
  7. Always provide a comprehensive report including:
    • Summary table of all issues by stage
    • Before/after code blocks for each issue
    • Total count of errors and warnings
    • Clear next steps for the user
  8. NEVER offer to apply fixes - this is strictly a reporting tool
    • Do not ask "Would you like me to fix this?"
    • Do not use AskUserQuestion for fix confirmations
    • Present the report and let the user take action
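The issue header format from guideline 6 is regular enough to generate. This is a small illustrative sketch; `format_issue_header` is a hypothetical name, and the allowed complexity labels are the three listed above.

```python
COMPLEXITY_LEVELS = ("Simple", "Medium", "Complex")


def format_issue_header(index: int, location: str, summary: str,
                        severity: str, complexity: str) -> str:
    """Build a header such as
    **Issue 1: deployment.yaml:8 - Wrong indentation (Error) [Simple]**
    """
    if complexity not in COMPLEXITY_LEVELS:
        raise ValueError("complexity must be one of: " + ", ".join(COMPLEXITY_LEVELS))
    return "**Issue {}: {} - {} ({}) [{}]**".format(
        index, location, summary, severity, complexity
    )
```

Rejecting unknown complexity labels keeps headers consistent across a report.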

Performance Optimization

Parallel Tool Execution

For improved validation speed, some stages can be executed in parallel:
Can run in parallel (no dependencies):
  • yamllint (Stage 2) and detect_crd_wrapper.sh (Stage 3) can run simultaneously
  • Both tools operate independently on the input file
  • Results from both are needed before proceeding to schema validation
Example parallel execution:
bash
# Run these in parallel (using & and wait, or parallel tool calls):
yamllint -c assets/.yamllint <file.yaml> &
bash scripts/detect_crd_wrapper.sh <file.yaml> &
wait
Must run sequentially:
  • Stage 0 (Resource Count Check) → Before all other stages
  • Stage 1 (Tool Check) → Before using any tools
  • Stage 4 (Schema Validation) → After CRD detection (needs CRD info for context)
  • Stage 5 (Dry-Run) → After schema validation
  • Stage 6 (Report) → After all validation stages complete
When to parallelize:
  • Files with more than 5 resources benefit most from parallel execution
  • For small files (1-2 resources), sequential execution is fine
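Running the two independent stages concurrently can also be done from Python with a thread pool, which keeps each command's exit code and output separate. This is a minimal sketch; `run_command` and `run_in_parallel` are hypothetical helpers, and the command lists in the usage note simply mirror the Stage 2 and Stage 3 commands.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_command(argv: list) -> tuple:
    """Run one command and return (returncode, stdout, stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def run_in_parallel(commands: dict) -> dict:
    """Run named commands concurrently and collect every result.

    A failing command does not cancel the others, so all findings are
    still gathered before the report is written.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        futures = {name: pool.submit(run_command, argv) for name, argv in commands.items()}
        return {name: future.result() for name, future in futures.items()}
```

A call might look like `run_in_parallel({"yamllint": ["yamllint", "-c", "assets/.yamllint", path], "crd": ["bash", "scripts/detect_crd_wrapper.sh", path]})`, where `path` is the manifest being validated.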

Version Awareness

Always consider Kubernetes version compatibility:
  • Check for deprecated APIs (e.g., extensions/v1beta1 → apps/v1)
  • For CRDs, ensure the apiVersion matches what's in the cluster
  • Use kubectl api-versions to list available API versions in the cluster
  • Reference version-specific documentation when available

Test Coverage Guidance

The test/ directory contains example files to exercise all validation paths. Use these to verify skill behavior.

Test Files

| Test File | Purpose | Expected Behavior |
|-----------|---------|-------------------|
| deployment-test.yaml | Valid standard K8s resource | All stages pass, no errors |
| certificate-crd-test.yaml | Valid CRD resource | CRD detected, context7 lookup performed, no errors |
| comprehensive-test.yaml | Multi-resource with intentional errors | Syntax error detected, partial parsing works, CRD found |

Validation Paths to Test

  1. Happy Path (All Valid)
    • File: deployment-test.yaml
    • Expected: All stages pass, report shows "0 errors, 0 warnings"
  2. CRD Detection Path
    • File: certificate-crd-test.yaml
    • Expected: CRD detected, context7 MCP called, documentation retrieved
  3. Syntax Error Path
    • File: comprehensive-test.yaml
    • Expected: yamllint catches error, kubeconform reports partial validation, dry-run blocked
  4. Multi-Resource Partial Parsing
    • File: comprehensive-test.yaml (has 3 resources, 1 with syntax error)
    • Expected: 2/3 resources validated, parse error reported for document 1
  5. No Cluster Access Path
    • Any valid file with no kubectl cluster configured
    • Expected: Server-side dry-run fails, falls back to client-side
  6. Missing Tools Path
    • Test by temporarily removing a tool from PATH
    • Expected: setup_tools.sh reports missing, validation continues with available tools

Creating New Test Files

When adding test files:
  1. Name files descriptively: <scenario>-test.yaml
  2. Document expected behavior in comments at top of file
  3. Include intentional errors for error-path tests
  4. Test both standard K8s resources and CRDs

Expected Report Structure

For any validation, the report should include:
  • Summary table with issue counts by severity
  • Stage-by-stage status table (passed/failed/skipped)
  • Document parsing table (for multi-resource files)
  • Before/after code blocks for each issue
  • Fix complexity indicators ([Simple], [Medium], [Complex])
  • File-absolute line numbers
  • "Next Steps" section

Resources

scripts/

detect_crd_wrapper.sh
  • Wrapper script that handles Python dependency management
  • Automatically creates temporary venv if PyYAML is not available
  • Calls detect_crd.py to parse YAML files
  • Usage: bash scripts/detect_crd_wrapper.sh <file.yaml>
detect_crd.py
  • Parses YAML files to identify Custom Resource Definitions
  • Extracts kind, apiVersion, group, and version information
  • Outputs JSON for programmatic processing
  • Requires PyYAML (handled automatically by wrapper script)
  • Can be called directly: python3 scripts/detect_crd.py <file.yaml>
setup_tools.sh
  • Checks for required validation tools
  • Provides installation instructions for missing tools
  • Verifies versions of installed tools
  • Usage: bash scripts/setup_tools.sh

references/

k8s_best_practices.md
  • Comprehensive guide to Kubernetes YAML best practices
  • Covers metadata, labels, resource limits, security context
  • Common validation issues and how to fix them
  • Load when providing context for validation errors
validation_workflow.md
  • Detailed validation workflow with all stages
  • Command options and configurations
  • Error handling strategies
  • Complete workflow diagram
  • Load for complex validation scenarios

assets/

.yamllint
  • Pre-configured yamllint rules for Kubernetes YAML
  • Follows Kubernetes conventions (2-space indentation, line length, etc.)
  • Can be customized per project
  • Usage: yamllint -c assets/.yamllint <file.yaml>