gitops-principles-skill

GitOps Principles Skill

A complete guide to implementing GitOps in Kubernetes environments: the operational framework in which Git is the single source of truth for declarative infrastructure and applications.

What is GitOps?

GitOps is a set of practices that uses Git repositories as the source of truth for defining the desired state of infrastructure and applications. An automated process ensures the production environment matches the state described in the repository.

The OpenGitOps Definition (CNCF)

GitOps is defined by four core principles established by the OpenGitOps project (part of CNCF):

| Principle | Description |
|-----------|-------------|
| 1. Declarative | The entire system must be described declaratively |
| 2. Versioned and Immutable | Desired state is stored in a way that enforces immutability, versioning, and retention |
| 3. Pulled Automatically | Software agents automatically pull desired state from the source |
| 4. Continuously Reconciled | Agents continuously observe actual state and attempt to apply desired state |

Core Concepts Quick Reference

Git as Single Source of Truth

┌─────────────────────────────────────────────────────────────────┐
│                        GIT REPOSITORY                           │
│  (Single Source of Truth for Desired State)                    │
├─────────────────────────────────────────────────────────────────┤
│  manifests/                                                     │
│  ├── base/                    # Base configurations             │
│  │   ├── deployment.yaml                                        │
│  │   ├── service.yaml                                           │
│  │   └── kustomization.yaml                                     │
│  └── overlays/                # Environment-specific            │
│      ├── dev/                                                   │
│      ├── staging/                                               │
│      └── production/                                            │
└─────────────────────────────────────────────────────────────────┘
                              ▼ Pull (not Push)
┌─────────────────────────────────────────────────────────────────┐
│                      GITOPS CONTROLLER                          │
│  (ArgoCD / Flux / Kargo)                                       │
│  - Continuously watches Git repository                          │
│  - Compares desired state vs actual state                       │
│  - Reconciles differences automatically                         │
└─────────────────────────────────────────────────────────────────┘
                              ▼ Apply
┌─────────────────────────────────────────────────────────────────┐
│                    KUBERNETES CLUSTER                           │
│  (Actual State / Runtime Environment)                          │
└─────────────────────────────────────────────────────────────────┘
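As a rough illustration of how a controller is pointed at the repository in the diagram, an ArgoCD `Application` could look like the sketch below. The repository URL, application name, and namespaces are placeholders and not values from this project; the path follows the `manifests/overlays/production/` layout shown in the diagram.

```yaml
# Sketch only: repoURL, names, and namespaces are hypothetical placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd              # namespace where ArgoCD is assumed to run
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-repo.git   # the Git source of truth
    targetRevision: main
    path: manifests/overlays/production               # desired state for this environment
  destination:
    server: https://kubernetes.default.svc            # the cluster ArgoCD runs in
    namespace: myapp
```

The controller compares what is rendered from `path` at `targetRevision` with what is running in `destination`, which is the "desired state vs actual state" comparison in the middle box.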

Push vs Pull Model

| Push Model (Traditional CI/CD) | Pull Model (GitOps) |
|--------------------------------|---------------------|
| CI system pushes changes to cluster | Agent pulls changes from Git |
| Requires cluster credentials in CI | Credentials stay within cluster |
| Point-in-time deployment | Continuous reconciliation |
| Drift goes undetected | Drift automatically corrected |
| Manual rollback process | Rollback = `git revert` |

Key GitOps Benefits

  1. Auditability: Git history = deployment history
  2. Security: No external access to cluster required
  3. Reliability: Automated drift correction
  4. Speed: Deploy via PR merge
  5. Rollback: Simple `git revert`
  6. Disaster Recovery: Redeploy entire cluster from Git

Repository Strategies

Monorepo vs Polyrepo

Monorepo (Single repository for all environments):
gitops-repo/
├── apps/
│   ├── app-a/
│   │   ├── base/
│   │   └── overlays/
│   │       ├── dev/
│   │       ├── staging/
│   │       └── prod/
│   └── app-b/
└── infrastructure/
    ├── monitoring/
    └── networking/
Polyrepo (Separate repositories):
# Repository per concern
app-a-config/          # App A manifests
app-b-config/          # App B manifests
infrastructure/        # Shared infrastructure
cluster-bootstrap/     # Cluster setup

Multi-Repository Pattern (This Project)

Separates infrastructure from values for security boundaries:
infra-team/                    # Base configurations, ApplicationSets
├── applications/              # ArgoCD Application definitions
└── helm-base-values/          # Default Helm values

argo-cd-helm-values/           # Environment-specific overrides
├── dev/                       # Development values
├── stg/                       # Staging values
└── prd/                       # Production values
Benefits:
  • Different access controls per repo
  • Separation of concerns
  • Environment-specific secrets isolated

Branching Strategies

Environment Branches

main ────────────────────────────────────► Production
  └──► staging ──────────────────────────► Staging cluster
         └──► develop ───────────────────► Development cluster

Trunk-Based with Overlays (Recommended)

main ────────────────────────────────────► All environments
  ├── overlays/dev/       → Dev cluster
  ├── overlays/staging/   → Staging cluster
  └── overlays/prod/      → Prod cluster
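One way to wire a single `main` branch to several clusters is an ArgoCD `ApplicationSet` with a list generator, which stamps out one Application per overlay directory. The sketch below is illustrative only: the repository URL, cluster addresses, and the `myapp` name are assumptions.

```yaml
# Sketch only: repoURL, cluster URLs, and names are hypothetical placeholders.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: myapp-environments
  namespace: argocd
spec:
  generators:
    - list:
        elements:                 # one entry per overlay / target cluster
          - env: dev
            cluster: https://dev.example.com
          - env: staging
            cluster: https://staging.example.com
          - env: prod
            cluster: https://prod.example.com
  template:
    metadata:
      name: 'myapp-{{env}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/org/gitops-repo.git
        targetRevision: main      # every environment tracks the same branch
        path: 'overlays/{{env}}'  # only the overlay directory differs
      destination:
        server: '{{cluster}}'
        namespace: myapp
```

Promotion between environments then becomes a change to the files under the relevant overlay directory, not a merge between long-lived branches.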

Release Branches

main
  ├── release/v1.0 ──────► Production (v1.0)
  ├── release/v1.1 ──────► Production (v1.1)
  └── release/v2.0 ──────► Production (v2.0)

Sync Policies and Strategies

Automated Sync

yaml
syncPolicy:
  automated:
    prune: true       # Delete resources not in Git
    selfHeal: true    # Revert manual changes

Manual Sync (Production Recommended)

yaml
syncPolicy:
  automated: null     # Require explicit sync

Sync Options

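| Option | Use Case |
|--------|----------|
| `CreateNamespace=true` | Auto-create missing namespaces |
| `PruneLast=true` | Prune as the final step, after the other resources have synced successfully |
| `ServerSideApply=true` | Handle large resources, such as big CRDs, that exceed client-side apply limits |
| `ApplyOutOfSyncOnly=true` | Performance optimization: apply only resources that are out of sync |
| `Replace=true` | Force resource replacement (replace/create instead of apply) |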
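
In ArgoCD these options are listed under `syncPolicy.syncOptions` on the Application. A minimal sketch, with the particular combination of options chosen only for illustration:

```yaml
# Sketch only: pick the options that match your own requirements.
syncPolicy:
  automated:
    prune: true
    selfHeal: true
  syncOptions:
    - CreateNamespace=true   # create the destination namespace if it is missing
    - PruneLast=true         # prune only after the rest of the sync has succeeded
```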

Declarative Configuration Patterns

Kustomize Pattern

yaml
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml

# overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:                       # replaces patchesStrategicMerge, deprecated in Kustomize v5
  - path: replica-patch.yaml
images:
  - name: myapp
    newTag: v1.2.3
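
The production overlay refers to `replica-patch.yaml` without showing it. A plausible version is sketched below, assuming the base Deployment is named `myapp` and that production should run three replicas; both are assumptions made for illustration.

```yaml
# overlays/prod/replica-patch.yaml (hypothetical contents)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp        # must match the Deployment name in base/deployment.yaml
spec:
  replicas: 3        # only the fields listed here override the base
```

Because this is a strategic merge patch, everything not mentioned in it (containers, labels, probes) is inherited unchanged from the base.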

Helm Pattern

yaml
# Application pointing to Helm chart
spec:
  source:
    repoURL: https://charts.example.com
    chart: my-app
    targetRevision: 1.2.3
    helm:
      releaseName: my-app
      valueFiles:
        - values.yaml
        - values-prod.yaml

Multi-Source Pattern

yaml
spec:
  sources:
    - repoURL: https://charts.bitnami.com/bitnami
      chart: nginx
      targetRevision: 15.0.0
      helm:
        valueFiles:
          - $values/nginx/values-prod.yaml
    - repoURL: https://github.com/org/values.git
      targetRevision: main
      ref: values

Progressive Delivery Integration

GitOps enables progressive delivery patterns:

Blue-Green Deployments

yaml
# Two applications, traffic shift via Ingress/Service
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-blue
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-green
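
The traffic shift itself can be as small as a selector change committed to Git. The Service below is a sketch under assumed conventions: the `app` and `slot` labels, the Service name, and the ports are hypothetical and would need to match the labels on the Pods deployed by `app-blue` and `app-green`.

```yaml
# Sketch only: label keys, names, and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    slot: blue          # change to "green" in Git to cut traffic over
  ports:
    - port: 80
      targetPort: 8080
```

Cutting back is the same one-line change in reverse, so the rollback path stays a `git revert`.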

Canary with Argo Rollouts

yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
        - setWeight: 50
        - pause: {duration: 10m}

Environment Promotion (Kargo)

Warehouse → Dev Stage → Staging Stage → Production Stage
    │           │              │               │
    └── Freight promotion through environments ───┘

Cloud Provider Integration

Azure Arc-enabled Kubernetes & AKS

Azure provides a managed ArgoCD experience through the Microsoft.ArgoCD cluster extension:
bash
# Simple installation (single node)
az k8s-extension create \
  --resource-group <rg> --cluster-name <cluster> \
  --cluster-type managedClusters \
  --name argocd \
  --extension-type Microsoft.ArgoCD \
  --release-train preview \
  --config deployWithHighAvailability=false

# Production with workload identity (recommended)
# Use Bicep template - see references/azure-arc-integration.md

**Key Benefits:**

| Feature | Description |
|---------|-------------|
| Managed Installation | Azure handles deployment and upgrades |
| Workload Identity | Azure AD authentication without secrets |
| Multi-Cluster | Consistent GitOps across hybrid environments |
| Azure Integration | Native ACR, Key Vault, Azure AD support |

**Prerequisites:**

- Azure Arc-connected cluster OR MSI-based AKS cluster
- `Microsoft.KubernetesConfiguration` provider registered
- `k8s-extension` CLI extension installed

See `references/azure-arc-integration.md` for complete setup guide.

---

Security Considerations

Secrets Management

Never store plaintext secrets in Git! Use:

| Approach | Tool |
|----------|------|
| External Secrets | External Secrets Operator |
| Sealed Secrets | Bitnami Sealed Secrets |
| SOPS | Mozilla SOPS encryption |
| Vault | HashiCorp Vault + CSI |
| Cloud KMS | AWS/Azure/GCP Key Management |
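
With the External Secrets approach, Git holds only a reference to the secret, never its value. The sketch below assumes the External Secrets Operator is installed and that a `ClusterSecretStore` named `cluster-secret-store` already exists; that name, the remote key, and the `apiVersion` (which depends on the operator release in use) are all assumptions.

```yaml
# Sketch only: store name, remote key, and names are hypothetical.
apiVersion: external-secrets.io/v1beta1   # newer operator releases may serve v1 instead
kind: ExternalSecret
metadata:
  name: myapp-db
  namespace: myapp
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: cluster-secret-store
    kind: ClusterSecretStore
  target:
    name: myapp-db              # Kubernetes Secret the operator creates
  data:
    - secretKey: password       # key inside the generated Secret
      remoteRef:
        key: prod/myapp/db      # path in the external secret backend
        property: password
```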

RBAC Best Practices

yaml
# Limit ArgoCD to specific namespaces
apiVersion: argoproj.io/v1alpha1
kind: AppProject
spec:
  destinations:
    - namespace: 'team-a-*'
      server: https://kubernetes.default.svc
  sourceRepos:
    - 'https://github.com/org/team-a-*'

Network Policies

  • The GitOps controller should be the only component with Git access
  • Restrict egress from application namespaces
  • Use network policies to isolate environments
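
Restricting egress from an application namespace could be sketched with a standard Kubernetes `NetworkPolicy` such as the one below. The namespace name `team-a-dev` is hypothetical, and the policy assumes cluster DNS runs in `kube-system`; it blocks all other outbound traffic, so real workloads would need further allow rules.

```yaml
# Sketch only: namespace name is hypothetical; DNS is assumed to live in kube-system.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-egress
  namespace: team-a-dev
spec:
  podSelector: {}              # applies to every Pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:                   # allow DNS lookups only
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```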

Observability and Debugging

Health Status Interpretation

| Status | Meaning | Action |
|--------|---------|--------|
| Healthy | All resources running | None |
| Progressing | Deployment in progress | Wait |
| Degraded | Health check failed | Investigate |
| Suspended | Manually paused | Resume when ready |
| Missing | Resource not found | Check manifests |

Common Issues Checklist

  1. Sync Failed: Check YAML syntax, RBAC permissions
  2. OutOfSync: Compare diff, check ignoreDifferences
  3. Degraded: Check Pod logs, resource limits
  4. Missing: Verify namespace, check pruning settings
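
For the OutOfSync case, `ignoreDifferences` on the Application tells ArgoCD to skip specific fields when diffing. A sketch, assuming replica counts are managed by an autoscaler and not by Git:

```yaml
# Sketch only: the ignored field is an example, not a recommendation for every app.
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas     # do not report drift when only the replica count differs
```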

Drift Detection

bash
# Check application diff
argocd app diff myapp

# Force refresh from Git
argocd app get myapp --refresh

Quick Decision Guide

When to Use GitOps

  • Kubernetes-native workloads
  • Multiple environments (dev/staging/prod)
  • Need audit trail for deployments
  • Team collaboration on infrastructure
  • Disaster recovery requirements

When GitOps May Not Fit

  • Rapidly changing development environments
  • Legacy systems without declarative configs
  • Real-time configuration changes required
  • Single developer, single environment

References

For detailed information, see:
  • `references/core-principles.md` - Deep dive into the 4 pillars
  • `references/patterns-and-practices.md` - Branching and repo patterns
  • `references/tooling-ecosystem.md` - ArgoCD vs Flux vs Kargo
  • `references/anti-patterns.md` - Common mistakes to avoid
  • `references/troubleshooting.md` - Debugging guide
  • `references/azure-arc-integration.md` - Azure Arc & AKS GitOps setup

Templates

Ready-to-use templates in `templates/`:
  • `application.yaml` - ArgoCD Application example
  • `applicationset.yaml` - Multi-cluster deployment
  • `kustomization.yaml` - Kustomize overlay structure

Scripts

Utility scripts in `scripts/`:
  • `gitops-health-check.sh` - Validate GitOps setup

External Resources
