k8s-cluster-api

Kubernetes Cluster API

Kubernetes Cluster API (CAPI) is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters.

Overview

Started by SIG Cluster Lifecycle, Cluster API uses Kubernetes-style APIs and patterns to automate cluster lifecycle management. The infrastructure (VMs, networks, load balancers, VPCs) and Kubernetes configuration are defined declaratively, enabling consistent and repeatable cluster deployments across environments.

Why Cluster API?

While kubeadm reduces installation complexity, it doesn't address day-to-day cluster management:
  • How do you provision infrastructure consistently across providers and locations?
  • How do you automate the cluster lifecycle (upgrades, deletion)?
  • How do you scale these processes to manage any number of clusters?
Cluster API addresses these gaps with declarative, Kubernetes-style APIs that automate cluster creation, configuration, and management.

Goals

  • Manage the lifecycle (create, scale, upgrade, destroy) of Kubernetes-conformant clusters via a declarative API
  • Work in different environments (on-premises and cloud)
  • Define common operations with swappable implementations
  • Reuse existing ecosystem components (cluster-autoscaler, node-problem-detector)
  • Provide a transition path so existing tools can adopt Cluster API incrementally

Non-Goals

  • Add APIs to Kubernetes core
  • Manage infrastructure unrelated to Kubernetes clusters
  • Force all lifecycle products to use these APIs
  • Manage clusters that were not provisioned by CAPI
  • Manage a single cluster spanning multiple providers
  • Configure machines after creation or upgrade

Quick Navigation

| Topic | Reference |
| --- | --- |
| Getting Started | getting-started.md |
| Concepts & Architecture | concepts.md |
| Certificates | certificates.md |
| Bootstrap (Kubeadm/MicroK8s) | bootstrap.md |
| Cluster Operations | cluster-operations.md |
| Experimental Features | experimental.md |
| clusterctl CLI | clusterctl.md |
| Developer Guide | developer.md |
| Troubleshooting | troubleshooting.md |
| API Reference & Providers | api-reference.md |

When to Use

  • Provisioning Kubernetes clusters across multiple infrastructure providers
  • Managing cluster lifecycle (create, scale, upgrade, destroy)
  • Automating cluster operations with declarative APIs
  • Implementing GitOps workflows for cluster management
  • Building custom infrastructure providers

Core Concepts

Architecture

┌─────────────────────────────────────────┐
│         Management Cluster              │
│  ┌─────────────┐  ┌─────────────────┐   │
│  │ CAPI Core   │  │ Infrastructure  │   │
│  │ Controllers │  │ Provider        │   │
│  └─────────────┘  └─────────────────┘   │
│  ┌─────────────┐  ┌─────────────────┐   │
│  │  Bootstrap  │  │  Control Plane  │   │
│  │  Provider   │  │  Provider       │   │
│  └─────────────┘  └─────────────────┘   │
└─────────────────────┬───────────────────┘
                      │ manages
          ┌───────────┴───────────┐
          ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│ Workload        │     │ Workload        │
│ Cluster 1       │     │ Cluster N       │
└─────────────────┘     └─────────────────┘

Key Components

| Component | Purpose |
| --- | --- |
| Management Cluster | Hosts the CAPI controllers and manages workload clusters |
| Workload Cluster | User cluster managed by CAPI |
| Infrastructure Provider | Provisions VMs, networks, load balancers |
| Bootstrap Provider | Generates cloud-init/ignition configs |
| Control Plane Provider | Manages the lifecycle of control plane nodes |

Core Resources

| Resource | Description |
| --- | --- |
| Cluster | Represents a Kubernetes cluster |
| Machine | Represents a single node/VM |
| MachineSet | Manages replicas of Machines |
| MachineDeployment | Declarative updates for MachineSets |
| MachineHealthCheck | Automatic remediation of unhealthy nodes |
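
As a rough illustration of how these resources reference one another, a MachineDeployment manifest might look like the sketch below. It assumes the v1beta1 API used elsewhere in this document, the kubeadm bootstrap provider, and the Docker infrastructure provider; the object names (`my-cluster-md-0`) are hypothetical and the referenced templates would have to exist separately.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-cluster-md-0            # hypothetical name
spec:
  clusterName: my-cluster          # the Cluster this deployment belongs to
  replicas: 3                      # desired number of worker Machines
  selector:
    matchLabels: {}                # left empty here; assumed to be defaulted by CAPI
  template:
    spec:
      clusterName: my-cluster
      version: v1.32.0             # Kubernetes version for the Machines
      bootstrap:
        configRef:                 # handled by the bootstrap provider
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: my-cluster-md-0    # hypothetical template name
      infrastructureRef:           # handled by the infrastructure provider
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: my-cluster-md-0      # hypothetical template name
```

The MachineDeployment owns MachineSets, which in turn own the individual Machine objects, much as a Deployment relates to ReplicaSets and Pods.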

Quick Start

```bash
# Install clusterctl
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.12.0/clusterctl-linux-amd64 -o clusterctl
chmod +x clusterctl
sudo mv clusterctl /usr/local/bin/

# Initialize management cluster
clusterctl init --infrastructure docker

# Create workload cluster
clusterctl generate cluster my-cluster --kubernetes-version v1.32.0 --control-plane-machine-count 1 --worker-machine-count 3 | kubectl apply -f -

# Get cluster kubeconfig
clusterctl get kubeconfig my-cluster > my-cluster.kubeconfig

# Delete cluster
kubectl delete cluster my-cluster
```
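
The install step above hard-codes the `linux-amd64` binary. On another platform, the asset name can be derived from `uname`, as in this minimal sketch. The helper name `clusterctl_asset` is hypothetical, and the sketch only covers amd64 and arm64 on Linux and macOS; check the release page for the assets actually published for your version.

```shell
#!/bin/sh
# clusterctl_asset OS ARCH
# Map `uname -s` / `uname -m` style values to a clusterctl release asset name.
clusterctl_asset() {
  os=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')   # Linux -> linux, Darwin -> darwin
  case "$2" in
    x86_64|amd64)  arch=amd64 ;;
    aarch64|arm64) arch=arm64 ;;
    *) echo "unsupported architecture: $2" >&2; return 1 ;;
  esac
  printf 'clusterctl-%s-%s\n' "$os" "$arch"
}

VERSION="v1.12.0"   # the version used in this document
if ASSET=$(clusterctl_asset "$(uname -s)" "$(uname -m)"); then
  echo "https://github.com/kubernetes-sigs/cluster-api/releases/download/${VERSION}/${ASSET}"
fi
```

The printed URL can be passed to `curl -L ... -o clusterctl` in place of the fixed one.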

Common Workflows

Cluster Lifecycle

```bash
# Create cluster from template
clusterctl generate cluster prod-cluster \
  --infrastructure aws \
  --kubernetes-version v1.32.0 \
  --control-plane-machine-count 3 \
  --worker-machine-count 5 \
  | kubectl apply -f -

# Scale workers
kubectl scale machinedeployment prod-cluster-md-0 --replicas=10

# Upgrade Kubernetes version
# (spec.topology applies only to clusters created from a ClusterClass)
kubectl patch cluster prod-cluster --type merge -p '{"spec":{"topology":{"version":"v1.33.0"}}}'

# Move cluster to new management cluster
clusterctl move --to-kubeconfig target-mgmt.kubeconfig
```
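
The upgrade patch above changes `spec.topology.version`, a field that is only present on clusters created from a ClusterClass. For orientation, a Cluster with a managed topology might look roughly like the sketch below. It assumes the v1beta1 API; the class names (`my-cluster-class`, `default-worker`) are hypothetical and depend on the ClusterClass in use.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: prod-cluster
spec:
  topology:
    class: my-cluster-class        # hypothetical ClusterClass name
    version: v1.33.0               # the field the upgrade patch modifies
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: default-worker    # hypothetical worker class defined in the ClusterClass
          name: md-0
          replicas: 5
```

For clusters that do not use a ClusterClass, this field is absent and the patch does not apply as written.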

Health Monitoring

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: my-cluster-mhc
spec:
  clusterName: my-cluster
  maxUnhealthy: 40%
  nodeStartupTimeout: 10m
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: my-cluster
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 5m
    - type: Ready
      status: Unknown
      timeout: 5m
```
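
The `maxUnhealthy: 40%` setting above limits remediation: if more than that share of the selected Machines is unhealthy, the health check stops remediating rather than replacing most of the cluster at once. The arithmetic can be sketched as follows. The function names are hypothetical, and the sketch assumes the percentage is rounded down to a whole number of Machines, which should be verified against the Cluster API version in use.

```shell
#!/bin/sh
# max_unhealthy_count TOTAL PERCENT
# Number of unhealthy Machines tolerated before remediation is paused.
# Assumption: the percentage is rounded down (integer division).
max_unhealthy_count() {
  echo $(( $1 * $2 / 100 ))
}

# remediation_allowed TOTAL PERCENT UNHEALTHY
# Succeeds (exit status 0) when the unhealthy count is within the limit.
remediation_allowed() {
  [ "$3" -le "$(max_unhealthy_count "$1" "$2")" ]
}

max_unhealthy_count 5 40    # 5 Machines at 40% -> prints 2
remediation_allowed 5 40 3 || echo "3 of 5 unhealthy: remediation paused"
```

Under that assumption, a five-Machine cluster tolerates two unhealthy Machines; a third would pause remediation until an operator intervenes.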

Critical Prohibitions

  • Do NOT modify the management cluster directly without a proper backup
  • Do NOT delete Machine objects directly (scale the MachineDeployment instead)
  • Do NOT mix provider versions without checking compatibility
  • Do NOT skip cluster upgrade steps (upgrade the control plane before the workers)
  • Do NOT ignore MachineHealthCheck alerts

Links
