Kubernetes Cluster API

Kubernetes Cluster API (CAPI) is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters.

Overview

Started by SIG Cluster Lifecycle, Cluster API uses Kubernetes-style APIs and patterns to automate cluster lifecycle management. The infrastructure (VMs, networks, load balancers, VPCs) and Kubernetes configuration are defined declaratively, enabling consistent and repeatable cluster deployments across environments.

Why Cluster API?

While kubeadm reduces installation complexity, it doesn't address day-to-day cluster management:

How to consistently provision infrastructure across providers and locations?
How to automate cluster lifecycle (upgrades, deletion)?
How to scale processes to manage any number of clusters?

Cluster API addresses these gaps with declarative, Kubernetes-style APIs that automate cluster creation, configuration, and management.

Goals

Manage the lifecycle (create, scale, upgrade, destroy) of Kubernetes-conformant clusters via a declarative API
Work in different environments (on-premises and cloud)
Define common operations with swappable implementations
Reuse existing ecosystem components (cluster-autoscaler, node-problem-detector)
Provide a transition path so that existing tools can adopt the API incrementally

Non-Goals

Add APIs to Kubernetes core
Manage infrastructure unrelated to Kubernetes clusters
Force all lifecycle products to use these APIs
Manage clusters that were not provisioned by CAPI
Manage a single cluster spanning multiple infrastructure providers
Configure machines after creation or upgrade

Quick Navigation

Topic	Reference
Getting Started	getting-started.md
Concepts & Architecture	concepts.md
Certificates	certificates.md
Bootstrap (Kubeadm/MicroK8s)	bootstrap.md
Cluster Operations	cluster-operations.md
Experimental Features	experimental.md
clusterctl CLI	clusterctl.md
Developer Guide	developer.md
Troubleshooting	troubleshooting.md
API Reference & Providers	api-reference.md

When to Use

Provisioning Kubernetes clusters across multiple infrastructure providers
Managing cluster lifecycle (create, scale, upgrade, destroy)
Automating cluster operations with declarative APIs
Implementing GitOps workflows for cluster management
Building custom infrastructure providers
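For the GitOps case in the list above, one possible shape is to write the generated manifest into a Git working tree instead of piping it to kubectl, and let a GitOps tool apply it from there. The sketch below assumes that layout; `render_to_repo` and the `clusters/` directory are hypothetical names, and the Kubernetes version is simply the one used elsewhere in this document.

```bash
# render_to_repo CLUSTER_NAME REPO_DIR: hypothetical helper that writes the
# generated manifest into a Git working tree instead of applying it directly.
render_to_repo() {
  mkdir -p "$2/clusters" || return 1
  # clusterctl generate cluster prints the manifest on stdout.
  clusterctl generate cluster "$1" --kubernetes-version v1.32.0 \
    > "$2/clusters/$1.yaml"
}
```

After `render_to_repo my-cluster ./infra-repo`, the file can be reviewed and committed like any other change; applying it is left to whatever tool watches the repository.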

Core Concepts

Architecture

┌─────────────────────────────────────────┐
│         Management Cluster              │
│  ┌─────────────┐  ┌─────────────────┐   │
│  │ CAPI Core   │  │ Infrastructure  │   │
│  │ Controllers │  │ Provider        │   │
│  └─────────────┘  └─────────────────┘   │
│  ┌─────────────┐  ┌─────────────────┐   │
│  │  Bootstrap  │  │  Control Plane  │   │
│  │  Provider   │  │  Provider       │   │
│  └─────────────┘  └─────────────────┘   │
└─────────────────────┬───────────────────┘
                      │ manages
          ┌───────────┴───────────┐
          ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│ Workload        │     │ Workload        │
│ Cluster 1       │     │ Cluster N       │
└─────────────────┘     └─────────────────┘

Key Components

Component	Purpose
Management Cluster	Hosts CAPI controllers and manages workload clusters
Workload Cluster	User clusters managed by CAPI
Infrastructure Provider	Provisions VMs, networks, load balancers
Bootstrap Provider	Generates the bootstrap data (cloud-init/Ignition configs) that turns a machine into a Kubernetes node
Control Plane Provider	Manages the lifecycle of control plane nodes
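To make the provider roles in the table concrete, the sketch below prints a minimal Cluster manifest in which `controlPlaneRef` points at an object handled by the control plane provider and `infrastructureRef` at one handled by the infrastructure provider. The `render_cluster` helper, the object names, and the pod CIDR are illustrative assumptions; the `KubeadmControlPlane` and `DockerCluster` kinds assume the kubeadm control plane provider and the Docker infrastructure provider, and the v1beta1 API version follows the MachineHealthCheck example in this document.

```bash
# render_cluster NAME: print a minimal Cluster manifest (hypothetical helper).
render_cluster() {
  cat <<EOF
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: $1
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  controlPlaneRef:            # handled by the control plane provider
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: $1-control-plane
  infrastructureRef:          # handled by the infrastructure provider
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DockerCluster
    name: $1
EOF
}
```

`render_cluster my-cluster` only prints the Cluster object. The referenced KubeadmControlPlane and DockerCluster objects must exist as well, which is why `clusterctl generate cluster` is the usual way to produce the full set.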

Core Resources

Resource	Description
Cluster	Represents a Kubernetes cluster
Machine	Represents a single node/VM
MachineSet	Manages replicas of Machines
MachineDeployment	Declarative updates for MachineSets
MachineHealthCheck	Automatic remediation of unhealthy nodes
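The resources in the table form a hierarchy (MachineDeployment, then MachineSet, then Machine) that can be inspected on the management cluster. The sketch below is a hypothetical `list_cluster_machines` helper; it assumes the objects carry the `cluster.x-k8s.io/cluster-name` label, the same label the MachineHealthCheck selector in this document relies on, and that they live in the `default` namespace unless one is given.

```bash
# list_cluster_machines CLUSTER [NAMESPACE]: show the MachineDeployment,
# MachineSet and Machine objects that belong to one workload cluster.
# Hypothetical helper; run it against the management cluster.
list_cluster_machines() {
  kubectl get machinedeployments,machinesets,machines \
    --namespace "${2:-default}" \
    --selector "cluster.x-k8s.io/cluster-name=$1" \
    --output wide
}
```

For example, `list_cluster_machines my-cluster` lists every object of those three kinds labelled for `my-cluster`.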

Quick Start

bash

# Install clusterctl
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.12.0/clusterctl-linux-amd64 -o clusterctl
chmod +x clusterctl
sudo mv clusterctl /usr/local/bin/

# Initialize management cluster
clusterctl init --infrastructure docker

# Create workload cluster
clusterctl generate cluster my-cluster --kubernetes-version v1.32.0 --control-plane-machine-count 1 --worker-machine-count 3 | kubectl apply -f -

# Get cluster kubeconfig
clusterctl get kubeconfig my-cluster > my-cluster.kubeconfig

# Delete cluster
kubectl delete cluster my-cluster
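Between fetching the kubeconfig and deleting the cluster, it is usually worth confirming that provisioning actually finished. The sketch below wraps two existing commands in a hypothetical `check_workload_cluster` helper; the file name matches the kubeconfig written in the steps above.

```bash
# check_workload_cluster NAME KUBECONFIG: hypothetical helper that prints the
# provisioning status and then lists the nodes of the workload cluster.
check_workload_cluster() {
  # Status tree of the Cluster and its owned objects, read from the management cluster.
  clusterctl describe cluster "$1" || return 1
  # Nodes as seen by the workload cluster itself; they stay NotReady until a CNI is installed.
  kubectl --kubeconfig "$2" get nodes
}
```

A typical call would be `check_workload_cluster my-cluster my-cluster.kubeconfig`.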

Common Workflows

Cluster Lifecycle

bash

# Create cluster from template
clusterctl generate cluster prod-cluster \
  --infrastructure aws \
  --kubernetes-version v1.32.0 \
  --control-plane-machine-count 3 \
  --worker-machine-count 5 \
  | kubectl apply -f -

# Scale workers
kubectl scale machinedeployment prod-cluster-md-0 --replicas=10

# Upgrade Kubernetes version (only for clusters with a managed topology, where spec.topology is set)
kubectl patch cluster prod-cluster --type merge -p '{"spec":{"topology":{"version":"v1.33.0"}}}'

# Move cluster to new management cluster
clusterctl move --to-kubeconfig target-mgmt.kubeconfig
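The upgrade step moves from v1.32.0 to v1.33.0, one minor version at a time; Kubernetes does not support skipping minor versions during an upgrade. A small pre-flight check along the following lines can catch a bad target before anything is patched. `minor_of` and `can_upgrade` are hypothetical helpers, and the sketch assumes versions of the form `vMAJOR.MINOR.PATCH` with an unchanged major version; it does not compare patch levels.

```bash
# minor_of VERSION: print the minor component of a version such as v1.32.0.
minor_of() {
  v="${1#v}"        # strip the leading "v"
  v="${v#*.}"       # drop the major component
  printf '%s\n' "${v%%.*}"
}

# can_upgrade CURRENT TARGET: succeed only if TARGET stays on the same minor
# version or moves forward by exactly one minor version.
can_upgrade() {
  cur="$(minor_of "$1")"
  tgt="$(minor_of "$2")"
  [ "$tgt" -ge "$cur" ] && [ "$tgt" -le $((cur + 1)) ]
}
```

Used as a guard, `can_upgrade v1.32.0 v1.33.0 && kubectl patch ...` runs the patch only when the target version is an allowed step.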

Health Monitoring

yaml

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: my-cluster-mhc
spec:
  clusterName: my-cluster
  maxUnhealthy: 40%
  nodeStartupTimeout: 10m
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: my-cluster
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 5m
    - type: Ready
      status: Unknown
      timeout: 5m
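The `maxUnhealthy: 40%` field acts as a circuit breaker: remediation only proceeds while the number of unhealthy Machines stays within that share of the Machines being checked. The sketch below models that rule under the assumption that the percentage is converted to a whole number by rounding down; `remediation_allowed` is a hypothetical helper for reasoning about the threshold, not part of Cluster API.

```bash
# remediation_allowed TOTAL UNHEALTHY PERCENT: succeed if the number of
# unhealthy Machines is within PERCENT of TOTAL, rounding the limit down.
remediation_allowed() {
  limit=$(( $1 * $3 / 100 ))   # integer division rounds down
  [ "$2" -le "$limit" ]
}
```

With 25 Machines and 40%, the limit is 10, so `remediation_allowed 25 10 40` succeeds and `remediation_allowed 25 11 40` fails. With 6 Machines the limit is 2.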

Critical Prohibitions

Do NOT modify the management cluster directly without a proper backup
Do NOT delete Machine objects directly (scale the MachineDeployment instead)
Do NOT mix provider versions without checking compatibility
Do NOT skip cluster upgrade steps (control plane before workers)
Do NOT ignore MachineHealthCheck alerts
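The upgrade-order rule (control plane before workers) can be sketched as two separate steps for clusters that do not use a managed topology. The helper names below are hypothetical, and the sketch assumes the kubeadm control plane provider (`KubeadmControlPlane`). Many infrastructure providers also need a new machine template with a matching image, which is omitted here.

```bash
# Hypothetical helpers for clusters that do not use a managed topology.
# upgrade_control_plane KCP_NAME VERSION: step 1, bump the control plane.
upgrade_control_plane() {
  kubectl patch kubeadmcontrolplane "$1" --type merge \
    -p "{\"spec\":{\"version\":\"$2\"}}"
}

# upgrade_workers MD_NAME VERSION: step 2, run only after the control plane
# rollout has finished.
upgrade_workers() {
  kubectl patch machinedeployment "$1" --type merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"version\":\"$2\"}}}}"
}
```

Keeping the steps as separate functions leaves the wait in between to the operator, for example by watching `kubectl get kubeadmcontrolplane` until the rollout completes.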
