platform
Platform Engineering
Build reliable, observable, cost-efficient infrastructure.
Quick Reference
The 2026 Platform Stack
| Layer | Tool | Purpose |
|---|---|---|
| IaC | OpenTofu / Pulumi | Infrastructure definition |
| GitOps | Argo CD / Flux | Continuous deployment |
| Control Plane | Crossplane | Kubernetes-native infra |
| Observability | OpenTelemetry | Unified telemetry |
| Service Mesh | Istio Ambient / Cilium | mTLS, traffic management |
| Cost | FinOps Framework | Cloud optimization |
Infrastructure as Code
OpenTofu (Terraform-compatible, open source):

```hcl
# Assumes a data "aws_ami" "ubuntu" source is declared elsewhere in the configuration.
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  tags = {
    Name        = "web-server"
    Environment = "production"
  }
}
```

Pulumi (general-purpose programming languages):

```typescript
import * as aws from "@pulumi/aws";

// The AMI ID is region-specific; replace it with one that is valid in your region.
const server = new aws.ec2.Instance("web", {
  ami: "ami-0c55b159cbfafe1f0",
  instanceType: "t3.micro",
  tags: { Name: "web-server" },
});

// Stack output: the public IP address of the instance.
export const publicIp = server.publicIp;
```
GitOps with Argo CD

```yaml
# Application manifest
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/repo
    targetRevision: HEAD
    path: k8s/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true     # delete resources that were removed from Git
      selfHeal: true  # revert manual changes made in the cluster
```
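To make the two `syncPolicy` flags concrete, here is a conceptual sketch of one automated sync pass. It is a simplified model under stated assumptions, not Argo CD's implementation: the `plan_sync` function, the `git_changed` flag, and the dictionary-based state are all hypothetical.

```python
def plan_sync(desired, live, *, prune, self_heal, git_changed):
    """Plan the actions for one automated sync pass (illustrative model only).

    desired: resource name -> spec as rendered from Git
    live:    resource name -> spec as observed in the cluster
    """
    if desired == live:
        return []  # already in sync, nothing to do
    if not git_changed and not self_heal:
        return []  # drift originated in the cluster and selfHeal is off

    actions = []
    for name in sorted(desired):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != desired[name]:
            actions.append(("update", name))
    if prune:
        # Only with prune enabled are resources missing from Git deleted.
        for name in sorted(live.keys() - desired.keys()):
            actions.append(("delete", name))
    return actions


# Hypothetical state: one drifted Deployment, one missing Service, one stale ConfigMap.
desired = {"deploy/api": {"replicas": 3}, "svc/api": {"port": 8080}}
live = {"deploy/api": {"replicas": 5}, "cm/old": {"k": "v"}}

full_plan = plan_sync(desired, live, prune=True, self_heal=True, git_changed=False)
no_heal_plan = plan_sync(desired, live, prune=True, self_heal=False, git_changed=False)
```

In this model, `full_plan` updates the Deployment, creates the Service, and deletes the stale ConfigMap, while `no_heal_plan` is empty because nothing changed in Git and `self_heal` is off.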
Kubernetes Patterns

Gateway API (replacing Ingress):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
spec:
  parentRefs:
    - name: main-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: api-service
          port: 8080
```

Istio Ambient Mode (sidecar-less service mesh):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    istio.io/dataplane-mode: ambient # Enable ambient mesh
```
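The `PathPrefix` match used in the HTTPRoute earlier in this section compares whole path segments rather than raw string prefixes, so `/api` matches `/api/users` but not `/apiv2`. The sketch below illustrates that rule; `path_prefix_matches` is a hypothetical helper, and it simplifies by ignoring empty segments (for example from a doubled slash).

```python
def path_prefix_matches(prefix: str, path: str) -> bool:
    """Return True if `path` falls under `prefix`, compared segment by segment.

    Matching is case-sensitive. A trailing slash on the prefix is ignored,
    and the prefix "/" matches every path.
    """
    prefix_parts = [part for part in prefix.split("/") if part]
    path_parts = [part for part in path.split("/") if part]
    return path_parts[: len(prefix_parts)] == prefix_parts


# Requests checked against the route's /api prefix.
for candidate in ["/api", "/api/", "/api/users/42", "/apiv2", "/API/users"]:
    print(candidate, path_prefix_matches("/api", candidate))
```

A plain `str.startswith` check would wrongly accept `/apiv2`, which is why the comparison is done on split segments.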
OpenTelemetry Setup

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Initialize
provider = TracerProvider()
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://collector:4317"))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

# Use
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("my-operation"):
    do_work()  # placeholder for the application code being traced
```
CI/CD Pipeline (GitHub Actions)

```yaml
name: Deploy
on:
  push:
    branches: [main]
permissions:
  contents: write  # push the manifest commit
  packages: write  # push the image to ghcr.io
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to ghcr.io
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
      - name: Update manifests
        run: |
          cd k8s/overlays/production
          kustomize edit set image app=ghcr.io/${{ github.repository }}:${{ github.sha }}
          # A commit identity is required before committing on a fresh runner.
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git commit -am "Deploy ${{ github.sha }}"
          git push
```
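One detail of the image tag used in the workflow above is worth spelling out: container registries require lowercase repository names, while `github.repository` keeps the original casing of the owner and repository. The sketch below shows how the reference could be normalized; `image_ref` is a hypothetical helper and is not part of the workflow.

```python
def image_ref(repository: str, sha: str, registry: str = "ghcr.io") -> str:
    """Build an image reference of the form <registry>/<owner>/<repo>:<sha>.

    The repository part is lowercased because registry repository names
    must be lowercase; the tag (the commit SHA) is left unchanged.
    """
    return f"{registry}/{repository.lower()}:{sha}"


# Hypothetical values standing in for github.repository and github.sha.
ref = image_ref("Org/Repo", "3f2a9c1")
print(ref)
```

The same normalized reference would then be used for both the build step and the `kustomize edit set image` call so that the pushed image and the deployed image stay identical.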
FinOps Framework
FinOps框架
Phase 1: INFORM (visibility)
- Tag everything: `team`, `environment`, `cost-center`
- Use cloud cost explorers
- Target: 95%+ cost allocation accuracy
Phase 2: OPTIMIZE (action)
- Rightsize instances (most are overprovisioned)
- Use spot/preemptible for stateless workloads
- Reserved instances for baseline capacity
- Target: 20-30% cost reduction
Phase 3: OPERATE (governance)
- Budget alerts at 80% threshold
- Cost metrics in CI/CD gates
- Regular FinOps reviews
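The targets in the three phases above can be turned into simple checks. The following is a minimal sketch, assuming that "cost allocation accuracy" means the share of spend carried by fully tagged resources; the function names, the resource records, and that definition are illustrative assumptions, not part of the FinOps Framework itself.

```python
REQUIRED_TAGS = ("team", "environment", "cost-center")


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or empty."""
    return [key for key in required if not tags.get(key)]


def allocation_accuracy(resources):
    """Share of total cost that belongs to fully tagged resources (0.0 to 1.0)."""
    total = sum(r["cost"] for r in resources)
    if total == 0:
        return 1.0  # nothing to allocate
    allocated = sum(r["cost"] for r in resources if not missing_tags(r["tags"]))
    return allocated / total


def budget_alert(spend, budget, threshold=0.80):
    """True once spend reaches the alert threshold (80% of budget by default)."""
    return spend >= threshold * budget


# Hypothetical monthly cost records.
resources = [
    {"id": "i-web", "cost": 700.0,
     "tags": {"team": "web", "environment": "production", "cost-center": "cc-101"}},
    {"id": "i-batch", "cost": 250.0,
     "tags": {"team": "data", "environment": "production", "cost-center": "cc-202"}},
    {"id": "vol-orphan", "cost": 50.0, "tags": {"team": "data"}},
]

accuracy = allocation_accuracy(resources)
untagged = {r["id"]: missing_tags(r["tags"]) for r in resources if missing_tags(r["tags"])}
alert = budget_alert(spend=sum(r["cost"] for r in resources), budget=1200.0)
```

A check like `allocation_accuracy(resources) >= 0.95` could serve as one of the cost metrics in a CI/CD gate, with `untagged` listing what to fix first.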
Security Baseline
```yaml
# Tetragon policy (eBPF runtime enforcement)
apiVersion: cilium.io/v1alpha1
kind: TracingPolicyNamespaced  # namespaced variant: applies only to pods in metadata.namespace
metadata:
  name: block-shell
  namespace: production
spec:
  kprobes:
    - call: "sys_execve"
      syscall: true
      selectors:
        # matchBinaries filters on the binary of the process making the call,
        # so this kills a shell process as soon as it tries to execute a command.
        - matchBinaries:
            - operator: "In"
              values: ["/bin/sh", "/bin/bash"]
          matchActions:
            - action: Sigkill
```
Agents
- platform-engineer - GitOps, IaC, Kubernetes, observability
- data-engineer - Pipelines, ETL, data infrastructure
- finops-engineer - Cloud cost optimization, FinOps framework
Deep Dives
- references/gitops-patterns.md
- references/kubernetes-gateway.md
- references/opentelemetry.md
- references/finops-framework.md
Examples
- examples/argo-cd-setup/
- examples/pulumi-aws/
- examples/otel-stack/