Kubernetes Operations

Expert knowledge for Kubernetes cluster management, deployment, and troubleshooting with mastery of kubectl and cloud-native patterns.

Core Expertise

Kubernetes Operations

- **Workload Management**: Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
- **Networking**: Services, Ingress, NetworkPolicies, and DNS configuration
- **Configuration & Storage**: ConfigMaps, Secrets, PersistentVolumes, and PersistentVolumeClaims
- **Troubleshooting**: Debugging pods, analyzing logs, and inspecting cluster events

Cluster Operations Process

1. **Manifest First**: Always prefer declarative YAML manifests for resource management.
2. **Validate & Dry-Run**: Use `kubectl apply --dry-run=client` to validate changes.
3. **Inspect & Verify**: After applying changes, verify with `kubectl get`, `kubectl describe`, and `kubectl logs`.
4. **Monitor Health**: Continuously check the status of nodes, pods, and services.
5. **Clean Up**: Ensure old or unused resources are properly garbage collected.
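The validate, apply, and verify steps above can be chained in a small wrapper. This is a minimal sketch, not a prescribed workflow: the function name `safe_apply` and its argument order are hypothetical, and it assumes `kubectl` is on the `PATH` with access to the named context.

```bash
# Sketch: dry-run, apply, then inspect a manifest against an explicit context.
# `safe_apply` is a hypothetical helper name, not part of kubectl.
safe_apply() {
  local ctx="$1" manifest="$2"
  if [ -z "$ctx" ] || [ -z "$manifest" ]; then
    echo "usage: safe_apply <context> <manifest.yaml>" >&2
    return 2
  fi

  # 1. Client-side dry run: validates the manifest without changing the cluster.
  kubectl --context="$ctx" apply --dry-run=client -f "$manifest" || return 1

  # 2. Apply for real only if the dry run succeeded.
  kubectl --context="$ctx" apply -f "$manifest" || return 1

  # 3. Show the objects the manifest describes as they now exist in the cluster.
  kubectl --context="$ctx" get -f "$manifest" -o wide
}
```

For example, `safe_apply staging-cluster deployment.yaml` stops at the first failing step, so a manifest that fails validation is never applied.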

Essential Commands

```bash
# Resource management
kubectl apply -f manifest.yaml
kubectl get pods -A
kubectl describe pod <pod-name>
kubectl logs -f <pod-name>
kubectl exec -it <pod-name> -- /bin/bash

# Debugging
kubectl get events --sort-by='.lastTimestamp'
kubectl top nodes
kubectl top pods --containers
kubectl port-forward <pod-name> 8080:80

# Deployment management
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>

# Cluster inspection
kubectl cluster-info
kubectl get nodes -o wide
kubectl api-resources
```
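The deployment management commands can be combined so that a stalled rollout is reverted automatically. The sketch below is one possible arrangement under stated assumptions: `rollout_or_undo` is a hypothetical name, and the 120-second default timeout is an illustrative value, not a recommendation.

```bash
# Sketch: wait for a rollout and roll back if it does not finish in time.
# `rollout_or_undo` and the 120s default are illustrative assumptions.
rollout_or_undo() {
  local ctx="$1" deployment="$2" timeout="${3:-120s}"

  # `rollout status` exits non-zero if the rollout fails or the timeout expires.
  if kubectl --context="$ctx" rollout status "deployment/$deployment" --timeout="$timeout"; then
    echo "rollout of $deployment completed"
  else
    echo "rollout of $deployment failed; reverting to the previous revision" >&2
    kubectl --context="$ctx" rollout undo "deployment/$deployment"
    return 1
  fi
}
```

The function returns 1 after a rollback so that a calling script or CI job still registers the deployment as failed.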

Key Debugging Patterns

**Pod Debugging**
```bash
# Pod inspection
kubectl describe pod <pod-name>
kubectl get pod <pod-name> -o yaml
kubectl logs <pod-name> --previous

# Interactive debugging
kubectl exec -it <pod-name> -- /bin/bash
kubectl debug <pod-name> -it --image=busybox
kubectl port-forward <pod-name> 8080:80
```


**Networking Troubleshooting**
```bash
# Service debugging
kubectl get svc -o wide
kubectl get endpoints
kubectl describe svc <service>

# Network connectivity
kubectl run test-pod --image=busybox -it --rm -- sh
# Inside the pod: use nslookup, wget, and nc
```
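When only a single DNS lookup is needed, the interactive shell can be replaced with a one-shot pod. This is a sketch under assumptions: the function name `check_dns`, the pod name `dns-check`, and the default lookup target `kubernetes.default` are all illustrative choices.

```bash
# Sketch: run a throwaway busybox pod that resolves one name and then exits.
# `check_dns`, the pod name, and the default target are illustrative assumptions.
check_dns() {
  local ctx="$1" namespace="$2" name="${3:-kubernetes.default}"

  # --rm deletes the pod afterwards; --restart=Never creates a bare pod
  # rather than a restarting workload.
  kubectl --context="$ctx" -n "$namespace" run dns-check \
    --image=busybox --rm -i --restart=Never -- nslookup "$name"
}
```

Because the pod name is fixed, two concurrent runs in the same namespace would collide; a unique suffix would avoid that.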


**Common Issues**
```bash
# CrashLoopBackOff debugging
kubectl logs <pod> --previous
kubectl describe pod <pod>
kubectl get events --field-selector involvedObject.name=<pod>

# Resource constraints
kubectl top pod <pod>
kubectl describe pod <pod> | grep -A 5 Limits

# Resource state (kubectl has no `state` subcommand; read the live object instead)
kubectl get <resource> -n <ns>
kubectl get <resource> <name> -n <ns> -o yaml
```
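The CrashLoopBackOff commands are usually run together, so they can be grouped into one triage helper. The following is a minimal sketch: `triage_pod` is a hypothetical name, and the 50-line log limit is an arbitrary illustrative value.

```bash
# Sketch: collect the usual first-pass evidence for a crashing pod.
# `triage_pod` and the --tail=50 limit are illustrative assumptions.
triage_pod() {
  local ctx="$1" namespace="$2" pod="$3"

  echo "== logs from the previous container instance =="
  kubectl --context="$ctx" -n "$namespace" logs "$pod" --previous --tail=50

  echo "== pod description (state, restart count, limits) =="
  kubectl --context="$ctx" -n "$namespace" describe pod "$pod"

  echo "== events for this pod =="
  kubectl --context="$ctx" -n "$namespace" get events \
    --field-selector "involvedObject.name=$pod" --sort-by='.lastTimestamp'
}
```

`--previous` matters here because the current container in a crash loop has often not run long enough to log anything useful.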

Best Practices

**Context Safety (CRITICAL)**
- Always specify `--context` explicitly in every kubectl command.
- Never rely on the current context; it may have been changed by another process.
- Use the `kubectl --context=<context-name> get pods` format for all operations.
- This prevents accidental operations on the wrong cluster (e.g., running a command intended for staging against production).

```bash
# CORRECT: Explicit context
kubectl --context=gke_myproject_us-central1_prod get pods
kubectl --context=staging-cluster apply -f deployment.yaml

# WRONG: Relying on current context
kubectl get pods  # Which cluster is this targeting?
```
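One way to make the explicit-context rule hard to skip is a thin wrapper that refuses to run without a context name. This is a sketch only: `kctx` is a hypothetical helper, not a kubectl feature, and it checks only that the first argument is present and does not look like a flag.

```bash
# Sketch: refuse to call kubectl unless a context is named first.
# `kctx` is a hypothetical helper name, not part of kubectl.
kctx() {
  local ctx="$1"
  # Reject a missing context, or a first argument that starts with "-"
  # (most likely a flag passed by mistake).
  if [ -z "$ctx" ] || [ "${ctx#-}" != "$ctx" ]; then
    echo "kctx: first argument must be a context name" >&2
    return 2
  fi
  shift
  kubectl --context="$ctx" "$@"
}
```

With this in place, `kctx staging-cluster get pods` expands to `kubectl --context=staging-cluster get pods`, while `kctx get pods` would be passed through with `get` as the context and fail inside kubectl, so the wrapper narrows mistakes rather than eliminating them.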


**Resource Definitions**
- Use declarative YAML manifests
- Implement proper labels and selectors
- Define resource requests and limits
- Configure health checks (liveness/readiness probes)
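The points above can be seen together in one manifest. The sketch below writes an example Deployment to a file so it can be validated with a dry run; every name, image, port, probe path, and resource value in it is a placeholder assumption, and `write_example_deployment` is a hypothetical helper.

```bash
# Sketch: write an example Deployment with labels, requests/limits, and probes.
# All names, the image, ports, paths, and resource values are placeholders.
write_example_deployment() {
  local out="${1:-deployment.yaml}"
  cat > "$out" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app        # must match the pod template labels below
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: nginx:1.27
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          readinessProbe:      # gates traffic until the container responds
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
          livenessProbe:       # restarts the container if it stops responding
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 15
EOF
}
```

After `write_example_deployment deployment.yaml`, the file can be checked with `kubectl --context=<context-name> apply --dry-run=client -f deployment.yaml` before it is applied.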

**Security**
- Use NetworkPolicies to restrict traffic
- Implement RBAC for access control
- Store sensitive data in Secrets
- Run containers as non-root users
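As an illustration of restricting traffic with NetworkPolicies, a common starting point is a default-deny ingress policy per namespace. This is a sketch under assumptions: `write_default_deny` and the policy name are hypothetical, and the policy only has an effect on clusters whose network plugin enforces NetworkPolicy.

```bash
# Sketch: write a default-deny ingress NetworkPolicy for one namespace.
# `write_default_deny` and the policy name are illustrative assumptions.
write_default_deny() {
  local namespace="$1" out="${2:-default-deny.yaml}"
  cat > "$out" <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ${namespace}
spec:
  podSelector: {}      # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress          # no ingress rules listed, so all inbound traffic is denied
EOF
}
```

Traffic that should be allowed is then opened with additional, narrower policies that select specific pods.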

**Monitoring**
- Configure proper logging and metrics
- Set up alerts for critical conditions
- Use health checks and readiness probes
- Monitor resource usage and quotas
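For a quick manual health check between alerts, pods outside the `Running` and `Succeeded` phases can be listed with a field selector. This is a minimal sketch with a hypothetical function name; note that a pod in CrashLoopBackOff still reports the `Running` phase, so this query does not replace proper alerting.

```bash
# Sketch: list pods in all namespaces that are neither Running nor Succeeded.
# `unhealthy_pods` is a hypothetical helper name.
unhealthy_pods() {
  local ctx="$1"
  kubectl --context="$ctx" get pods -A \
    --field-selector='status.phase!=Running,status.phase!=Succeeded'
}
```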

Agentic Optimizations

| Context | Command |
|---|---|
| Pod status (structured) | `kubectl get pods -n <ns> -o json \| jq '.items[] \| {name:.metadata.name, status:.status.phase}'` |
| Quick overview | `kubectl get pods -n <ns> -o wide` |
| Events (compact) | `kubectl get events -n <ns> --sort-by='.lastTimestamp' -o json` |
| Resource details | `kubectl get <resource> -o json` |
| Logs (bounded) | `kubectl logs <pod> -n <ns> --tail=50` |
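Where `jq` is not installed, kubectl's own `custom-columns` output gives a similar name-and-phase listing. This is a sketch with a hypothetical function name; it prints one whitespace-separated line per pod rather than JSON.

```bash
# Sketch: print "<pod-name> <phase>" per pod without depending on jq.
# `pod_status` is a hypothetical helper name.
pod_status() {
  local ctx="$1" namespace="$2"
  kubectl --context="$ctx" -n "$namespace" get pods --no-headers \
    -o custom-columns=NAME:.metadata.name,STATUS:.status.phase
}
```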

For detailed debugging commands, troubleshooting patterns, Helm workflows, and advanced K8s operations, see REFERENCE.md.
