kubernetes-operations
Kubernetes Operations
Expert knowledge for Kubernetes cluster management, deployment, and troubleshooting with mastery of kubectl and cloud-native patterns.
Core Expertise
Kubernetes Operations
- Workload Management: Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
- Networking: Services, Ingress, NetworkPolicies, and DNS configuration
- Configuration & Storage: ConfigMaps, Secrets, PersistentVolumes, and PersistentVolumeClaims
- Troubleshooting: Debugging pods, analyzing logs, and inspecting cluster events
Cluster Operations Process
- Manifest First: Always prefer declarative YAML manifests for resource management
- Validate & Dry-Run: Use `kubectl apply --dry-run=client` to validate changes
- Inspect & Verify: After applying changes, verify with `kubectl get`, `kubectl describe`, and `kubectl logs`
- Monitor Health: Continuously check status of nodes, pods, and services
- Clean Up: Ensure old or unused resources are properly garbage collected
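
The validate, apply, and verify steps above can be chained so that a failed dry run stops the process before anything reaches the cluster. The following is a minimal sketch; the function name `apply_checked` is hypothetical and not part of kubectl.

```bash
# Hypothetical helper: validate a manifest client-side, apply it, then list
# the resources it defines. The function name is illustrative only.
apply_checked() {
  local manifest="$1"

  # Validate & dry-run: stop here if kubectl rejects the manifest.
  kubectl apply --dry-run=client -f "$manifest" || return 1

  # Apply for real only after the dry run succeeded.
  kubectl apply -f "$manifest" || return 1

  # Inspect & verify: show the resources defined in the manifest.
  kubectl get -f "$manifest"
}
```

Calling `apply_checked manifest.yaml` runs the three steps in order. Note that a client-side dry run only checks the manifest locally, so it will not catch every error the API server might report.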
Essential Commands
```bash
# Resource management
kubectl apply -f manifest.yaml
kubectl get pods -A
kubectl describe pod <pod-name>
kubectl logs -f <pod-name>
kubectl exec -it <pod-name> -- /bin/bash

# Debugging
kubectl get events --sort-by='.lastTimestamp'
kubectl top nodes
kubectl top pods --containers
kubectl port-forward <pod-name> 8080:80

# Deployment management
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>

# Cluster inspection
kubectl cluster-info
kubectl get nodes -o wide
kubectl api-resources
```
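
The deployment management commands combine naturally: wait for a rollout to finish and undo it if it does not. The sketch below assumes a hypothetical function name `rollout_or_undo` and an arbitrary default timeout of 120 seconds; adjust both to your environment.

```bash
# Hypothetical helper: wait for a rollout and undo it if it does not finish.
# The function name and the 120s default timeout are illustrative assumptions.
rollout_or_undo() {
  local name="$1" timeout="${2:-120s}"

  if kubectl rollout status "deployment/$name" --timeout="$timeout"; then
    echo "rollout of $name succeeded"
  else
    echo "rollout of $name failed, rolling back" >&2
    kubectl rollout undo "deployment/$name"
    return 1
  fi
}
```

For example, `rollout_or_undo web 60s` waits up to 60 seconds for `deployment/web` and reverts to the previous revision on failure.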
Key Debugging Patterns
**Pod Debugging**
```bash
# Pod inspection
kubectl describe pod <pod-name>
kubectl get pod <pod-name> -o yaml
kubectl logs <pod-name> --previous

# Interactive debugging
kubectl exec -it <pod-name> -- /bin/bash
kubectl debug <pod-name> -it --image=busybox
kubectl port-forward <pod-name> 8080:80
```
**Networking Troubleshooting**
```bash
# Service debugging
kubectl get svc -o wide
kubectl get endpoints
kubectl describe svc <service>

# Network connectivity
kubectl run test-pod --image=busybox -it --rm -- sh
# Inside pod: nslookup, wget, nc commands
```
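
The throwaway-pod technique can also run a single check non-interactively. As a rough sketch, the helper below resolves a Service name from inside the cluster; the function name `check_service_dns` and the pod name `dns-check` are hypothetical.

```bash
# Hypothetical helper: resolve a Service name from inside the cluster using a
# throwaway busybox pod. The function and pod names are illustrative.
check_service_dns() {
  local service="$1" namespace="${2:-default}"

  # --rm deletes the pod on exit; --restart=Never creates a bare pod
  # instead of a managed workload.
  kubectl run dns-check --image=busybox --rm -i --restart=Never \
    --namespace "$namespace" -- nslookup "$service"
}
```

For example, `check_service_dns my-service prod` runs `nslookup my-service` in a temporary pod in the `prod` namespace.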
**Common Issues**
```bash
# CrashLoopBackOff debugging
kubectl logs <pod> --previous
kubectl describe pod <pod>
kubectl get events --field-selector involvedObject.name=<pod>

# Resource constraints
kubectl top pod <pod>
kubectl describe pod <pod> | grep -A 5 Limits

# Resource state inspection
kubectl get all
kubectl get <resource> -o yaml
```
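
When a pod is stuck in CrashLoopBackOff, it often helps to gather the evidence in one pass. The following sketch bundles the commands above into a hypothetical `crashloop_report` function; the 50-line log cap is an arbitrary assumption.

```bash
# Hypothetical helper: collect the usual CrashLoopBackOff evidence in one pass.
crashloop_report() {
  local pod="$1"

  echo "== describe =="
  kubectl describe pod "$pod"

  echo "== previous container logs =="
  # --previous reads the logs of the last terminated container instance.
  kubectl logs "$pod" --previous --tail=50

  echo "== events =="
  kubectl get events --field-selector "involvedObject.name=$pod" \
    --sort-by='.lastTimestamp'
}
```

Running `crashloop_report <pod>` prints the pod description, the tail of the crashed container's logs, and the related events in time order.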
Best Practices
**Context Safety (CRITICAL)**
- Always specify `--context` explicitly in every kubectl command
- Never rely on the current context - it may have been changed by another process
- Use `kubectl --context=<context-name> get pods` format for all operations
- This prevents accidental operations on the wrong cluster (e.g., running production commands against staging)
```bash
# CORRECT: Explicit context
kubectl --context=gke_myproject_us-central1_prod get pods
kubectl --context=staging-cluster apply -f deployment.yaml

# WRONG: Relying on current context
kubectl get pods  # Which cluster is this targeting?
```
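
One way to enforce the explicit-context rule is a thin wrapper that refuses to run without a context name. This is only a sketch: the function name `kctx` is hypothetical and not part of kubectl.

```bash
# Hypothetical wrapper: refuse to call kubectl unless a context is named.
# The function name "kctx" is an assumption, not part of kubectl.
kctx() {
  local context="${1:-}"
  if [ -z "$context" ] || [ "$#" -lt 2 ]; then
    echo "usage: kctx <context-name> <kubectl args...>" >&2
    return 2
  fi
  shift
  # Pass the context explicitly so the current context is never consulted.
  kubectl --context="$context" "$@"
}
```

With this in place, `kctx staging-cluster get pods` expands to `kubectl --context=staging-cluster get pods`, while a bare `kctx` prints the usage line and exits non-zero without touching any cluster.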
**Resource Definitions**
- Use declarative YAML manifests
- Implement proper labels and selectors
- Define resource requests and limits
- Configure health checks (liveness/readiness probes)
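
To make these points concrete, the snippet below writes an example Deployment manifest that sets matching labels and selectors, resource requests and limits, and both probes. Every name, image, port, and number in it is a placeholder chosen for illustration, not a recommendation.

```bash
# Write an example manifest; all names, images, and values are placeholders.
cat > deployment-example.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web          # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:stable
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
          livenessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 10
EOF
```

The file can then be checked with `kubectl apply --dry-run=client -f deployment-example.yaml` before it is applied.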
**Security**
- Use NetworkPolicies to restrict traffic
- Implement RBAC for access control
- Store sensitive data in Secrets
- Run containers as non-root users
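
As an illustration of restricting traffic, a common starting point is a NetworkPolicy that denies all ingress to pods in a namespace, with later policies allowing specific flows. The sketch below writes such a manifest; the file name and policy name are placeholders, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicies.

```bash
# Write a default-deny ingress policy; file and policy names are placeholders.
cat > default-deny-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}       # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress           # no ingress rules are listed, so all ingress is denied
EOF
```

Applying it with an explicit context and namespace, for example `kubectl --context=<context-name> -n <namespace> apply -f default-deny-ingress.yaml`, limits the policy to the namespace you intend.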
**Monitoring**
- Configure proper logging and metrics
- Set up alerts for critical conditions
- Use health checks and readiness probes
- Monitor resource usage and quotas
Agentic Optimizations
| Context | Command |
|---|---|
| Pod status (structured) | |
| Quick overview | |
| Events (compact) | |
| Resource details | |
| Logs (bounded) | |
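
The table lists contexts without their commands. As a rough sketch, the helpers below show one way each context might be served with standard kubectl flags. The function names, default limits, and the mapping of flags to table rows are all assumptions, not the commands the table originally contained.

```bash
# Hypothetical helpers; the flags are standard kubectl options, but their
# mapping to the table rows is an assumption.

# Pod status (structured): JSON is easier to parse than the default table.
pod_status_json() { kubectl get pods -n "$1" -o json; }

# Quick overview: one line per pod, including node and pod IP.
pods_overview() { kubectl get pods -n "$1" -o wide; }

# Events (compact): sorted by time, keeping only the last few lines.
recent_events() {
  kubectl get events -n "$1" --sort-by='.lastTimestamp' | tail -n "${2:-20}"
}

# Resource details: full object as YAML.
resource_yaml() { kubectl get "$1" "$2" -o yaml; }

# Logs (bounded): cap both the line count and the time window.
bounded_logs() { kubectl logs "$1" --tail="${2:-50}" --since=10m; }
```

Bounding output this way keeps responses small and predictable, which matters when the output is consumed by a tool rather than read by a person.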
For detailed debugging commands, troubleshooting patterns, Helm workflows, and advanced K8s operations, see REFERENCE.md.