gcp-gke-troubleshooting
GKE Troubleshooting
Purpose
Systematically diagnose and resolve common GKE issues. This skill provides structured debugging workflows, common causes, and proven solutions for the most frequent problems encountered in production deployments.
When to Use
Use this skill when you need to:
- Debug pods stuck in Pending, CrashLoopBackOff, or ImagePullBackOff status
- Troubleshoot networking issues (DNS failures, service connectivity)
- Fix Cloud SQL connection problems or IAM authentication errors
- Resolve Pub/Sub message processing issues
- Investigate resource exhaustion or scheduling failures
- Debug health probe failures
- Diagnose application crashes or startup issues
Trigger phrases: "pod not starting", "CrashLoopBackOff", "debug GKE issue", "Cloud SQL connection failed", "Pub/Sub not working", "pod pending"
Quick Start
Quick diagnostic flow for any pod issue:
```bash
# 1. Check pod status
kubectl get pods -n wtr-supplier-charges

# 2. View detailed pod information
kubectl describe pod <pod-name> -n wtr-supplier-charges

# 3. Check logs
kubectl logs <pod-name> -n wtr-supplier-charges

# 4. Check previous logs if crashed
kubectl logs <pod-name> -n wtr-supplier-charges --previous

# 5. Check events for scheduling issues
kubectl get events -n wtr-supplier-charges --sort-by='.lastTimestamp'

# 6. Check resource availability
kubectl top nodes
kubectl top pods -n wtr-supplier-charges
```
Instructions
Step 1: Identify the Pod Status
Understand what the pod status means:
```bash
kubectl get pods -n wtr-supplier-charges -o wide
```

| Status | Meaning | Action |
|---|---|---|
| Running | Pod is executing | Check logs if issues |
| Pending | Waiting to be scheduled | Check events, node resources |
| CrashLoopBackOff | App crashes repeatedly | Check logs, configuration |
| ImagePullBackOff | Can't pull image | Verify image, permissions |
| Completed | Pod ran successfully and exited | Normal for batch jobs |
| Error | Pod exited with error | Check logs |
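The status table can also be kept at hand as a small shell helper. This is a minimal sketch, not part of the original skill: the function name `next_step_for_status` is hypothetical, and the suggested actions simply mirror the table.

```bash
# Hypothetical helper: map a pod STATUS value (as printed by `kubectl get pods`)
# to the suggested next action from the status table.
next_step_for_status() {
  case "$1" in
    Running)          echo "Check logs if issues" ;;
    Pending)          echo "Check events, node resources" ;;
    CrashLoopBackOff) echo "Check logs, configuration" ;;
    ImagePullBackOff) echo "Verify image, permissions" ;;
    Completed)        echo "Normal for batch jobs" ;;
    Error)            echo "Check logs" ;;
    *)                echo "Unknown status: run kubectl describe pod" ;;
  esac
}

# Usage:
# next_step_for_status CrashLoopBackOff
```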
Step 2: Investigate Based on Status
Pod Status: ImagePullBackOff
Diagnose:
```bash
# Get detailed error
kubectl describe pod <pod-name> -n wtr-supplier-charges

# Look for "Failed to pull image" in Events section
# Example: "Failed to pull image ... access denied"

# Check if image exists in registry
gcloud artifacts docker images list \
  europe-west2-docker.pkg.dev/ecp-artifact-registry/wtr-supplier-charges-container-images
```
**Solutions:**
1. **Image doesn't exist:**
```bash
# Verify image tag is correct
kubectl get deployment supplier-charges-hub -n wtr-supplier-charges \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```
2. **Missing Artifact Registry permissions:**
```bash
# Grant Artifact Registry Reader role
gcloud artifacts repositories add-iam-policy-binding \
  wtr-supplier-charges-container-images \
  --location=europe-west2 \
  --member="serviceAccount:app-runtime@project.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.reader"
```
3. **Private image registry authentication:**
```bash
# Create image pull secret
kubectl create secret docker-registry regcred \
  --docker-server=europe-west2-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat key.json)" \
  -n wtr-supplier-charges

# Add to deployment (YAML, under the pod template's spec):
# spec:
#   imagePullSecrets:
#   - name: regcred
```
Pod Status: CrashLoopBackOff
Diagnose:
```bash
# Check current logs
kubectl logs <pod-name> -n wtr-supplier-charges

# Check logs from previous container (if crashed)
kubectl logs <pod-name> -n wtr-supplier-charges --previous

# Check liveness probe configuration
kubectl describe pod <pod-name> -n wtr-supplier-charges | grep -A 10 "Liveness"
```
**Common Causes:**
1. **Application exits immediately:**
```bash
# Check startup logs for Java/Spring Boot errors
kubectl logs <pod-name> -n wtr-supplier-charges | head -50

# Look for: ClassNotFoundException, ConfigurationException, connection errors
```
2. **Liveness probe fails too early:**
```bash
# Increase initialDelaySeconds from 20 to 60
kubectl patch deployment supplier-charges-hub -n wtr-supplier-charges \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"supplier-charges-hub-container","livenessProbe":{"initialDelaySeconds":60}}]}}}}'
```
3. **Out of memory:**
```bash
# Check memory usage
kubectl top pods <pod-name> -n wtr-supplier-charges

# Increase memory limits
kubectl patch deployment supplier-charges-hub -n wtr-supplier-charges \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"supplier-charges-hub-container","resources":{"limits":{"memory":"4Gi"}}}]}}}}'
```
4. **Missing environment variables:**
```bash
# Check what env vars are set
kubectl exec <pod-name> -n wtr-supplier-charges -- env | sort

# Verify ConfigMap/Secret values
kubectl get configmap supplier-charges-hub-config -n wtr-supplier-charges -o yaml
kubectl get secret db-credentials -n wtr-supplier-charges -o yaml
```
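When the logs are inconclusive, the exit code of the last terminated container (shown under "Last State" in `kubectl describe pod`) helps separate an application failure from an out-of-memory kill. The helper below is a rough sketch: the function name `explain_exit_code` is hypothetical, and the mapping relies only on the usual convention that exit codes above 128 mean "killed by signal (code - 128)".

```bash
# Hypothetical helper: interpret a container exit code.
# Convention assumed: 128 + N means the process was killed by signal N.
explain_exit_code() {
  case "$1" in
    0)   echo "exited cleanly" ;;
    1)   echo "general application error" ;;
    137) echo "killed by SIGKILL (often OOMKilled)" ;;
    143) echo "terminated by SIGTERM" ;;
    *)
      # Non-numeric input falls through to the generic message.
      if [ "$1" -gt 128 ] 2>/dev/null; then
        echo "killed by signal $(( $1 - 128 ))"
      else
        echo "application-specific exit code"
      fi
      ;;
  esac
}

# Usage:
# explain_exit_code 137
```

An exit code of 137 together with a termination reason of `OOMKilled` points at the memory limit rather than the liveness probe.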
Pod Status: Pending (Unschedulable)
Diagnose:
```bash
# Check events for scheduling messages
kubectl describe pod <pod-name> -n wtr-supplier-charges

# Look for: "Insufficient memory", "Insufficient cpu", "PersistentVolumeClaim"

# Check node capacity
kubectl top nodes
kubectl describe nodes
```
**Solutions:**
1. **Insufficient cluster resources:**
```bash
# Scale deployment down
kubectl scale deployment supplier-charges-hub --replicas=1 -n wtr-supplier-charges

# Or trigger autoscaling (if available)
# GKE Autopilot automatically provisions capacity
```
2. **Node affinity/taints preventing scheduling:**
```bash
# Check node taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

# View pod's node affinity/tolerations (-E enables the "|" alternation)
kubectl get pod <pod-name> -n wtr-supplier-charges -o yaml | grep -E -A 10 -B 2 "affinity|toleration"

# Add toleration to deployment if needed (YAML, under the pod template's spec):
# spec:
#   tolerations:
#   - key: "dedicated"
#     operator: "Equal"
#     value: "compute"
#     effect: "NoSchedule"
```
3. **PersistentVolumeClaim not bound:**
```bash
# Check PVC status
kubectl get pvc -n wtr-supplier-charges

# If Pending, check storage class
kubectl get storageclass
```
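The three solution paths for a Pending pod are keyed off the scheduling message in the pod's events. As a rough sketch, the message can be sorted into one of those paths with a `case` statement. The function name `classify_scheduling_message` and the short category labels are hypothetical, and the patterns are only the keywords this guide says to look for plus "taint".

```bash
# Hypothetical helper: sort a FailedScheduling event message into the
# matching solution path for a Pending pod (resources, taints, or pvc).
classify_scheduling_message() {
  case "$1" in
    *"Insufficient memory"*|*"Insufficient cpu"*) echo "resources" ;;
    *taint*)                                      echo "taints" ;;
    *PersistentVolumeClaim*)                      echo "pvc" ;;
    *)                                            echo "unknown" ;;
  esac
}

# Usage: paste the message from the Events section of `kubectl describe pod`
# classify_scheduling_message "<scheduling message>"
```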
Step 3: Network and Connectivity Issues
DNS Resolution Failures
Diagnose:
```bash
# Test DNS from pod
kubectl exec <pod-name> -n wtr-supplier-charges -- nslookup postgres

# Test connectivity to service
# (PostgreSQL does not speak HTTP; a successful TCP connect is the signal here)
kubectl exec <pod-name> -n wtr-supplier-charges -- curl -v http://postgres:5432
```
**Solutions:**
1. **CoreDNS pods not running:**
```bash
# Check CoreDNS
kubectl get pods -n kube-system -l k8s-app=kube-dns

# Restart CoreDNS if needed
# (adjust the deployment name if your cluster runs kube-dns instead)
kubectl rollout restart deployment coredns -n kube-system
```
2. **Service doesn't exist or wrong namespace:**
```bash
# Verify service exists
kubectl get svc postgres -n wtr-supplier-charges

# Use fully qualified DNS name if in different namespace
# service-name.namespace.svc.cluster.local
```
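The fully qualified name follows a fixed pattern, so it can be built rather than typed. A minimal sketch, assuming the default cluster domain `cluster.local`; the function name `svc_fqdn` is hypothetical.

```bash
# Hypothetical helper: build the cluster-internal DNS name for a Service.
# Assumes the default cluster domain "cluster.local" unless a third
# argument overrides it.
svc_fqdn() {
  local service="$1" namespace="$2" domain="${3:-cluster.local}"
  echo "${service}.${namespace}.svc.${domain}"
}

# Usage (service and namespace names taken from this guide):
# kubectl exec <pod-name> -n wtr-supplier-charges -- \
#   nslookup "$(svc_fqdn postgres wtr-supplier-charges)"
```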
Service Not Accessible
Diagnose:
```bash
# Check service endpoints
kubectl get endpoints supplier-charges-hub -n wtr-supplier-charges

# If empty, no pods match the selector
# (-A 3 prints the selector's key/value lines, not just the "selector:" key)
kubectl get svc supplier-charges-hub -n wtr-supplier-charges -o yaml | grep -A 3 selector
kubectl get pods -n wtr-supplier-charges --show-labels
```
**Solutions:**
1. **Pod labels don't match service selector:**
```bash
# Add/update labels on deployment
kubectl patch deployment supplier-charges-hub -n wtr-supplier-charges \
  -p '{"spec":{"template":{"metadata":{"labels":{"app":"supplier-charges-hub"}}}}}'
```
2. **Pods not in Ready state:**
```bash
# Check readiness probe
kubectl describe pod <pod-name> -n wtr-supplier-charges | grep -A 10 "Readiness"

# Check health endpoint
kubectl exec <pod-name> -n wtr-supplier-charges -- \
  curl localhost:8080/actuator/health/readiness
```
Step 4: Database Connection Issues
Diagnose:
```bash
# Test connectivity to Cloud SQL Proxy
kubectl exec <pod-name> -n wtr-supplier-charges -- nc -zv localhost 5432

# Check Cloud SQL Proxy logs
kubectl logs <pod-name> -c cloud-sql-proxy -n wtr-supplier-charges

# Check application startup logs for DB connection errors (-E enables the "|" alternation)
kubectl logs <pod-name> -c supplier-charges-hub-container -n wtr-supplier-charges | grep -iE "database|connection"
```
**Solutions:**
1. **IAM Authentication fails:**
```bash
# Verify Workload Identity binding
kubectl get sa app-runtime -n wtr-supplier-charges -o yaml | grep iam.gke.io

# Grant cloudsql.client role
gcloud projects add-iam-policy-binding project-id \
  --member="serviceAccount:app-runtime@project.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"

# Check service account email format (must be {name}@{project}.iam.gserviceaccount.com)
```
2. **Wrong connection string:**
```bash
# Verify DB_CONNECTION_NAME format: project:region:instance
kubectl get configmap db-config -n wtr-supplier-charges -o yaml

# Should be something like: ecp-wtr-supplier-charges-labs:europe-west2:supplier-charges-hub
```
3. **Cloud SQL Proxy not running:**
```bash
# Check sidecar logs
kubectl logs <pod-name> -c cloud-sql-proxy -n wtr-supplier-charges

# Check sidecar resources
kubectl describe pod <pod-name> -n wtr-supplier-charges | grep -A 15 "cloud-sql-proxy"
```
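A malformed connection name is easy to catch before redeploying. The check below is a minimal sketch of the `project:region:instance` rule; the function name `is_valid_connection_name` is hypothetical, and it assumes the project ID itself contains no colon.

```bash
# Hypothetical helper: succeed only if the argument has exactly three
# non-empty, colon-separated fields (project:region:instance).
# Assumption: the project ID does not itself contain a colon.
is_valid_connection_name() {
  [[ "$1" =~ ^[^:]+:[^:]+:[^:]+$ ]]
}

# Usage:
# if is_valid_connection_name "$DB_CONNECTION_NAME"; then
#   echo "format ok"
# else
#   echo "expected project:region:instance"
# fi
```

This only checks the shape of the string; it does not confirm that the instance exists or that the region is correct.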
Step 5: Pub/Sub Issues
Diagnose:
```bash
# Check subscription configuration (ack deadline, dead-letter policy).
# The backlog size itself is reported by the Cloud Monitoring metric
# pubsub.googleapis.com/subscription/num_undelivered_messages.
gcloud pubsub subscriptions describe supplier-charges-incoming-sub \
  --project=ecp-wtr-supplier-charges-labs

# Check application Pub/Sub logs (-E enables the "|" alternation)
kubectl logs <pod-name> -c supplier-charges-hub-container \
  -n wtr-supplier-charges | grep -iE "pubsub|subscription"

# Test Pub/Sub connectivity from pod (requires gcloud in the container image)
kubectl exec <pod-name> -n wtr-supplier-charges -- \
  gcloud pubsub topics list --project=ecp-wtr-supplier-charges-labs
```
**Solutions:**
1. **Missing Pub/Sub permissions:**
```bash
# Grant Pub/Sub roles
gcloud projects add-iam-policy-binding project-id \
  --member="serviceAccount:app-runtime@project.iam.gserviceaccount.com" \
  --role="roles/pubsub.subscriber"

gcloud projects add-iam-policy-binding project-id \
  --member="serviceAccount:app-runtime@project.iam.gserviceaccount.com" \
  --role="roles/pubsub.publisher"
```
2. **High subscription backlog (messages not being consumed):**
```bash
# Check if pod is running
kubectl get pods -n wtr-supplier-charges

# Check application logs for processing errors (-E enables the "|" alternation)
kubectl logs -f <pod-name> -c supplier-charges-hub-container \
  -n wtr-supplier-charges | grep -iE "error|exception"

# Increase message processing timeout
# In application.yaml:
# spring.cloud.gcp.pubsub.subscriber.max-ack-extension-period: 600
```
3. **Message processing failures:**
```bash
# Check for poison messages (causing repeated failures)
# Review DLQ (Dead Letter Queue) if configured

# Implement retry logic with exponential backoff
# See Spring Cloud GCP documentation for retry configuration
```
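To make "retry with exponential backoff" concrete, here is a small shell sketch of the schedule itself: the delay doubles with each failed attempt up to a cap. It illustrates the idea for ad-hoc commands run while debugging and is not the application's Spring Cloud GCP retry configuration; the function names, the 1-second base, and the 60-second cap are all assumptions.

```bash
# Hypothetical helper: delay in seconds before retry number $1 (1-based),
# computed as base * 2^(attempt-1) and capped at $3 (defaults: base 1, cap 60).
backoff_delay() {
  local attempt="$1" base="${2:-1}" cap="${3:-60}"
  # Guard against shift overflow for very large attempt numbers.
  if [ "$attempt" -gt 30 ]; then
    echo "$cap"
    return
  fi
  local delay=$(( base * (1 << (attempt - 1)) ))
  if [ "$delay" -gt "$cap" ]; then
    delay="$cap"
  fi
  echo "$delay"
}

# Hypothetical helper: run a command up to $1 times, sleeping with
# exponential backoff between failed attempts. Returns 1 if every attempt fails.
retry_with_backoff() {
  local max_attempts="$1"
  shift
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    sleep "$(backoff_delay "$attempt")"
    attempt=$(( attempt + 1 ))
  done
}

# Usage:
# retry_with_backoff 5 kubectl get pods -n wtr-supplier-charges
```

Capping the delay keeps a long outage from pushing the wait between attempts to impractical lengths.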
Examples
See examples/examples.md for comprehensive examples including:
- Complete troubleshooting workflow
- Database connectivity debugging
- Pub/Sub debugging
Requirements
- kubectl access to the cluster
- gcloud CLI configured
- Permissions to view pod logs and describe resources
- For database debugging: access to view Cloud SQL configuration
- For Pub/Sub debugging: access to view subscription details
See Also
- gcp-gke-deployment-strategies - Understand deployment health checks
- gcp-gke-monitoring-observability - Monitor applications
- gcp-gke-workload-identity - Debug IAM/Workload Identity issues