eg-production-guide
Envoy Gateway Production Deployment
Deployment Modes
Single Tenant (Default)
- One GatewayClass per Envoy Gateway controller
- Simplest model; suitable for most single-team deployments
- All Gateways share the same controller and Envoy Proxy fleet
Multi-Tenant
- Deploy separate Envoy Gateway controllers per tenant namespace
- Each controller must have a unique controller name in its GatewayClass
- Provides strong tenant isolation at the control plane level
- Install via separate Helm releases, each with a distinct `--set config.envoyGateway.gateway.controllerName=...`
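As a rough sketch of the per-tenant pieces, assuming a hypothetical tenant called `team-a`: the first document is a Helm values file equivalent to the `--set` flag above, and the second is the GatewayClass that the tenant's Gateways would reference. The tenant name, file name, GatewayClass name, and controller name suffix are all illustrative.

```yaml
# values-team-a.yaml (hypothetical file name), passed to Helm with -f.
config:
  envoyGateway:
    gateway:
      # Must be unique per controller; "team-a" is an assumed tenant name.
      controllerName: gateway.envoyproxy.io/team-a-gatewayclass-controller
---
# Separate manifest: the GatewayClass claimed by that tenant's controller.
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg-team-a
spec:
  # Must match the controllerName configured in the Helm release above.
  controllerName: gateway.envoyproxy.io/team-a-gatewayclass-controller
```

Each tenant would get its own copy of both documents with a different controller name, so that no controller reconciles another tenant's Gateways.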
Gateway Namespace Mode
- Envoy Proxy pods deploy in the Gateway's namespace instead of the controller namespace
- Provides stronger workload isolation: proxy runs alongside the application
- Enables JWT authentication between proxy and controller for hardened communication
- Enable with `envoyGateway.provider.kubernetes.deploy.type: GatewayNamespace`
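The setting above might look like this in a Helm values file. This is a sketch under assumptions: the `config.` prefix follows the Helm path used elsewhere in this guide, and `GatewayNamespace` is the deploy type value as I understand the EnvoyGateway configuration API, so verify it against the version you install.

```yaml
# Helm values fragment (assumed layout; check against your chart version).
config:
  envoyGateway:
    provider:
      type: Kubernetes
      kubernetes:
        deploy:
          # Run Envoy Proxy pods in each Gateway's own namespace.
          type: GatewayNamespace
```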
Merged Gateways
- Merge listeners from multiple Gateway resources into a single Envoy Proxy fleet
- All merged Gateways share a single IP address / load balancer
- Useful for consolidating ingress when teams own different Gateways but share infrastructure
- Enable with `mergeGateways: true` on the EnvoyProxy resource referenced by the GatewayClass `parametersRef`
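A minimal sketch of the wiring described above: an EnvoyProxy resource with `mergeGateways` enabled, and a GatewayClass that points to it through `parametersRef`. The resource names (`merged-proxy-config`, `merged-eg`) are hypothetical, and the controller name shown is assumed to be the default one.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: merged-proxy-config        # hypothetical name
  namespace: envoy-gateway-system
spec:
  # All Gateways of the referencing GatewayClass share one proxy fleet.
  mergeGateways: true
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: merged-eg                  # hypothetical name
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: merged-proxy-config
    namespace: envoy-gateway-system
```

Gateways that set `gatewayClassName: merged-eg` would then have their listeners merged, so listener port, protocol, and hostname combinations need to be unique across those Gateways.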
Performance Tuning
- Connection timeouts: set explicitly in ClientTrafficPolicy and BackendTrafficPolicy. Never rely on Envoy defaults.
  - `timeout.http.requestReceivedTimeout` (ClientTrafficPolicy): total time for the client to send a complete request
  - `timeout.http.idleTimeout` (ClientTrafficPolicy): close connections idle longer than this
- HTTP/2 max concurrent streams: limit to 100 to prevent a single connection from monopolizing resources
- Buffer limits: set to 32 KiB for both listener and cluster buffers to cap memory under load
  - Configure via EnvoyProxy `spec.bootstrap` or EnvoyPatchPolicy
- Resource requests/limits: always set CPU and memory on Envoy Proxy pods via EnvoyProxy `spec.provider.kubernetes.envoyDeployment.container.resources`
- Horizontal scaling: use HPA on the Envoy Proxy Deployment; scale on CPU utilization (target 60-70%)
- Keep-alive: enable TCP keep-alive on backend connections to avoid connection resets through cloud load balancers
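The tuning points above could be expressed roughly as follows. This is a sketch, not a recommended baseline. The resource names, the target Gateway `eg`, the CPU and memory figures, and the replica counts are all assumptions. The ClientTrafficPolicy field names (`requestReceivedTimeout`, `http2.maxConcurrentStreams`, `connection.bufferLimit`) and the `envoyHpa` block reflect the v1alpha1 API as I understand it, so confirm them with `kubectl explain` against your installed CRDs.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: proxy-config               # hypothetical name
  namespace: envoy-gateway-system
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        container:
          resources:               # illustrative sizes only
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "2"
              memory: 1Gi
      envoyHpa:
        minReplicas: 2
        maxReplicas: 10
        metrics:
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 65   # within the 60-70% target
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: client-tuning              # hypothetical name
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: eg                     # assumed Gateway name
  timeout:
    http:
      requestReceivedTimeout: 10s  # illustrative value
      idleTimeout: 60s             # illustrative value
  http2:
    maxConcurrentStreams: 100
  connection:
    bufferLimit: 32Ki              # listener-side buffer cap
```

The EnvoyProxy resource only takes effect once a GatewayClass or Gateway references it, and the cluster-side buffer limit would need a corresponding setting on the backend side.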
Observability
Access Logging
- Configure via EnvoyProxy `spec.telemetry.accessLog`
- Sinks: File (stdout/path) or OpenTelemetry (gRPC collector)
- Use structured JSON format for machine parsing
- Include at minimum: method, path, response code, duration, upstream host
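A possible shape for such an access log configuration, assuming a hypothetical EnvoyProxy named `proxy-config`. The JSON keys are arbitrary labels chosen for this example; the values are standard Envoy access log command operators.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: proxy-config               # hypothetical name
  namespace: envoy-gateway-system
spec:
  telemetry:
    accessLog:
      settings:
        - format:
            type: JSON
            json:
              # Minimum fields listed above; key names are illustrative.
              method: "%REQ(:METHOD)%"
              path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
              response_code: "%RESPONSE_CODE%"
              duration_ms: "%DURATION%"
              upstream_host: "%UPSTREAM_HOST%"
          sinks:
            - type: File
              file:
                path: /dev/stdout  # picked up by the container log driver
```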
Metrics
- Expose Prometheus metrics via EnvoyProxy `spec.telemetry.metrics`
- Scrape from Envoy Proxy pods on the admin port (default 19001)
- Key metrics: `envoy_http_downstream_rq_total`, `envoy_http_downstream_rq_xx`, `envoy_cluster_upstream_rq_time`
- Enable Envoy Gateway controller metrics for control plane health
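For illustration, the proxy-side metrics setting might be written out explicitly like this. The resource name is hypothetical, and the scrape path mentioned in the comment is my understanding of the default rather than something stated in this guide.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: proxy-config               # hypothetical name
  namespace: envoy-gateway-system
spec:
  telemetry:
    metrics:
      prometheus:
        # Keep the Prometheus endpoint enabled on the proxy pods.
        # Assumed scrape target: port 19001, path /stats/prometheus.
        disable: false
```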
Tracing
- Configure distributed tracing via EnvoyProxy `spec.telemetry.tracing`
- Export to OpenTelemetry collector (gRPC or HTTP)
- Set appropriate sampling rate: 1-10% in production, 100% in staging
- Propagate trace context headers (`traceparent`, `tracestate`)
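A sketch of a production-style tracing configuration, assuming an OpenTelemetry collector Service named `otel-collector` in a `monitoring` namespace that listens for gRPC on port 4317. Those names and the 5% rate are illustrative, not taken from this guide.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: proxy-config               # hypothetical name
  namespace: envoy-gateway-system
spec:
  telemetry:
    tracing:
      # Percentage of requests sampled; within the 1-10% production range.
      samplingRate: 5
      provider:
        type: OpenTelemetry
        backendRefs:
          - name: otel-collector   # assumed collector Service
            namespace: monitoring  # assumed namespace
            port: 4317             # OTLP gRPC
```

A staging copy of the same resource would typically differ only in `samplingRate: 100`.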
Operations
Installation
```bash
helm install eg oci://docker.io/envoyproxy/gateway-helm \
  --version v1.7.0 \
  -n envoy-gateway-system \
  --create-namespace
```
- Always pin Helm chart versions; never use `latest` or omit `--version`
- Use `helm upgrade --install` for idempotent deployments
GitOps
- Manage all Gateway API resources via GitOps (ArgoCD, Flux)
- Store Gateway, Route, and Policy manifests in version control
- Implement mandatory PR reviews for all gateway configuration changes
- Use SCM branch protection rules on the main branch
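As one way to picture the GitOps flow, here is a sketch of an Argo CD Application that syncs gateway manifests from a Git repository. The repository URL, path, application name, and namespaces are hypothetical placeholders. Flux would use its own resource types instead.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gateway-config             # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/platform/gateway-config.git  # placeholder
    targetRevision: main           # the protected branch
    path: gateways/production      # placeholder directory of manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: gateway-prod        # assumed target namespace
  syncPolicy:
    automated:
      prune: true                  # remove resources deleted from Git
      selfHeal: true               # revert out-of-band cluster changes
```

With `selfHeal` enabled, manual `kubectl` edits to Gateways or Routes are reverted, which keeps reviewed pull requests as the only path for configuration changes.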
Upgrade Strategy
- Upgrade Envoy Gateway controller first, then verify CRD compatibility
- Test upgrades in a staging environment that mirrors production topology
- Review release notes for breaking changes in CRD schemas or default behavior
- Back up CRD instances before upgrading (`kubectl get -o yaml`)