network-policy


# Network Policy Management

## Architecture Quick Reference

All cluster traffic is implicitly denied via Cilium baseline CCNPs. Two layers control access:

1. **Baselines** (cluster-wide CCNPs): DNS egress, health probes, Prometheus scrape, opt-in kube-API
2. **Profiles** (per-namespace via label): ingress/egress rules matched by `network-policy.homelab/profile=<value>`

Platform namespaces (`kube-system`, `monitoring`, `database`, etc.) use hand-crafted CNPs — never apply profiles to them.
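As an illustration of the baseline layer, a cluster-wide DNS-egress policy might be shaped like the sketch below. This is an assumption for orientation only — the policy name and exact selectors are hypothetical, and the baselines actually deployed in the repo are authoritative.

```yaml
# Hypothetical sketch of a DNS-egress baseline CCNP (illustrative, not the deployed policy)
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: baseline-dns-egress   # assumed name
spec:
  endpointSelector: {}        # empty selector: applies to all workloads
  egress:
    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
            - port: "53"
              protocol: TCP
```

Because CCNPs are additive allow rules on top of default deny, a baseline like this grants DNS everywhere without any namespace opting in.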

## Workflow: Deploy App with Network Policy

### Step 1: Choose a Profile

| Profile | Ingress | Egress | Use Case |
|---|---|---|---|
| `isolated` | None | DNS only | Batch jobs, workers |
| `internal` | Internal gateway | DNS only | Internal dashboards |
| `internal-egress` | Internal gateway | DNS + HTTPS | Internal apps calling external APIs |
| `standard` | Both gateways | DNS + HTTPS | Public-facing web apps |

Decision tree:

- Does the app need to be reached from the internet? -> `standard`
- Internal-only but needs to call external APIs? -> `internal-egress`
- Internal-only, no external calls? -> `internal`
- No ingress needed at all? -> `isolated`
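Profiles are keyed off the namespace label. One plausible implementation — a sketch, not the repo's actual manifests — uses Cilium's namespace-label selector prefix in a CCNP; the policy name and rule shape below are assumptions:

```yaml
# Illustrative only: how the internal-egress profile's HTTPS allowance
# could be expressed as a CCNP matched by the namespace label.
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: profile-internal-egress   # assumed name
spec:
  endpointSelector:
    matchLabels:
      # Cilium exposes namespace labels to clusterwide policies via this prefix
      k8s:io.cilium.k8s.namespace.labels.network-policy.homelab/profile: internal-egress
  egress:
    - toEntities: [world]          # traffic leaving the cluster
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

Under this scheme, changing a namespace's label is enough to swap its entire rule set, which is why the label is the only per-app knob.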

### Step 2: Apply Profile Label to Namespace

In the namespace YAML (committed to git, not `kubectl apply`):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    network-policy.homelab/profile: standard
```

### Step 3: Add Shared Resource Access Labels

If the app needs database, cache, or S3 access, add access labels to the namespace:

```yaml
labels:
  network-policy.homelab/profile: standard
  access.network-policy.homelab/postgres: "true"     # PostgreSQL (port 5432)
  access.network-policy.homelab/dragonfly: "true"    # Dragonfly/Redis (port 6379)
  access.network-policy.homelab/garage-s3: "true"    # Garage S3 (port 3900)
  access.network-policy.homelab/kube-api: "true"     # Kubernetes API (port 6443)
```
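Each access label plausibly maps to one cluster-wide allow rule. The sketch below shows how the postgres label could translate into a CCNP egress rule; the policy name and selector are hypothetical, not the repo's real manifest:

```yaml
# Illustrative sketch of an access-label-driven CCNP (assumed name/shape)
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: access-postgres
spec:
  endpointSelector:
    matchLabels:
      # select all pods in namespaces carrying the access label
      k8s:io.cilium.k8s.namespace.labels.access.network-policy.homelab/postgres: "true"
  egress:
    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: database
      toPorts:
        - ports:
            - port: "5432"
              protocol: TCP
```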

### Step 4: Verify Connectivity

After deployment, check for dropped traffic:

```bash
hubble observe --verdict DROPPED --namespace my-app --since 5m
```

If drops appear, see the Debugging section below.

## Workflow: Debug Blocked Traffic

### Step 1: Identify Drops

All drops in a namespace:

```bash
hubble observe --verdict DROPPED --namespace my-app --since 5m
```

With source/destination details:

```bash
hubble observe --verdict DROPPED --namespace my-app --since 5m -o json |
  jq '{src: .source.namespace + "/" + .source.pod_name, dst: .destination.namespace + "/" + .destination.pod_name, port: (.l4.TCP.destination_port // .l4.UDP.destination_port)}'
```

### Step 2: Classify the Drop

| Drop Pattern | Likely Cause | Fix |
|---|---|---|
| Egress to `kube-system:53` dropped | Missing DNS baseline | Should not happen — check if baseline CCNP exists |
| Egress to `database:5432` dropped | Missing postgres access label | Add `access.network-policy.homelab/postgres=true` |
| Egress to `database:6379` dropped | Missing dragonfly access label | Add `access.network-policy.homelab/dragonfly=true` |
| Egress to internet `:443` dropped | Profile doesn't allow HTTPS egress | Switch to `internal-egress` or `standard` |
| Ingress from `istio-gateway` dropped | Profile doesn't allow gateway ingress | Switch to `internal`, `internal-egress`, or `standard` |
| Ingress from `monitoring:prometheus` dropped | Missing baseline | Should not happen — check baseline CCNP |

### Step 3: Verify Specific Flows

DNS resolution:

```bash
hubble observe --namespace my-app --protocol UDP --port 53 --since 5m
```

Database connectivity:

```bash
hubble observe --namespace my-app --to-namespace database --port 5432 --since 5m
```

Internet egress:

```bash
hubble observe --namespace my-app --to-identity world --port 443 --since 5m
```

Gateway ingress:

```bash
hubble observe --from-namespace istio-gateway --to-namespace my-app --since 5m
```

Prometheus scraping:

```bash
hubble observe --from-namespace monitoring --to-namespace my-app --since 5m
```

### Step 4: Check Policy Status

List all policies affecting a namespace:

```bash
kubectl get cnp -n my-app
kubectl get ccnp | grep -E 'baseline|profile'
```

Check which profile is active:

```bash
kubectl get namespace my-app --show-labels | grep network-policy
```

---

## Workflow: Emergency Escape Hatch

Use only when network policies block legitimate traffic and you need immediate relief.

### Step 1: Disable Enforcement

```bash
kubectl label namespace <ns> network-policy.homelab/enforcement=disabled
```

This triggers alerts:

- `NetworkPolicyEnforcementDisabled` (warning) after 5 minutes
- `NetworkPolicyEnforcementDisabledLong` (critical) after 24 hours
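The alert names above come from the source; how they are wired up is not shown. One way such alerts could be expressed — purely a sketch, assuming kube-state-metrics is configured (e.g. via `--metric-labels-allowlist`) to export the enforcement label on `kube_namespace_labels` — is:

```yaml
# Sketch only: assumed rule names, metric, and label encoding.
# The rules actually deployed in the repo are authoritative.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: network-policy-enforcement   # assumed name
spec:
  groups:
    - name: network-policy
      rules:
        - alert: NetworkPolicyEnforcementDisabled
          expr: kube_namespace_labels{label_network_policy_homelab_enforcement="disabled"} == 1
          for: 5m
          labels:
            severity: warning
        - alert: NetworkPolicyEnforcementDisabledLong
          expr: kube_namespace_labels{label_network_policy_homelab_enforcement="disabled"} == 1
          for: 24h
          labels:
            severity: critical
```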

### Step 2: Verify Traffic Flows

```bash
hubble observe --namespace <ns> --since 1m
```

### Step 3: Investigate Root Cause

Use the debugging workflow above to identify what policy is missing or misconfigured.

### Step 4: Fix the Policy (via GitOps)

Apply the fix through a PR — never `kubectl apply` directly.

### Step 5: Re-enable Enforcement

```bash
kubectl label namespace <ns> network-policy.homelab/enforcement-
```

See `docs/runbooks/network-policy-escape-hatch.md` for the full procedure.

## Workflow: Add Platform Namespace CNP

Platform namespaces need hand-crafted CNPs (not profiles). Create them in `kubernetes/platform/config/network-policy/platform/`.

### Required Rules

Every platform CNP must include:

1. DNS egress to `kube-system/kube-dns` (port 53 UDP/TCP)
2. Prometheus scrape ingress from the `monitoring` namespace
3. Health probe ingress from the `health` entity and `169.254.0.0/16`
4. HBONE rules if the namespace participates in the Istio mesh (port 15008 to/from `istio-system/ztunnel`)
5. Service-specific rules for the namespace's actual traffic patterns

### Template

```yaml
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: <namespace>-default
  namespace: <namespace>
spec:
  description: "<Namespace purpose>: describe allowed traffic"
  endpointSelector: {}
  ingress:
    # Health probes
    - fromEntities: [health]
    - fromCIDR: ["169.254.0.0/16"]
    # Prometheus scraping
    - fromEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: monitoring
            app.kubernetes.io/name: prometheus
      toPorts:
        - ports:
            - port: "<metrics-port>"
              protocol: TCP
    # HBONE (if mesh participant)
    - fromEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: istio-system
            app: ztunnel
      toPorts:
        - ports:
            - port: "15008"
              protocol: TCP
  egress:
    # DNS
    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
            - port: "53"
              protocol: TCP
    # HBONE (if mesh participant)
    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: istio-system
            app: ztunnel
      toPorts:
        - ports:
            - port: "15008"
              protocol: TCP
```

After creating, add the file to `kubernetes/platform/config/network-policy/platform/kustomization.yaml`.

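The kustomization entry is a one-line addition. A sketch of what that file might contain — the existing resource names here are hypothetical placeholders:

```yaml
# kubernetes/platform/config/network-policy/platform/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - kube-system.yaml    # hypothetical existing entries
  - monitoring.yaml
  - <namespace>.yaml    # add the new CNP file here
```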

## Anti-Patterns

- NEVER create explicit `default-deny` policies — baselines provide implicit deny
- NEVER use profiles for platform namespaces — they need custom CNPs
- NEVER hardcode IP addresses — use endpoint selectors and entities
- NEVER allow `any` port — always specify explicit port lists
- NEVER disable enforcement without following the escape hatch runbook
- NEVER apply network policy changes via `kubectl` on integration/live — always through GitOps
- Dev cluster exception: direct `kubectl apply` of network policies is permitted on dev for testing

## Cross-References

- `network-policy/CLAUDE.md` — full architecture and directory structure
- `docs/runbooks/network-policy-escape-hatch.md` — emergency bypass procedure
- `docs/runbooks/network-policy-verification.md` — Hubble verification commands