cloudflare-traffic-investigator

Investigating Traffic on Cloudflare-Protected Domains


Arguments


| Argument | Description |
| --- | --- |
| $ARGUMENTS[0] | Cloudflare-protected domain to investigate (e.g., example.com) |
| $ARGUMENTS[1] | Cloudflare zone ID for the domain (e.g., abc123def456) |
| $ARGUMENTS[2] | (optional) Time range to investigate (e.g., "2025-06-01 04:00-05:00 NZST", "today 9:00-10:00 AEDT"). Given in the current agent's local timezone (detected via the system clock), not UTC. |

If the domain or zone ID is not provided, ask the user via AskUserQuestion. The time range is collected in Step 1 if not passed here.
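Because the time range arrives in a local timezone while API filters are usually expressed in UTC, a conversion step is needed somewhere. The following is a minimal sketch of that conversion, not part of the skill itself. The function name `local_range_to_utc` and the abbreviation table are hypothetical; the table uses fixed offsets for the abbreviations that appear in the examples above and is not exhaustive.

```python
from datetime import datetime, timedelta, timezone

# Assumption: illustrative fixed-offset table for the abbreviations used in
# the examples above. Extend it for other timezones as needed.
TZ_OFFSETS_HOURS = {
    "NZST": 12,
    "NZDT": 13,
    "AEST": 10,
    "AEDT": 11,
}


def local_range_to_utc(date: str, start: str, end: str, tz_abbr: str) -> tuple[str, str]:
    """Convert a local date plus HH:MM start/end times to UTC ISO 8601 strings."""
    tz = timezone(timedelta(hours=TZ_OFFSETS_HOURS[tz_abbr]))

    def to_utc(hhmm: str) -> str:
        local = datetime.strptime(f"{date} {hhmm}", "%Y-%m-%d %H:%M").replace(tzinfo=tz)
        return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return to_utc(start), to_utc(end)
```

For the first example above, `local_range_to_utc("2025-06-01", "04:00", "05:00", "NZST")` yields a window on 2025-05-31 in UTC, which shows why reporting the timezone explicitly matters.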

Investigate unusual traffic patterns on Cloudflare-protected domains that cause downstream service failures (e.g., service overload, database saturation, API rate limiting). This skill walks through a structured investigation from confirming the spike through to a full incident report.

Investigation Workflow


Follow these steps in order. Each step file contains detailed instructions and example Cloudflare GraphQL queries.
  1. Get parameters — Collect time range and zone info
  2. Confirm spike — Query hourly traffic to verify the anomaly
  3. Minute-level detail — Narrow to exact spike timing
  4. Identify culprit JA4 — Find JA4 fingerprints with highest request counts
  5. Analyze traffic — For top JA4s, identify paths, user IDs, ASNs
  6. Verify legitimacy — Check bot scores, WAF scores, User-Agent
  7. Extract top users — Find which users made the most requests
  8. Synthesize & report — Combine findings into an incident report
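Steps 2 and 3 differ mainly in bucket size (hourly versus per-minute). The sketch below builds a GraphQL request body for either case. It is an illustration rather than the query shipped in the step files: the helper name `build_traffic_query` is hypothetical, and the dataset and field names (`httpRequestsAdaptiveGroups`, `datetimeHour`, `datetimeMinute`, `datetime_geq`, `datetime_leq`) are assumed from Cloudflare's GraphQL Analytics schema and should be checked against the live schema before use.

```python
import json

# Assumption: these dimension names exist on httpRequestsAdaptiveGroups.
ALLOWED_BUCKETS = {"datetimeHour", "datetimeMinute"}


def build_traffic_query(zone_tag: str, start_utc: str, end_utc: str,
                        bucket: str = "datetimeHour", limit: int = 1000) -> dict:
    """Return a JSON-serializable body for a POST to the GraphQL endpoint."""
    if bucket not in ALLOWED_BUCKETS:
        raise ValueError(f"unsupported bucket: {bucket}")
    # json.dumps produces correctly quoted and escaped string literals.
    query = f"""
    {{
      viewer {{
        zones(filter: {{zoneTag: {json.dumps(zone_tag)}}}) {{
          httpRequestsAdaptiveGroups(
            limit: {limit}
            filter: {{datetime_geq: {json.dumps(start_utc)}, datetime_leq: {json.dumps(end_utc)}}}
            orderBy: [{bucket}_ASC]
          ) {{
            count
            dimensions {{ {bucket} }}
          }}
        }}
      }}
    }}
    """
    return {"query": query}
```

Calling it once with `bucket="datetimeHour"` and again with `bucket="datetimeMinute"` over a narrower window mirrors the move from confirming the spike to pinning down its exact timing.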

Cloudflare API MCP


All Cloudflare interactions use two tools:
  • mcp__cloudflare-api__search: discovers API endpoints by searching the OpenAPI spec
  • mcp__cloudflare-api__execute: executes API calls via cloudflare.request() (GraphQL analytics via POST to /graphql, Radar via REST, zone operations via /zones)
See Cloudflare API MCP Reference for query patterns and examples.

JA4 TLS Fingerprints


  • Format: t13d311200_e8f1e7e78f70_d339722ba4af
  • A single fingerprint across millions of requests indicates backend service configuration, not individual users
  • Useful for identifying automated/service-to-service traffic
  • Cross-reference with Known Fingerprints before flagging as unknown
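To make the fingerprint format concrete, here is a small sketch that splits a JA4 string into its fields, following the published JA4 layout (protocol, TLS version, SNI flag, cipher count, extension count, ALPN marker, then two truncated hashes). The `JA4` class and `parse_ja4` function are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JA4:
    protocol: str         # "t" for TCP, "q" for QUIC
    tls_version: str      # e.g. "13" for TLS 1.3
    sni: str              # "d" if SNI names a domain, "i" otherwise
    cipher_count: int     # number of cipher suites offered
    extension_count: int  # number of extensions offered
    alpn: str             # first and last character of the first ALPN value, "00" if none
    cipher_hash: str      # truncated hash over the sorted cipher list
    extension_hash: str   # truncated hash over the sorted extensions and signature algorithms


def parse_ja4(fingerprint: str) -> JA4:
    """Split a JA4 string of the form a_b_c into named fields."""
    parts = fingerprint.split("_")
    if len(parts) != 3 or len(parts[0]) != 10:
        raise ValueError(f"not a JA4 fingerprint: {fingerprint!r}")
    a, b, c = parts
    return JA4(a[0], a[1:3], a[3], int(a[4:6]), int(a[6:8]), a[8:10], b, c)
```

Applied to the example above, the first segment decodes as TCP, TLS 1.3, SNI present, 31 cipher suites, 12 extensions, and no ALPN. Two clients built from the same TLS library and configuration produce the same values, which is why one fingerprint can cover an entire backend fleet.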

Cloudflare Sampled Data


Firewall events use adaptive sampling. Numbers are sampled counts, not actual totals. Use them for pattern identification and relative comparisons — top users in sample likely represent top users overall. Always note this in reports.
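Since sampled counts are only meaningful relative to each other, one way to report them is as shares of the sample rather than as absolute numbers. The sketch below assumes the sampled counts have already been fetched into a plain dictionary; `sample_shares` is a hypothetical helper name.

```python
def sample_shares(sampled_counts: dict[str, int]) -> list[tuple[str, float]]:
    """Rank keys by sampled count and express each as a fraction of the sample."""
    total = sum(sampled_counts.values())
    if total == 0:
        return []
    ranked = sorted(sampled_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(key, count / total) for key, count in ranked]
```

Reporting "user-a accounts for about 60% of sampled events" stays accurate under sampling, whereas "user-a made 60 requests" would not.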

Common Failure Patterns


Quickly identify root causes using these patterns:
| Pattern | Signal | Resolution |
| --- | --- | --- |
| Circuit Breaker Cascade | 429 → timeout → breaker opens | Scale service or add rate limiting |
| Retry Storm | Error count exceeds initial traffic | Add exponential backoff, client-side circuit breaker |
| Single User Amplification | One user dominates request count | Contact user, fix frontend logic |
| Undersized Service | Normal distribution, fails at <10 req/sec | Scale service capacity urgently |
| Cascading Failure | Multiple services failing sequentially | Isolate fault, restart root service |
| Cache Stampede | Spike after cache expiration | Cache lock, stale-while-revalidate |
Detailed descriptions and resolution steps: Failure Patterns Reference
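Two of these signals are simple numeric comparisons and can be checked mechanically. The sketch below covers only Retry Storm and Single User Amplification. The function name and the 50% dominance threshold are assumptions made for illustration; the table above does not define what "dominates" means numerically.

```python
def match_patterns(initial_requests: int, error_count: int,
                   top_user_share: float,
                   dominance_threshold: float = 0.5) -> list[str]:
    """Return the names of patterns whose numeric signal is present."""
    matches = []
    # Retry Storm: errors outnumber the traffic that triggered them.
    if error_count > initial_requests:
        matches.append("Retry Storm")
    # Single User Amplification: one user holds a dominant share of requests.
    # Assumption: "dominates" is read here as at least dominance_threshold.
    if top_user_share >= dominance_threshold:
        matches.append("Single User Amplification")
    return matches
```

The remaining patterns depend on ordering and timing across services, so they are better judged from the timeline than from a single pair of numbers.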

Escalation Criteria


| Priority | Condition |
| --- | --- |
| P1: Immediate | Service 429 errors / circuit breaker open, >10% error rate, cascading failures |
| P2: High | Single user >500 req/hour on critical endpoint, sustained spike >50% above baseline, multiple dependencies affected |
| P3: Monitor | Moderate increase <50% above baseline, isolated user anomalies |
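The criteria above can be expressed as a small decision function. This is a sketch under two assumptions that the table does not state: any single condition in a row is enough to trigger that priority, and anything that matches neither P1 nor P2 falls through to P3. The `Signals` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    has_429_or_breaker_open: bool = False
    error_rate: float = 0.0                 # fraction: 0.10 means 10%
    cascading_failures: bool = False
    top_user_req_per_hour: int = 0          # measured on a critical endpoint
    spike_above_baseline: float = 0.0       # fraction: 0.50 means 50% above baseline
    spike_sustained: bool = False
    multiple_dependencies_affected: bool = False


def classify_priority(s: Signals) -> str:
    """Map observed signals to P1, P2, or P3 (any matching condition wins)."""
    if s.has_429_or_breaker_open or s.error_rate > 0.10 or s.cascading_failures:
        return "P1"
    if (s.top_user_req_per_hour > 500
            or (s.spike_sustained and s.spike_above_baseline > 0.50)
            or s.multiple_dependencies_affected):
        return "P2"
    return "P3"
```

For example, a sustained spike at 80% above baseline with a 2% error rate classifies as P2, while the same spike with a 12% error rate classifies as P1.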

Incident Report


Document findings using the Incident Report Template covering metrics, timeline, security analysis, root cause, and recommendations.

Tips


  • Ask for time range first using AskUserQuestion if not provided
  • Identify JA4 dynamically — query Cloudflare, don't assume
  • Only ask the user about unknown/suspicious User-Agents — skip well-known bots and clearly internal services
  • Calculate actual req/sec to understand service load
  • Document findings immediately using the incident template
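The req/sec tip is plain arithmetic: divide the request count for a bucket by the bucket length in seconds. A minimal sketch, with a hypothetical function name:

```python
def requests_per_second(request_count: int, window_seconds: int) -> float:
    """Average request rate over a window, e.g. 60 for a minute bucket."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return request_count / window_seconds
```

A minute bucket with 5,400 requests averages 90 req/sec, while an hourly bucket with 36,000 requests averages only 10 req/sec. Hourly averages can hide short bursts, so the minute-level figure is usually the one to compare against service capacity.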

Reference Files


Steps


  1. Get parameters
  2. Confirm spike
  3. Minute-level detail
  4. Identify culprit JA4
  5. Analyze traffic
  6. Verify legitimacy
  7. Extract top users
  8. Synthesize & report

References


  • Cloudflare API MCP
  • Known Fingerprints
  • Security Scores
  • Failure Patterns
  • Incident Report Template