security-alert-triage


Alert Triage

Analyze Elastic Security alerts one at a time: gather context, classify, create a case, and acknowledge. This skill depends on the `case-management` skill for case creation.

Prerequisites

Install dependencies before first use from the `skills/security` directory:
bash
cd skills/security && npm install
Set the required environment variables (or add them to a `.env` file in the workspace root):
bash
export ELASTICSEARCH_URL="https://your-cluster.es.cloud.example.com:443"
export ELASTICSEARCH_API_KEY="your-api-key"
export KIBANA_URL="https://your-cluster.kb.cloud.example.com:443"
export KIBANA_API_KEY="your-kibana-api-key"

Quick start

Run all commands from the workspace root. Always follow the order fetch → investigate → document → acknowledge. Call the tools directly; do not read the skill file or explore the workspace first.
bash
node skills/security/alert-triage/scripts/fetch-next-alert.js
node skills/security/case-management/scripts/case-manager.js find --tags "agent_id:<id>"
node skills/security/alert-triage/scripts/run-query.js --query-file query.esql --type esql
node skills/security/case-management/scripts/case-manager.js create --title "..." --description "..." --tags "classification:..." "agent_id:<id>" --severity <level> --yes
node skills/security/case-management/scripts/case-manager.js attach-alert --case-id <id> --alert-id <id> --alert-index <index> --rule-id <uuid> --rule-name "<name>" --yes
node skills/security/alert-triage/scripts/acknowledge-alert.js --related --agent <id> --timestamp <ts> --window 60 --yes

Common multi-step workflows

| Task | Tools to call (in order) |
| --- | --- |
| End-to-end triage | `fetch_next_alert` → `run_query` (context) → `case_manager` create (case) → `acknowledge_alert` |
| Gather context | `run_query` (process tree, network, related alerts) |
| Create case after classification | `case_manager` create → `case_manager` attach-alert |
| Acknowledge after triage | `acknowledge_alert` (related mode for batch) |

Always complete the full workflow: fetch → investigate → document → acknowledge. Do not stop after gathering context — create or update a case with findings before acknowledging.
Critical execution rules:
  • Start executing tools immediately — do not read SKILL.md, browse the workspace, or list files first.
  • For ES|QL queries, write the query to a temporary `.esql` file, then pass it via `--query-file`. Do not use `edit_file`; use a single `shell` call with `echo "..." > query.esql && node ... --query-file query.esql`.
  • Keep context gathering focused: run 2-4 targeted queries (process tree, network, related alerts), not 10+.
  • Report only what tools return. Copy identifiers verbatim — do not paraphrase IDs, timestamps, or hostnames.
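
The query-file rule above can be sketched in Node.js. This is a minimal illustration under stated assumptions, not part of the skill: `writeQueryFile` and `runQueryArgs` are hypothetical helper names, and the sample query is only a placeholder. The script path and the `--query-file` and `--type` flags are the ones shown in this document.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: write an ES|QL query to a temporary .esql file so the
// pipe characters in the query never pass through a shell.
function writeQueryFile(query, dir = os.tmpdir()) {
  const file = path.join(dir, `query-${process.pid}-${Date.now()}.esql`);
  fs.writeFileSync(file, query, "utf8");
  return file;
}

// Hypothetical helper: build the argument list for run-query.js.
function runQueryArgs(queryFile) {
  return [
    "skills/security/alert-triage/scripts/run-query.js",
    "--query-file", queryFile,
    "--type", "esql",
  ];
}

// Placeholder query for illustration only.
const exampleQuery = [
  "FROM .alerts-security.alerts-*",
  '| WHERE kibana.alert.workflow_status == "open"',
  "| LIMIT 10",
].join("\n");

const queryFile = writeQueryFile(exampleQuery);
console.log(["node", ...runQueryArgs(queryFile)].join(" "));
```

Passing the argument list to `child_process.execFile` instead of building a command string would avoid shell quoting altogether, since `execFile` does not spawn a shell by default.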

Critical principles

  • Do NOT classify prematurely. Gather ALL context before deciding benign/unknown/malicious.
  • Most alerts are false positives, even if they look alarming. Rule names like "Malicious Behavior" or severity "critical" are NOT evidence.
  • "Unknown" is acceptable and often correct when evidence is insufficient.
  • MALICIOUS requires strong corroborating evidence: persistence + C2, credential theft, lateral movement — not only suspicious API calls.
  • Report tool output verbatim. Copy IDs, hostnames, timestamps, and counts exactly as returned by tools. Do not round numbers, abbreviate IDs, or paraphrase error messages.

Workflow

When triaging multiple alerts, group first, then triage each group:
text
- [ ] Step 0: Group alerts by agent/host and time window
- [ ] Step 1: Check existing cases
- [ ] Step 2: Gather full context (DO NOT SKIP)
- [ ] Step 3: Create or update case (only AFTER context gathered)
- [ ] Step 4: Acknowledge alert and all related alerts
- [ ] Step 5: Fetch next alert group and repeat

Step 0: Group alerts before triaging

When the user asks about multiple open alerts, group them first to avoid redundant investigation: query open alerts, group by `agent.id`, sub-group by time window (~5 min = likely one incident), and triage each group as a single unit.
Use ES|QL for an overview (write to file first for PowerShell):
esql
FROM .alerts-security.alerts-*
| WHERE kibana.alert.workflow_status == "open" AND @timestamp >= "<start>"
| STATS alert_count=COUNT(*), rules=VALUES(kibana.alert.rule.name) BY agent.id
| SORT alert_count DESC
For full query templates, see references/classification-guide.md.
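
The grouping described above can be sketched as follows. This is only an illustration under stated assumptions: the alert shape (`id`, `agentId`, `timestamp`) and the function name `groupAlerts` are hypothetical, and the sub-grouping uses a gap rule (a new group starts when more than 5 minutes pass since the previous alert on the same agent), which is one possible reading of "time window".

```javascript
// Hypothetical alert shape: { id, agentId, timestamp } with an ISO-8601 timestamp.
function groupAlerts(alerts, windowMinutes = 5) {
  const windowMs = windowMinutes * 60 * 1000;

  // First level: bucket alerts by agent.
  const byAgent = new Map();
  for (const alert of alerts) {
    if (!byAgent.has(alert.agentId)) byAgent.set(alert.agentId, []);
    byAgent.get(alert.agentId).push(alert);
  }

  const groups = [];
  for (const [agentId, agentAlerts] of byAgent) {
    // Second level: sort oldest first, then start a new group whenever the
    // gap to the previous alert exceeds the window.
    agentAlerts.sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
    let current = null;
    let previousTime = null;
    for (const alert of agentAlerts) {
      const time = Date.parse(alert.timestamp);
      if (current === null || time - previousTime > windowMs) {
        current = { agentId, alerts: [] };
        groups.push(current);
      }
      current.alerts.push(alert);
      previousTime = time;
    }
  }
  return groups;
}

// Made-up sample data for illustration.
const sample = [
  { id: "a1", agentId: "agent-1", timestamp: "2024-01-01T10:00:00Z" },
  { id: "a2", agentId: "agent-1", timestamp: "2024-01-01T10:03:00Z" },
  { id: "a3", agentId: "agent-1", timestamp: "2024-01-01T11:00:00Z" },
  { id: "b1", agentId: "agent-2", timestamp: "2024-01-01T10:01:00Z" },
];
const groups = groupAlerts(sample);
console.log(groups.map((g) => `${g.agentId}: ${g.alerts.map((a) => a.id).join(",")}`));
```

With the sample data, `a1` and `a2` fall into one group, `a3` forms its own group on the same agent, and `b1` is a separate group on the second agent.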

Step 1: Check existing cases

Before creating a new case, check if this alert belongs to an existing one. Use the `case-management` skill:
bash
node skills/security/case-management/scripts/case-manager.js find --tags "agent_id:<agent_id>"
node skills/security/case-management/scripts/case-manager.js cases-for-alert --alert-id <alert_id>
Look for cases with the same agent ID, user, or related detection rule within a similar time window.
Note: `find --search` may return 500 errors on Serverless. Use `find --tags` or `list` instead.

Step 2: Gather context

This is the most important step. Do not skip or shortcut it. Complete ALL substeps before forming any classification opinion.
Time range warning: Alerts may be days or weeks old. NEVER use relative time like `NOW() - 1 HOUR`. Extract the alert's `@timestamp` and build queries around that time with a +/- 1 hour window.
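
As a minimal sketch of the time-range rule, the helper below turns an alert's `@timestamp` into absolute start and end bounds that can be pasted into a query. The function name `alertTimeWindow` and the sample timestamp are hypothetical; the +/- 1 hour default follows the guidance above.

```javascript
// Build an absolute time window around the alert's @timestamp instead of
// relying on NOW()-relative ranges.
function alertTimeWindow(alertTimestamp, minutesBefore = 60, minutesAfter = 60) {
  const center = Date.parse(alertTimestamp);
  if (Number.isNaN(center)) {
    throw new Error(`Unparseable alert timestamp: ${alertTimestamp}`);
  }
  return {
    start: new Date(center - minutesBefore * 60 * 1000).toISOString(),
    end: new Date(center + minutesAfter * 60 * 1000).toISOString(),
  };
}

// Made-up timestamp for illustration.
const timeWindow = alertTimeWindow("2024-03-05T14:30:00.000Z");
console.log(timeWindow);
// { start: '2024-03-05T13:30:00.000Z', end: '2024-03-05T15:30:00.000Z' }
```

Narrower windows use the same helper, for example `alertTimeWindow(ts, 5, 10)` for a range from 5 minutes before to 10 minutes after the alert.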
Substeps: (2a) related alerts on the same agent/user; (2b) rule frequency across the environment (a rule that fires often is more likely to produce false positives); (2c) entity context: process tree, network, registry, files; (2d) behavior investigation: persistence, C2, lateral movement, credential access.
Example: process tree (use ES|QL with `KEEP`; avoid `--full`, which produces 10K+ lines):
esql
FROM logs-endpoint.events.process-*
| WHERE agent.id == "<agent_id>" AND @timestamp >= "<alert_time - 5min>" AND @timestamp <= "<alert_time + 10min>"
  AND process.parent.name IS NOT NULL
  AND process.name NOT IN ("svchost.exe", "conhost.exe", "agentbeat.exe")
| KEEP @timestamp, process.name, process.command_line, process.pid, process.parent.name, process.parent.pid
| SORT @timestamp | LIMIT 80

| Data type | Index pattern |
| --- | --- |
| Alerts | `.alerts-security.alerts-*` |
| Processes | `logs-endpoint.events.process-*` |
| Network | `logs-endpoint.events.network-*` |
| Logs | `logs-*` |

For full query templates and classification criteria, see references/classification-guide.md.

Step 3: Create or update case

After gathering context, create a case and attach the alert(s). Use `--rule-id` and `--rule-name` (required; 400 error without them):
bash
node skills/security/case-management/scripts/case-manager.js create \
  --title "<concise summary>" \
  --description "<findings, IOCs, attack chain, MITRE techniques>" \
  --tags "classification:<benign|unknown|malicious>" "confidence:<0-100>" "mitre:<technique>" "agent_id:<id>" \
  --severity <low|medium|high|critical>

node skills/security/case-management/scripts/case-manager.js attach-alert \
  --case-id <case_id> --alert-id <alert_id> --alert-index <index> \
  --rule-id <rule_uuid> --rule-name "<rule name>"

Multiple alerts: `attach-alerts --alert-ids <id1> <id2>`

Add notes: `add-comment --case-id <id> --comment "Findings..."`


**Case description:** Summary (1-2 sentences); Attack chain; IOCs (hashes, IPs, paths); MITRE techniques; Behavioral
findings; Response context (remediation, credentials at risk).
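
The tag format used by the `create` command can be assembled and checked before the call. The sketch below is hypothetical: `buildCaseTags` is not part of the skill, the integer check on `confidence` is an assumption (the document only gives the range 0-100), and the MITRE technique ID is an illustrative value.

```javascript
const CLASSIFICATIONS = ["benign", "unknown", "malicious"];

// Hypothetical helper: build the values passed to --tags on case creation.
function buildCaseTags({ classification, confidence, mitre = [], agentId }) {
  if (!CLASSIFICATIONS.includes(classification)) {
    throw new Error(`classification must be one of ${CLASSIFICATIONS.join(", ")}`);
  }
  // Assumption: confidence is a whole number in the documented 0-100 range.
  if (!Number.isInteger(confidence) || confidence < 0 || confidence > 100) {
    throw new Error("confidence must be an integer from 0 to 100");
  }
  return [
    `classification:${classification}`,
    `confidence:${confidence}`,
    ...mitre.map((technique) => `mitre:${technique}`),
    `agent_id:${agentId}`,
  ];
}

const tags = buildCaseTags({
  classification: "unknown",
  confidence: 40,
  mitre: ["T1059.001"], // illustrative technique ID
  agentId: "<id>",
});
console.log(tags);
```

Rejecting unexpected classification values early keeps tag-based lookups such as `find --tags "agent_id:<id>"` consistent across cases.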

Step 4: Acknowledge alerts

Acknowledge ALL related alerts together. Use `--dry-run` first to confirm scope, then run without it:
bash
# By host name: preferred when triaging a host
node skills/security/alert-triage/scripts/acknowledge-alert.js --query --host <hostname> --dry-run
node skills/security/alert-triage/scripts/acknowledge-alert.js --query --host <hostname> --yes
# By agent ID: preferred when agent.id is known
node skills/security/alert-triage/scripts/acknowledge-alert.js --related --agent <id> --timestamp <ts> --window 60 --dry-run
node skills/security/alert-triage/scripts/acknowledge-alert.js --related --agent <id> --timestamp <ts> --window 60 --yes

Increase `--window` for longer attack chains (e.g., `300` for 5 minutes). Report the exact count of acknowledged alerts from the tool output. Pass `--yes` to skip the confirmation prompt (required when called by an agent).

Step 5: Repeat

bash
node skills/security/alert-triage/scripts/fetch-next-alert.js

Tool reference

fetch-next-alert.js

Fetches the oldest unacknowledged Elastic Security alert.
bash
node skills/security/alert-triage/scripts/fetch-next-alert.js [--days <n>] [--json] [--full] [--verbose]

run-query.js

Runs KQL or ES|QL queries against Elasticsearch.
PowerShell warning: ES|QL queries contain pipe characters (`|`), which PowerShell interprets as shell pipes. ALWAYS use `--query-file` for ES|QL:
bash
# Write query to file, then run
node skills/security/alert-triage/scripts/run-query.js --query-file query.esql --type esql

KQL queries without pipes can be passed directly:
bash
node skills/security/alert-triage/scripts/run-query.js "agent.id:<id>" --index "logs-*" --days 7

| Arg | Description |
| --- | --- |
| `query` | KQL query (positional) |
| `--query-file`, `-q` | Read query from file (required for ES\|QL on PowerShell) |
| `--type`, `-t` | `kql` or `esql` (default: `kql`) |
| `--index`, `-i` | Index pattern (default: `logs-*`) |
| `--size`, `-s` | Max results (default: 100) |
| `--days`, `-d` | Limit to last N days |
| `--json` | Raw JSON output |
| `--full` | Full document source |


acknowledge-alert.js

Acknowledges alerts by updating `workflow_status` to `acknowledged`.

| Mode | Command |
| --- | --- |
| Single | `node skills/security/alert-triage/scripts/acknowledge-alert.js <alert_id> --index <index> --yes` |
| Related | `node skills/security/alert-triage/scripts/acknowledge-alert.js --related --agent <id> --timestamp <ts> [--window 60] --yes` |
| By host | `node skills/security/alert-triage/scripts/acknowledge-alert.js --query --host <hostname> [--time-start <ts>] [--time-end <ts>] --yes` |
| Query | `node skills/security/alert-triage/scripts/acknowledge-alert.js --query --agent <id> [--time-start <ts>] [--time-end <ts>] --yes` |
| Dry run | Add `--dry-run` to any mode (no confirmation needed) |
| Confirm | All write modes prompt for confirmation; pass `--yes` to skip |

Examples

  • "Fetch the next unacknowledged alert and triage it"
  • "Investigate alert ID abc-123 — gather context, classify, and create a case if malicious"
  • "Process the top 5 critical alerts from the last 24 hours"

Guidelines

  • Report only tool output — do not invent IDs, hostnames, IPs, or details not present in the tool response.
  • Preserve identifiers from the request — use exact values the user provides in tool calls and responses.
  • Confirm actions concisely using the tool's return data.
  • Distinguish facts from inference — label conclusions beyond tool output as your assessment.

Production use

  • All write operations (`acknowledge-alert.js`) prompt for confirmation. Pass `--yes` or `-y` to skip when called by an agent.
  • Use `--dry-run` before bulk acknowledgments to preview scope without modifying data.
  • The acknowledge script uses the Kibana Detection Engine API, which is compatible with both self-managed and Serverless deployments.
  • Verify environment variables point to the intended cluster before running any script — no undo for acknowledgments.

Environment variables

| Variable | Required | Description |
| --- | --- | --- |
| `ELASTICSEARCH_URL` | Yes | Elasticsearch URL |
| `ELASTICSEARCH_API_KEY` | Yes | Elasticsearch API key |
| `KIBANA_URL` | Yes | Kibana URL (for case management) |
| `KIBANA_API_KEY` | Yes | Kibana API key (for case management) |
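
A quick pre-flight check for the four required variables might look like the sketch below. `missingEnv` is a hypothetical helper, not part of the skill; it only inspects the object it is given (by default `process.env`) and does not read a `.env` file.

```javascript
const REQUIRED_ENV = [
  "ELASTICSEARCH_URL",
  "ELASTICSEARCH_API_KEY",
  "KIBANA_URL",
  "KIBANA_API_KEY",
];

// Return the names of required variables that are unset or blank.
function missingEnv(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === "");
}

// Illustrative input: one variable set, one blank, two absent.
const missing = missingEnv({
  ELASTICSEARCH_URL: "https://your-cluster.es.cloud.example.com:443",
  KIBANA_URL: " ",
});
console.log(missing);
// [ 'ELASTICSEARCH_API_KEY', 'KIBANA_URL', 'KIBANA_API_KEY' ]
```

Calling `missingEnv()` with no argument checks the current process environment; an empty result means all four variables are present.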