ingest-pipelines


Skill authority

The rules and patterns defined in this skill and its reference files are the authoritative source of truth. When examining existing integrations in the `elastic/integrations` repository for reference, you may encounter patterns that conflict with what is specified here; many integrations contain legacy patterns that predate current standards. Always follow this skill over patterns observed in other integrations. If a reference integration uses a deprecated or prohibited pattern, do not copy it.

When to use

Use this skill when tasks include:
  • building or modifying `elasticsearch/ingest_pipeline/default.yml` for a data stream
  • choosing parser and normalization processors (`grok`, `dissect`, `json`, `kv`, `date`, `convert`)
  • designing conditional branches and sub-pipeline routing with `pipeline` processors
  • implementing resilient error handling with top-level `on_failure`
  • tuning processor order for ingest performance and maintainability

When not to use

Do not use this skill as the primary guide for:
  • ECS field selection, categorization values, and field mapping strategy (`ecs-field-mappings`)
  • elastic-package command and stack lifecycle workflows (`elastic-package-cli`)
  • test fixture authoring and expected output workflows (`integration-testing`, `references/pipeline-testing.md`)

Pipeline anatomy

In integration packages, ingest pipelines live under:
data_stream/<stream>/elasticsearch/ingest_pipeline/
Every stream usually has a `default.yml` with:
  • `description`
  • `processors` list
  • optional pipeline-level `on_failure`
Keep `default.yml` readable and focused. Move large format-specific logic into sub-pipelines where needed.

ECS version

Set the pipeline ECS reference version explicitly at the top of `processors` (after any introductory processors you already use). Use `9.3.0`; do not pin an older ECS version.
yaml
  - set:
      field: ecs.version
      tag: set_ecs_version
      value: '9.3.0'

Rename vs set (mapping to ECS)

When moving a value from a custom or vendor field into an ECS field, prefer the `rename` processor so the source field is removed and you avoid duplicate data. Use `set` with `copy_from` only when you must keep the source field or when `rename` is not applicable.
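A minimal sketch of the two options, assuming a hypothetical vendor field `json.src_ip` (the field name and both tags are illustrative and not taken from any specific integration). The two processors are alternatives, not a sequence:

```yaml
# Preferred: move the value. json.src_ip no longer exists afterwards.
- rename:
    field: json.src_ip            # hypothetical vendor field
    target_field: source.ip
    ignore_missing: true
    tag: rename_src_ip_to_source_ip

# Alternative, only when the vendor field must be kept: copy the value.
- set:
    field: source.ip
    copy_from: json.src_ip        # hypothetical vendor field
    tag: set_source_ip_from_src_ip
    if: ctx.json?.src_ip != null
```

The `if` guard on the `set` variant keeps the processor from running when the source field is absent, which plays the role that `ignore_missing` plays for `rename`.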

Processor tags

Every processor in the pipeline should have a `tag` (not only processors that can fail). Tags make failures and telemetry attributable to a specific step.

CEL-only opening processors (Agentless metadata and error-only documents)

For CEL-based integrations only, include these before the standard `message` to `event.original` handling when they apply:
  • `remove`: drop Agentless metadata fields (`organization`, `division`, `team`) when all are strings, so they do not collide with ECS. Use `ignore_missing: true` and a conditional `if`.
  • `terminate`: stop processing when the document is an error placeholder from the collector (`ctx.error?.message != null && ctx.message == null && ctx.event?.original == null`).
Non-CEL integrations (logs, syslog, filebeat-style inputs) must not copy this block blindly; those fields and error shapes are specific to the CEL/Agentless path. See the `create-integration` skill: the orchestrator must only expect this block when the data stream uses CEL input.

Standard opening: ECS, optional CEL block, JSE00001, then parse event.original

After the optional CEL-only processors, the pipeline should follow this shape. All parsing (`json`, `csv`, `grok`, etc.) runs on `event.original`. Never overwrite or mutate `event.original` in later processors; derive structured fields into other paths (for example `json`, `_temp.*`, ECS fields).
yaml
description: Parse <dataset> events.
processors:
  - set:
      field: ecs.version
      tag: set_ecs_version
      value: '9.3.0'

  # --- CEL input only (omit for log/syslog-only streams) ---
  - remove:
      field:
        - organization
        - division
        - team
      ignore_missing: true
      if: ctx.organization instanceof String && ctx.division instanceof String && ctx.team instanceof String
      tag: remove_agentless_tags
      description: >-
        Removes the fields added by Agentless as metadata,
        as they can collide with ECS fields.
  - terminate:
      tag: data_collection_error
      if: ctx.error?.message != null && ctx.message == null && ctx.event?.original == null
      description: error message set and no data to process.
  # --- end CEL-only ---

  - rename:
      field: message
      tag: rename_message_to_event_original
      target_field: event.original
      ignore_missing: true
      description: Renames the original `message` field to `event.original` to store a copy of the original message. The `event.original` field is not touched if the document already has one; it may happen when Logstash sends the document.
      if: ctx.event?.original == null
  - remove:
      field: message
      tag: remove_message
      ignore_missing: true
      description: The `message` field is no longer required if the document has an `event.original` field.
      if: ctx.event?.original != null

  # Parse (always read from event.original; do not modify event.original)
  - json:
      field: event.original
      target_field: json
      tag: parse_json
      if: ctx.event?.original != null

  # ... normalize, enrich, ECS categorization, cleanup ...

  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false
      if: ctx.error?.message != null

on_failure:
  - append:
      field: error.message
      value: >-
        Processor '{{{ _ingest.on_failure_processor_type }}}'
        {{{#_ingest.on_failure_processor_tag}}}with tag '{{{ _ingest.on_failure_processor_tag }}}'
        {{{/_ingest.on_failure_processor_tag}}}failed with message '{{{ _ingest.on_failure_message }}}'
  - set:
      field: event.kind
      tag: set_pipeline_error_to_event_kind
      value: pipeline_error
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false

Single-path pattern (linear pipeline)

Use this pattern when one parser flow handles all events. Combine the standard opening (ECS version, optional CEL-only block, JSE00001 rename/remove, parse from `event.original` without mutating it), middle processors with tags on every step, and the pipeline-level `on_failure` and conditional `append` for `preserve_original_event` shown above.
Example middle section (illustrative):
yaml
  - grok:
      field: event.original
      patterns:
        - '^...$'
      tag: parse_main

  - date:
      field: some.time
      target_field: '@timestamp'
      formats: [ISO8601]
      tag: parse_timestamp
  - convert:
      field: http.response.status_code
      type: long
      ignore_missing: true
      tag: convert_status

  - user_agent:
      field: user_agent.original
      ignore_missing: true
      tag: enrich_user_agent
  - geoip:
      field: source.ip
      target_field: source.geo
      ignore_missing: true
      tag: enrich_source_geo
  - geoip:
      database_file: GeoLite2-ASN.mmdb
      field: source.ip
      target_field: source.as
      properties:
        - asn
        - organization_name
      ignore_missing: true
      tag: enrich_source_asn
  - rename:
      field: source.as.asn
      target_field: source.as.number
      ignore_missing: true
      tag: rename_source_asn
  - rename:
      field: source.as.organization_name
      target_field: source.as.organization.name
      ignore_missing: true
      tag: rename_source_as_org

  - set:
      field: event.kind
      tag: set_event_kind
      value: event
  - append:
      field: event.category
      tag: append_event_category_web
      value: web
  - remove:
      field: temp
      ignore_missing: true
      tag: remove_temp

Branching pattern (router + sub-pipelines)

Use branching when event formats or object models diverge:
  • format-based branching (for example JSON vs text)
  • class/category-based branching (for example OCSF class/category routing)
  • object-presence branching (`ctx.ocsf.user != null`)
Pattern:
yaml
processors:
  - pipeline:
      name: '{{ IngestPipeline "pipeline_branch_json" }}'
      if: ctx.event?.original != null && ctx.event.original.startsWith('{')
      ignore_missing_pipeline: true
      tag: route_json
  - pipeline:
      name: '{{ IngestPipeline "pipeline_branch_text" }}'
      if: ctx.event?.original != null && !ctx.event.original.startsWith('{')
      ignore_missing_pipeline: true
      tag: route_text
In large integrations, keep `default.yml` as the router and put branch logic in files like:
  • `pipeline_object_<name>.yml`
  • `pipeline_category_<name>.yml`
See `references/branching-patterns.md` for full patterns from `amazon_security_lake`.
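Object-presence branching follows the same shape as the format-based router above. A sketch, assuming hypothetical sub-pipeline files `pipeline_object_user.yml` and `pipeline_object_device.yml` and an `ocsf` object populated by an earlier parse step (the `device` object and both tags are illustrative):

```yaml
processors:
  - pipeline:
      name: '{{ IngestPipeline "pipeline_object_user" }}'
      # Null-safe access so documents without an ocsf object are skipped.
      if: ctx.ocsf?.user != null
      ignore_missing_pipeline: true
      tag: route_object_user
  - pipeline:
      name: '{{ IngestPipeline "pipeline_object_device" }}'
      if: ctx.ocsf?.device != null
      ignore_missing_pipeline: true
      tag: route_object_device
```

Unlike the JSON/text router, these conditions are not mutually exclusive: a document carrying both objects passes through both sub-pipelines, so each branch should only touch fields belonging to its own object.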

Sub-pipeline routing for multi-log-type integrations

When a data stream receives multiple distinct log types (for example a firewall that emits traffic, auth, and DNS logs in the same stream), do not implement all parsing in a single monolithic `default.yml`. Use `default.yml` as a thin router that detects the log type and delegates to a dedicated sub-pipeline per type.

File layout

text
elasticsearch/ingest_pipeline/
  default.yml              # router only — detects log type, calls sub-pipelines
  pipeline-<type>.yml      # one file per log type (e.g. pipeline-traffic.yml)

Router pattern in default.yml

Use the same `ecs.version`, JSE00001 `rename`/`remove` pair for `message`, and full pipeline-level `on_failure` as in the standard opening. The router only branches to sub-pipelines; it does not parse payloads.
yaml
processors:
  - set:
      field: ecs.version
      tag: set_ecs_version
      value: '9.3.0'
  - rename:
      field: message
      tag: rename_message_to_event_original
      target_field: event.original
      ignore_missing: true
      if: ctx.event?.original == null
  - remove:
      field: message
      tag: remove_message
      ignore_missing: true
      if: ctx.event?.original != null
  - pipeline:
      name: '{{ IngestPipeline "pipeline-traffic" }}'
      if: 'ctx.event?.original != null && ctx.event.original.contains("TRAFFIC")'
      tag: route_traffic
  - pipeline:
      name: '{{ IngestPipeline "pipeline-auth" }}'
      if: 'ctx.event?.original != null && ctx.event.original.contains("AUTH")'
      tag: route_auth
  - pipeline:
      name: '{{ IngestPipeline "pipeline-dns" }}'
      if: 'ctx.event?.original != null && ctx.event.original.contains("DNS")'
      tag: route_dns
on_failure:
  - append:
      field: error.message
      value: >-
        Processor '{{{ _ingest.on_failure_processor_type }}}'
        {{{#_ingest.on_failure_processor_tag}}}with tag '{{{ _ingest.on_failure_processor_tag }}}'
        {{{/_ingest.on_failure_processor_tag}}}failed with message '{{{ _ingest.on_failure_message }}}'
  - set:
      field: event.kind
      tag: set_pipeline_error_to_event_kind
      value: pipeline_error
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false

Rules

  • `default.yml` must contain only routing logic and `on_failure` handling, with no field parsing.
  • Each sub-pipeline handles parsing, ECS mapping, and categorization for its own log type.
  • Each sub-pipeline must have its own `on_failure` block.
  • Name sub-pipeline files `pipeline-<type>.yml` where `<type>` matches the log type identifier used in the routing condition.
  • Each log type gets its own pipeline test fixture file following the naming convention `test-<package>-<datastream>-<type>-sample.log`.
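A sketch of what one sub-pipeline could look like under these rules, here a hypothetical `pipeline-traffic.yml`. The log layout, the `_temp.*` field names, and the processor tags are assumptions made for illustration; the `on_failure` block repeats the template from the standard opening:

```yaml
description: Parse traffic logs (illustrative sub-pipeline).
processors:
  - dissect:
      field: event.original
      # Hypothetical layout: "<timestamp> TRAFFIC <src> <dst> <action>"
      pattern: '%{_temp.timestamp} TRAFFIC %{source.ip} %{destination.ip} %{_temp.action}'
      tag: dissect_traffic
  - append:
      field: event.category
      value: network
      allow_duplicates: false
      tag: append_event_category_network
on_failure:
  - append:
      field: error.message
      value: >-
        Processor '{{{ _ingest.on_failure_processor_type }}}'
        {{{#_ingest.on_failure_processor_tag}}}with tag '{{{ _ingest.on_failure_processor_tag }}}'
        {{{/_ingest.on_failure_processor_tag}}}failed with message '{{{ _ingest.on_failure_message }}}'
  - set:
      field: event.kind
      tag: set_pipeline_error_to_event_kind
      value: pipeline_error
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false
```

Following the fixture naming convention, this log type would be exercised by a file such as `test-<package>-<datastream>-traffic-sample.log`.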

Processor ordering and performance

  • run cheap existence checks before expensive operations
  • drop early if records are out of scope
  • prefer `dissect` over `grok` for stable delimited formats
  • never use a `script` processor when a built-in processor can do the job: `set`, `rename`, `remove`, `append`, `convert`, `dissect`, `grok`, `gsub`, `lowercase`, `uppercase`, and `trim` are all faster than Painless and easier to review. See the cost tiers in `references/processor-cookbook.md` (Processor performance guide).
  • use enrichment processors (`geoip`, `user_agent`) only when needed
  • always anchor `grok` patterns with `^` and `$`; without anchors the regex engine scans the entire input string looking for a partial match, which is slow and can produce incorrect results on noisy log lines
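The ordering guidance above can be sketched as follows. The comment-line check, the log layout, and the `_temp.*` names are hypothetical; the `dissect` and `grok` processors are shown as alternatives for the same line, not as steps to run together:

```yaml
# Cheap scope check first: discard lines the stream never needs.
- drop:
    if: ctx.event?.original != null && ctx.event.original.startsWith('#')
    tag: drop_comment_lines

# Stable space-delimited layout: dissect splits on literals, with no regex matching.
- dissect:
    field: event.original
    pattern: '%{_temp.timestamp} %{source.ip} %{_temp.action}'
    tag: dissect_main

# Variable layout: grok, anchored at both ends so a non-matching line is rejected quickly.
- grok:
    field: event.original
    patterns:
      - '^%{TIMESTAMP_ISO8601:_temp.timestamp} %{IP:source.ip} %{WORD:_temp.action}$'
    tag: grok_main
```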

Mustache template syntax in processor values

Ingest pipeline processors use Mustache templates to reference field values in `value`, `message`, and similar string parameters. Use triple braces `{{{field}}}` with single quotes, never double braces or double quotes:
yaml
# CORRECT: triple braces, single quotes
- append:
    field: related.user
    value: '{{{user.target.email}}}'
    allow_duplicates: false
    if: ctx.user?.target?.email != null

# WRONG: double braces HTML-escape the value; double quotes
- append:
    field: related.user
    value: "{{user.target.email}}"
    allow_duplicates: false
    if: ctx.user?.target?.email != null
Why: Mustache double braces `{{...}}` HTML-encode the value (e.g., `&` becomes `&amp;`), which corrupts data in ingest pipelines. Triple braces `{{{...}}}` emit the raw value. Single quotes prevent YAML from interpreting braces.

**Exception:** `{{ IngestPipeline "..." }}` in `pipeline.name` is a Go template directive processed at build time, not a Mustache template, so it correctly uses double braces.

Error handling essentials

Use pipeline-level `on_failure` as the main error reporting mechanism.
Recommended baseline (order matters):
  • append contextual `error.message` first using `_ingest.on_failure_*` variables (full template in the standard opening example)
  • set `event.kind: pipeline_error` (with a `tag` on the `set` processor)
  • append `preserve_original_event` to `tags` when you need to retain the failed document for triage
  • give every processor a `tag` (not only processors that can fail)
Use processor-level `on_failure` for local cleanup or fallback parsing, not as the primary global error message path.
See `references/error-handling-patterns.md` for full examples and tradeoffs (`ignore_failure`, `fail`, processor-level `on_failure`).
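As an illustration of processor-level `on_failure` used for fallback parsing, here is a sketch that assumes some lines lack a timestamp prefix. The pattern, the `_temp.*` field names, and the tags are hypothetical:

```yaml
- grok:
    field: event.original
    patterns:
      - '^%{TIMESTAMP_ISO8601:_temp.timestamp} %{GREEDYDATA:_temp.body}$'
    tag: grok_header
    on_failure:
      # Local fallback: no timestamp prefix, so treat the whole line as the body.
      - set:
          field: _temp.body
          copy_from: event.original
          tag: set_body_fallback
```

When the handler succeeds, the pipeline continues with the next processor and the pipeline-level `on_failure` is not triggered for this failure, which is why this form suits recoverable cases rather than general error reporting.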

event.original handling (JSE00001)

The `elastic-package build` validator enforces that pipelines correctly handle the `message` to `event.original` rename. This check is known as JSE00001. New packages must comply; some legacy packages exclude it via `validation.yml`.

Required two-processor pattern

Every pipeline that consumes a `message` field must include both processors (typically after `ecs.version` and after any CEL-only `remove`/`terminate` steps when applicable):
yaml
- rename:
    field: message
    tag: rename_message_to_event_original
    target_field: event.original
    ignore_missing: true
    description: Renames the original `message` field to `event.original` to store a copy of the original message. The `event.original` field is not touched if the document already has one; it may happen when Logstash sends the document.
    if: ctx.event?.original == null
- remove:
    field: message
    tag: remove_message
    ignore_missing: true
    description: The `message` field is no longer required if the document has an `event.original` field.
    if: ctx.event?.original != null
Step 1 (`rename`): moves `message` into `event.original`, but only when `event.original` is not already populated (idempotent when a prior pipeline or Logstash has already set it).
Step 2 (`remove`): removes the redundant `message` field when `event.original` is present (after rename or from an upstream producer).

Do NOT add an event.original removal processor at the end of the pipeline

Some existing integrations contain a `remove` processor that deletes `event.original` at the end of the pipeline when `preserve_original_event` is not in `tags`. This pattern is deprecated and must not be used in new pipelines. The removal of `event.original` for storage optimization is now handled by a separate final pipeline outside the integration. Do not copy this pattern from reference integrations that still have it; it is legacy.
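For recognition only, the legacy processor typically looks something like the sketch below. The exact condition varies between integrations, so treat this as an approximation of the shape to watch for, not as a pattern to reuse:

```yaml
# DEPRECATED: shown only so it can be recognized and left out of new pipelines.
- remove:
    field: event.original
    if: ctx.tags == null || !(ctx.tags.contains('preserve_original_event'))
    ignore_failure: true
    ignore_missing: true
```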

Reference

The two-processor JSE00001 pattern (rename + remove of `message`) shown above is required and complete. Do not add any additional `event.original` processors beyond those two.

Timezone handling (tz_offset)

For data streams that include the `tz_offset` manifest var (syslog streams where messages lack a timezone), set `event.timezone` from `_conf.tz_offset` early in the pipeline, before any date parsing:
yaml
- set:
    field: event.timezone
    tag: set_event_timezone
    value: '{{{_conf.tz_offset}}}'
    if: ctx._conf?.tz_offset != null && ctx._conf.tz_offset != ''
This ensures date processors can apply the correct timezone when parsing timestamps that have no timezone component.
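A sketch of how a later `date` processor might consume that value, assuming the timestamp was extracted into a hypothetical `_temp.timestamp` field in classic syslog format. The field name, formats, and tags are illustrative:

```yaml
# Timestamp has no zone: interpret it in the configured offset.
- date:
    field: _temp.timestamp          # hypothetical field from an earlier parse step
    target_field: '@timestamp'
    formats:
      - 'MMM  d HH:mm:ss'
      - 'MMM d HH:mm:ss'
    timezone: '{{{event.timezone}}}'
    tag: date_timestamp_with_timezone
    if: ctx._temp?.timestamp != null && ctx.event?.timezone != null

# No offset configured: rely on the date processor default, which is UTC.
- date:
    field: _temp.timestamp
    target_field: '@timestamp'
    formats:
      - 'MMM  d HH:mm:ss'
      - 'MMM d HH:mm:ss'
    tag: date_timestamp_default_timezone
    if: ctx._temp?.timestamp != null && ctx.event?.timezone == null
```

Splitting the two cases avoids passing an empty template result as `timezone` when the variable is unset.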

Syslog structured data (RFC 5424 SD-ELEMENT) parsing

For vendor `key=value` payloads and RFC 5424 SD-ELEMENT blocks, three strategies are available: KV with `trim_value` (simplest, Strategy 1), `SYSLOG5424SD` grok + KV with regex splits (Strategy 2), and Painless for edge cases with embedded equals or mixed quoting (Strategy 3).
Prefer Strategy 1 or 2; use Painless only when KV edge cases demand it.
See `references/grok-recipes.md` (Syslog structured data strategies) for full code examples, key settings, and reference implementations.
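A minimal sketch of what Strategy 1 might look like. It is not copied from the referenced recipes; the source field `_temp.payload`, the `vendor.product` namespace, and the tag are placeholders:

```yaml
- kv:
    field: _temp.payload            # hypothetical field holding the key=value text
    field_split: ' '
    value_split: '='
    target_field: vendor.product    # placeholder vendor namespace
    trim_value: '"'                 # strip surrounding double quotes from values
    ignore_missing: true
    tag: kv_payload
```

A plain space as `field_split` breaks on quoted values that contain spaces; that is the kind of case where the regex splits of Strategy 2 become necessary.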

Keyword fields delivered as numbers

Fields that carry identifiers, protocol codes, or other opaque values must be declared as `keyword` in `fields.yml`, even when the source data delivers them as numbers. Common examples:
  • network protocol numbers (`network.iana_number`)
  • port numbers used as identifiers
  • error codes, result codes, status codes
  • SNMP OIDs, event IDs, object class codes
Do not add a `convert` processor to stringify these values. Elasticsearch silently coerces numbers into `keyword` strings at index time, so the pipeline can pass the raw numeric value through unchanged.
The field declaration in `fields.yml`:
yaml
- name: network.iana_number
  type: keyword
  description: IANA protocol number.
Because the test runner compares raw value types against declared field types, it will flag `6` (long) as a mismatch for `keyword`. Declare the field in `numeric_keyword_fields` in the pipeline test config so the runner accepts the numeric representation without requiring the fixture to artificially stringify the value. See `integration-testing/references/pipeline-testing.md` for the config syntax.
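As a rough sketch of that test config entry, assuming it lives in a shared file such as `_dev/test/pipeline/test-common-config.yml` (the file location is an assumption here; the referenced testing guide is the authority on where and how to declare it):

```yaml
# Pipeline test config: accept numeric values for these keyword-typed fields.
numeric_keyword_fields:
  - network.iana_number
```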

Vendor field naming

Preserve vendor field names exactly as they appear in the source. Do not rename, reformat, or normalize vendor-specific field names — the only permitted renaming is mapping a vendor field to an ECS field (e.g. renaming
src_ip
to
source.ip
). When a vendor field has no ECS equivalent, keep it under a vendor-namespaced prefix (e.g.
vendor.product.field_name
) using the original name from the source.
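A minimal sketch of both cases follows. The `json` source object, the `vendor.product` namespace, and the `sessionTTL` field are illustrative assumptions, not names taken from a real integration.

```yaml
# Permitted rename: vendor field mapped to its ECS equivalent.
- rename:
    field: json.src_ip
    target_field: source.ip
    tag: rename_src_ip_to_source_ip
    ignore_missing: true
# No ECS equivalent: keep the vendor's original name (spelling and casing
# unchanged) under the vendor namespace.
- rename:
    field: json.sessionTTL
    target_field: vendor.product.sessionTTL
    tag: rename_session_ttl_to_vendor_namespace
    ignore_missing: true
```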

related.ip population

Every IP address present in the document must be appended to
related.ip
.
This includes source, destination, client, server, host, and any other IP fields — whatever applies to the event type.
Use one
append
processor per IP field, guarded by an
if
condition so it is a no-op when the field is absent. Place these processors after all IP fields have been set (for example after
geoip
,
convert
, and any ECS rename steps) and before the cleanup
remove
processors.
yaml
  - append:
      field: related.ip
      tag: append_source_ip_to_related
      value: '{{{source.ip}}}'
      allow_duplicates: false
      if: ctx.source?.ip != null
  - append:
      field: related.ip
      tag: append_destination_ip_to_related
      value: '{{{destination.ip}}}'
      allow_duplicates: false
      if: ctx.destination?.ip != null
  # repeat the same pattern for client.ip, server.ip, host.ip, and any other IP fields the pipeline sets
Rules:
  • Use
    allow_duplicates: false
    on every append to avoid repeated values.
  • Add an
    if
    guard on every processor so it skips fields absent in the event.
  • Add one
    append
    per IP field the pipeline actually writes — do not add processors for fields the pipeline never sets.

Painless script best practices

Before writing any
script
processor, you MUST check whether a built-in processor can do the same job.
script
is the slowest general-purpose processor (Painless compilation + per-document execution). The following operations have dedicated processors that are cheaper and easier to review:
| If you need to … | Use this processor, not `script` |
|---|---|
| Copy, move, or rename a field | `rename` or `set` with `copy_from` |
| Set a constant or derived value | `set` |
| Add a value to a list | `append` |
| Change a field's type | `convert` |
| Extract a substring from a delimited string | `dissect` |
| Extract a substring with regex | `grok` |
| Replace characters in a string | `gsub` |
| Normalize case | `lowercase` / `uppercase` |
Only reach for
script
when no combination of built-in processors can express the logic — for example, ECS categorization lookup tables with 5+ entries (Pattern A), complex conditional arithmetic, or edge-case string parsing that
dissect
and
grok
genuinely cannot handle.
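For instance, a script that moves a vendor value into an ECS field and lowercases it could be expressed with two built-in processors instead. This is a sketch; the `vendor.product.action` field name is a hypothetical stand-in.

```yaml
# Replaces a script along the lines of:
#   ctx.event.action = ctx.vendor.product.action.toLowerCase();
- rename:
    field: vendor.product.action
    target_field: event.action
    tag: rename_action_to_event_action
    ignore_missing: true
- lowercase:
    field: event.action
    tag: lowercase_event_action
    ignore_missing: true
```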
Case-insensitive comparisons — use
equalsIgnoreCase()
when casing is unpredictable
Syslog and vendor devices are often inconsistent about casing, so Painless scripts comparing vendor-specific free-text fields should use
equalsIgnoreCase()
rather than
==
. Apply this case by case rather than as a blanket rule:
  • Use
    equalsIgnoreCase()
    when the vendor field value may vary in casing between devices, firmware versions, or log sources (e.g. action fields like
    allow/Allow/ALLOW
    , severity strings, free-text status fields).
  • Use
    ==
    when the API or spec defines a fixed lowercase enum and the values are always delivered as-specified (e.g. ECS categorization fields, API response fields documented as lowercase-only enums). Adding
    equalsIgnoreCase()
    to fixed-enum fields adds noise without value.
painless
// Correct for unpredictable vendor casing (explicit null checks, per the script-body rule in this section)
if (ctx.vendor != null && ctx.vendor.action != null && ctx.vendor.action.equalsIgnoreCase('allow')) { ... }

// Correct for a fixed lowercase API enum: == is appropriate here
if (ctx.json != null && ctx.json.event_type == 'login') { ... }

// Incorrect for unpredictable casing: breaks on "Allow", "ALLOW"
if (ctx.vendor != null && ctx.vendor.action == 'allow') { ... }
Access
ctx
directly in script bodies — no null-safe operators
In
script
processor
source
blocks, access
ctx
fields directly. Use explicit null checks instead of the null-safe
?.
operator.
painless
// Correct — direct access with explicit null check
if (ctx.source != null && ctx.source.ip != null) { ... }

// Incorrect — null-safe operator in a script body
if (ctx.source?.ip != null) { ... }
Note: null-safe
?.
is acceptable in processor
if
conditions (YAML), which are a different Painless execution context:
yaml
- append:
    field: related.ip
    value: '{{{source.ip}}}'
    if: ctx.source?.ip != null
Other rules
  • Every
    script
    processor must have a
    tag
    and a
    description
    .
  • Keep scripts short and scoped — move complex logic into helper variables inside the script, not across multiple script processors.
  • Do not use
    script
    when built-in processors suffice
    — see the mandatory checklist table at the top of this section.
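Putting these rules together, a script processor might look like the sketch below. It assumes `source.bytes` and `destination.bytes` were already converted to numeric values earlier in the pipeline; summing two fields is used here as an example of arithmetic that built-in processors do not cover.

```yaml
- script:
    tag: script_compute_network_bytes
    description: Sum source.bytes and destination.bytes into network.bytes.
    lang: painless
    # Null-safe ?. is fine here because this is a processor if condition.
    if: ctx.source?.bytes != null && ctx.destination?.bytes != null
    source: |
      // Direct ctx access in the body; the if condition already
      // guarantees both fields exist.
      long total = ((Number) ctx.source.bytes).longValue()
        + ((Number) ctx.destination.bytes).longValue();
      if (ctx.network == null) {
        ctx.network = new HashMap();
      }
      ctx.network.bytes = total;
```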

ECS categorization mapping

When mapping source event types or actions to
event.category
,
event.type
,
event.outcome
, and
event.action
, use the patterns in
references/processor-cookbook.md
(section "ECS categorization mapping patterns"):
  • Pattern A (script with
    params
    lookup table): recommended for 5+ mappings. Mapping data in
    params
    enables Painless compilation caching and keeps the script body generic.
  • Pattern B (
    set
    processors with conditionals): for fewer than 5 mappings where a script is overkill.
  • Pattern C (sub-pipeline): for 100+ mappings, extract the categorization into a dedicated sub-pipeline file.
Do NOT use bulk
append
processors (2 per event type = 50+ processors for 25 types) or inline Painless
if
/
else
chains without
params
(defeats compilation caching). These are explicit anti-patterns — see the cookbook for details.
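To make Pattern A concrete, here is a minimal sketch of a `params` lookup table. It is not necessarily the cookbook's exact form: the `vendor.product.event_type` field and the two table entries are hypothetical, and a real table would hold five or more entries.

```yaml
- script:
    tag: script_map_event_categorization
    description: Map the vendor event type to ECS categorization via a params lookup table.
    lang: painless
    if: ctx.vendor?.product?.event_type != null
    params:
      login_success:
        category: [authentication]
        type: [start]
        outcome: success
      login_failure:
        category: [authentication]
        type: [start]
        outcome: failure
    source: |
      // The body stays generic; all mapping data lives in params.
      def entry = params.get(ctx.vendor.product.event_type);
      if (entry == null) {
        return;
      }
      if (ctx.event == null) {
        ctx.event = new HashMap();
      }
      // Copy the lists so later processors never mutate the shared params.
      ctx.event.category = new ArrayList(entry.category);
      ctx.event.type = new ArrayList(entry.type);
      ctx.event.outcome = entry.outcome;
```

Adding a new event type then means adding one `params` entry, with no change to the script source.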

Grok best practices

  • prefer
    dissect
    when structure is fixed
  • use simpler grok patterns where possible
  • always anchor grok patterns with
    ^
    and
    $
    :
    yaml
    # Correct — anchored, fails fast on non-matching lines
    patterns:
      - '^%{IPORHOST:source.ip} %{USER:user.name} %{DATA:message}$'
    
    # Incorrect — unanchored, scans the whole string for a partial match
    patterns:
      - '%{IPORHOST:source.ip} %{USER:user.name} %{DATA:message}'
  • avoid unnecessary backtracking-heavy custom regex
  • add a
    tag
    to every grok (and every other) processor
For grok syntax (three expression forms, inline regex, type coercion,
pattern_definitions
), syslog header splitting recipes, and common mistakes, see
references/grok-recipes.md
.
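When the line layout is fixed, as in the three-field example above, a `dissect` equivalent might look like the following sketch. It assumes the raw line is in `event.original` and that fields are separated by exactly one space.

```yaml
- dissect:
    field: event.original
    # The last key captures the remainder of the line.
    pattern: '%{source.ip} %{user.name} %{message}'
    tag: dissect_event_original
```

Unlike grok, `dissect` does not validate field contents, so any type checks (for example that `source.ip` is a real IP address) would need a later processor.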

Prohibited patterns

These patterns exist in many legacy integrations but must not be used in new or updated pipelines. Do not copy them from reference integrations.

Never set
event.ingested

The
event.ingested
field is managed by Elasticsearch outside the integration pipeline. Do not add a
set
processor for
event.ingested
in any integration pipeline. This includes patterns like:
yaml
# PROHIBITED: do not use
- set:
    field: event.ingested
    value: '{{{_ingest.timestamp}}}'

The pipeline **should** set `@timestamp` from the original event's timestamp. When the source data contains multiple timestamps, map them as follows:

- **`@timestamp`**: the primary event timestamp parsed from the source data. This is required.
- **`event.created`**: when the event was first created or recorded by the source system (if different from `@timestamp`).
- **`event.start`**: when an activity or period began (e.g., session start, connection start).
- **`event.end`**: when an activity or period ended (e.g., session end, connection close).

If a source timestamp does not match the semantics of `event.created`, `event.start`, or `event.end`, map it to a custom field under the vendor namespace with `type: date` in `fields.yml` and use a `date` processor with the appropriate `target_field`.
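The mapping above could be sketched with `date` processors as follows. All `vendor.product.*` field names and the `ISO8601` format are assumptions for illustration; use whatever format the source actually emits.

```yaml
# Primary event timestamp (target_field defaults to @timestamp).
- date:
    field: vendor.product.event_time
    formats:
      - ISO8601
    tag: date_event_time_to_timestamp
    if: ctx.vendor?.product?.event_time != null
# Start of the activity the event describes.
- date:
    field: vendor.product.session_start
    target_field: event.start
    formats:
      - ISO8601
    tag: date_session_start_to_event_start
    if: ctx.vendor?.product?.session_start != null
# No matching ECS semantics: parse in place under the vendor namespace
# (declared with type: date in fields.yml).
- date:
    field: vendor.product.license_expiry
    target_field: vendor.product.license_expiry
    formats:
      - ISO8601
    tag: date_license_expiry
    if: ctx.vendor?.product?.license_expiry != null
```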

Never use
preserve_duplicate_custom_fields

The
preserve_duplicate_custom_fields
tag pattern — where source fields are copied to ECS fields using
set
with
copy_from
and the originals are conditionally retained — is a legacy anti-pattern. Do not use it in any new or updated pipeline. Do not add a
preserve_duplicate_custom_fields
manifest variable, tag, or conditional logic.
Instead, follow these field mapping rules:
  • When a source field maps to an ECS field, use
    rename
    to move it directly. The source field is removed and no duplicate exists.
  • When a type conversion is needed (e.g., string to date, string to long), use the appropriate processor (
    date
    ,
    convert
    ,
    set
    with
    copy_from
    ) to populate the ECS target field, then
    remove
    the source field in the cleanup section at the end of the pipeline.
  • Never design a pipeline that needs to preserve both the original vendor field and the ECS copy. The ECS field is the canonical location.
If you encounter this pattern in a reference integration, ignore it — it is legacy.
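A short sketch of the preferred flow, using hypothetical `vendor.product.*` source fields: a direct `rename` where no conversion is needed, a `convert` into the ECS target where one is, and removal of the converted source field in the cleanup section.

```yaml
# Direct mapping: the source field is moved, so no duplicate remains.
- rename:
    field: vendor.product.username
    target_field: user.name
    tag: rename_username_to_user_name
    ignore_missing: true
# Type conversion: write the converted value to the ECS field.
- convert:
    field: vendor.product.bytes_sent
    target_field: source.bytes
    type: long
    tag: convert_bytes_sent_to_source_bytes
    ignore_missing: true
# Cleanup section at the end of the pipeline.
- remove:
    field:
      - vendor.product.bytes_sent
    tag: remove_converted_source_fields
    ignore_missing: true
```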

Never add an
event.original
removal processor at the end

As documented in the JSE00001 section above: do not add a
remove
processor for
event.original
at the end of the pipeline. This is handled by a separate final pipeline.

References

  • references/processor-cookbook.md
    — processor selection, parsing/normalization/enrichment examples, ECS categorization mapping patterns (Pattern A/B/C + anti-patterns)
  • references/branching-patterns.md
  • references/error-handling-patterns.md
  • references/grok-recipes.md
    — grok syntax, type coercion, syslog header recipes, common mistakes, pattern library link
  • references/builder-subagent-guidance.md
    always-embedded subagent operating manual: scope boundaries, skill-load sequence, input data paths (CEL-first vs Direct), 9-step pipeline build workflow, "review generated output, never hand-edit expected JSON", reporting contract