ingest-pipelines

Skill authority

The rules and patterns defined in this skill and its reference files are the authoritative source of truth. When examining existing integrations in the `elastic/integrations` repository for reference, you may encounter patterns that conflict with what is specified here; many integrations contain legacy patterns that predate current standards. Always follow this skill over patterns observed in other integrations. If a reference integration uses a deprecated or prohibited pattern, do not copy it.

When to use

Use this skill when tasks include:

- building or modifying `elasticsearch/ingest_pipeline/default.yml` for a data stream
- choosing parser and normalization processors (`grok`, `dissect`, `json`, `kv`, `date`, `convert`)
- designing conditional branches and sub-pipeline routing with `pipeline` processors
- implementing resilient error handling with top-level `on_failure`
- tuning processor order for ingest performance and maintainability

When not to use

Do not use this skill as the primary guide for:

- ECS field selection, categorization values, and field mapping strategy (`ecs-field-mappings`)
- elastic-package command and stack lifecycle workflows (`elastic-package-cli`)
- test fixture authoring and expected output workflows (`integration-testing` → `references/pipeline-testing.md`)

Pipeline anatomy

In integration packages, ingest pipelines live under:

`data_stream/<stream>/elasticsearch/ingest_pipeline/`

Every stream usually has a `default.yml` with:

- `description`
- `processors` list
- optional pipeline-level `on_failure`

Keep `default.yml` readable and focused. Move large format-specific logic into sub-pipelines where needed.

ECS version

Set the pipeline ECS reference version explicitly at the top of `processors` (after any introductory processors you already use). Use `9.3.0`; do not pin an older ECS version.

yaml

  - set:
      field: ecs.version
      tag: set_ecs_version
      value: '9.3.0'

Rename vs set (mapping to ECS)

When moving a value from a custom or vendor field into an ECS field, prefer the `rename` processor so the source field is removed and you avoid duplicate data. Use `set` with `copy_from` only when you must keep the source field or when `rename` is not applicable.
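The two options can be sketched as follows. This is a minimal illustration, not a required pattern: the vendor field `vendor.product.username` and both tags are hypothetical names chosen for the example. Use one option or the other for a given field, not both.

```yaml
# Option A (preferred): rename moves the value and removes the source field.
- rename:
    field: vendor.product.username      # hypothetical vendor field
    target_field: user.name
    ignore_missing: true
    tag: rename_username_to_user_name

# Option B (only when the vendor field must be kept): set with copy_from
# duplicates the value and leaves the source field in place.
- set:
    field: user.name
    copy_from: vendor.product.username  # hypothetical vendor field
    ignore_empty_value: true
    tag: set_user_name_from_username
```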

Processor tags

Every processor in the pipeline should have a `tag` (not only processors that can fail). Tags make failures and telemetry attributable to a specific step.

CEL-only opening processors (Agentless metadata and error-only documents)

For CEL-based integrations only, include these before the standard `message` → `event.original` handling when they apply:

- `remove`: drop Agentless metadata fields (`organization`, `division`, `team`) when all are strings, so they do not collide with ECS. Use `ignore_missing: true` and a conditional `if`.
- `terminate`: stop processing when the document is an error placeholder from the collector (`ctx.error?.message != null && ctx.message == null && ctx.event?.original == null`).

Non-CEL integrations (logs, syslog, filebeat-style inputs) must not copy this block blindly; those fields and error shapes are specific to the CEL/Agentless path. See the `create-integration` skill: the orchestrator must only expect this block when the data stream uses CEL input.

Standard opening: ECS, optional CEL block, JSE00001, then parse `event.original`

After the optional CEL-only processors, the pipeline should follow this shape. All parsing (`json`, `csv`, `grok`, etc.) runs on `event.original`. Never overwrite or mutate `event.original` in later processors; derive structured fields into other paths (for example `json`, `_temp.*`, ECS fields).

yaml

description: Parse <dataset> events.
processors:
  - set:
      field: ecs.version
      tag: set_ecs_version
      value: '9.3.0'

  # --- CEL input only (omit for log/syslog-only streams) ---
  - remove:
      field:
        - organization
        - division
        - team
      ignore_missing: true
      if: ctx.organization instanceof String && ctx.division instanceof String && ctx.team instanceof String
      tag: remove_agentless_tags
      description: >-
        Removes the fields added by Agentless as metadata,
        as they can collide with ECS fields.
  - terminate:
      tag: data_collection_error
      if: ctx.error?.message != null && ctx.message == null && ctx.event?.original == null
      description: error message set and no data to process.
  # --- end CEL-only ---

  - rename:
      field: message
      tag: rename_message_to_event_original
      target_field: event.original
      ignore_missing: true
      description: Renames the original `message` field to `event.original` to store a copy of the original message. The `event.original` field is not touched if the document already has one; it may happen when Logstash sends the document.
      if: ctx.event?.original == null
  - remove:
      field: message
      tag: remove_message
      ignore_missing: true
      description: The `message` field is no longer required if the document has an `event.original` field.
      if: ctx.event?.original != null

  # Parse (always read from event.original; do not modify event.original)
  - json:
      field: event.original
      target_field: json
      tag: parse_json
      if: ctx.event?.original != null

  # ... normalize, enrich, ECS categorization, cleanup ...

  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false
      if: ctx.error?.message != null

on_failure:
  - append:
      field: error.message
      value: >-
        Processor '{{{ _ingest.on_failure_processor_type }}}'
        {{{#_ingest.on_failure_processor_tag}}}with tag '{{{ _ingest.on_failure_processor_tag }}}'
        {{{/_ingest.on_failure_processor_tag}}}failed with message '{{{ _ingest.on_failure_message }}}'
  - set:
      field: event.kind
      tag: set_pipeline_error_to_event_kind
      value: pipeline_error
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false

Single-path pattern (linear pipeline)

Use this pattern when one parser flow handles all events. Combine the standard opening (ECS version, optional CEL-only block, JSE00001 rename/remove, parse from `event.original` without mutating it), middle processors with tags on every step, and the pipeline-level `on_failure` and conditional `append` for `preserve_original_event` shown above.

Example middle section (illustrative):

yaml

  - grok:
      field: event.original
      patterns:
        - '^...$'
      tag: parse_main

  - date:
      field: some.time
      target_field: '@timestamp'
      formats: [ISO8601]
      tag: parse_timestamp
  - convert:
      field: http.response.status_code
      type: long
      ignore_missing: true
      tag: convert_status

  - user_agent:
      field: user_agent.original
      ignore_missing: true
      tag: enrich_user_agent
  - geoip:
      field: source.ip
      target_field: source.geo
      ignore_missing: true
      tag: enrich_source_geo
  - geoip:
      database_file: GeoLite2-ASN.mmdb
      field: source.ip
      target_field: source.as
      properties:
        - asn
        - organization_name
      ignore_missing: true
      tag: enrich_source_asn
  - rename:
      field: source.as.asn
      target_field: source.as.number
      ignore_missing: true
      tag: rename_source_asn
  - rename:
      field: source.as.organization_name
      target_field: source.as.organization.name
      ignore_missing: true
      tag: rename_source_as_org

  - set:
      field: event.kind
      tag: set_event_kind
      value: event
  - append:
      field: event.category
      tag: append_event_category_web
      value: web
  - remove:
      field: temp
      ignore_missing: true
      tag: remove_temp

Branching pattern (router + sub-pipelines)

Use branching when event formats or object models diverge:

- format-based branching (for example JSON vs text)
- class/category-based branching (for example OCSF class/category routing)
- object-presence branching (`ctx.ocsf?.user != null`)

Pattern:

yaml

processors:
  - pipeline:
      name: '{{ IngestPipeline "pipeline_branch_json" }}'
      if: ctx.event?.original != null && ctx.event.original.startsWith('{')
      ignore_missing_pipeline: true
      tag: route_json
  - pipeline:
      name: '{{ IngestPipeline "pipeline_branch_text" }}'
      if: ctx.event?.original != null && !ctx.event.original.startsWith('{')
      ignore_missing_pipeline: true
      tag: route_text

In large integrations, keep `default.yml` as the router and put branch logic in files like:

- `pipeline_object_<name>.yml`
- `pipeline_category_<name>.yml`

See `references/branching-patterns.md` for full patterns from `amazon_security_lake`.

Sub-pipeline routing for multi-log-type integrations

When a data stream receives multiple distinct log types (for example a firewall that emits traffic, auth, and DNS logs in the same stream), do not implement all parsing in a single monolithic `default.yml`. Use `default.yml` as a thin router that detects the log type and delegates to a dedicated sub-pipeline per type.

File layout

text

elasticsearch/ingest_pipeline/
  default.yml              # router only — detects log type, calls sub-pipelines
  pipeline-<type>.yml      # one file per log type (e.g. pipeline-traffic.yml)

Router pattern in `default.yml`

Use the same `ecs.version`, JSE00001 `rename`/`remove` pair for `message`, and full pipeline-level `on_failure` as in the standard opening. The router only branches to sub-pipelines; it does not parse payloads.

yaml

processors:
  - set:
      field: ecs.version
      tag: set_ecs_version
      value: '9.3.0'
  - rename:
      field: message
      tag: rename_message_to_event_original
      target_field: event.original
      ignore_missing: true
      if: ctx.event?.original == null
  - remove:
      field: message
      tag: remove_message
      ignore_missing: true
      if: ctx.event?.original != null
  - pipeline:
      name: '{{ IngestPipeline "pipeline-traffic" }}'
      if: 'ctx.event?.original != null && ctx.event.original.contains("TRAFFIC")'
      tag: route_traffic
  - pipeline:
      name: '{{ IngestPipeline "pipeline-auth" }}'
      if: 'ctx.event?.original != null && ctx.event.original.contains("AUTH")'
      tag: route_auth
  - pipeline:
      name: '{{ IngestPipeline "pipeline-dns" }}'
      if: 'ctx.event?.original != null && ctx.event.original.contains("DNS")'
      tag: route_dns
on_failure:
  - append:
      field: error.message
      value: >-
        Processor '{{{ _ingest.on_failure_processor_type }}}'
        {{{#_ingest.on_failure_processor_tag}}}with tag '{{{ _ingest.on_failure_processor_tag }}}'
        {{{/_ingest.on_failure_processor_tag}}}failed with message '{{{ _ingest.on_failure_message }}}'
  - set:
      field: event.kind
      tag: set_pipeline_error_to_event_kind
      value: pipeline_error
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false

Rules

- `default.yml` must contain only routing logic and `on_failure` handling, with no field parsing.
- Each sub-pipeline handles parsing, ECS mapping, and categorization for its own log type.
- Each sub-pipeline must have its own `on_failure` block.
- Name sub-pipeline files `pipeline-<type>.yml` where `<type>` matches the log type identifier used in the routing condition.
- Each log type gets its own pipeline test fixture file following the naming convention `test-<package>-<datastream>-<type>-sample.log`.
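A sub-pipeline that follows these rules might look like the sketch below. It assumes a hypothetical `pipeline-traffic.yml` and an invented traffic log layout; the `dissect` pattern and the `_temp.timestamp` field are placeholders for illustration, not a real vendor format.

```yaml
# pipeline-traffic.yml (hypothetical sub-pipeline for the TRAFFIC log type)
description: Parse traffic logs.
processors:
  - dissect:
      field: event.original
      # Invented layout for illustration; replace with the real traffic format.
      pattern: '%{_temp.timestamp} TRAFFIC %{source.ip} %{destination.ip}'
      tag: dissect_traffic
  # ... ECS mapping and categorization for traffic events ...

# The sub-pipeline carries its own on_failure block, mirroring the router's.
on_failure:
  - append:
      field: error.message
      value: >-
        Processor '{{{ _ingest.on_failure_processor_type }}}'
        {{{#_ingest.on_failure_processor_tag}}}with tag '{{{ _ingest.on_failure_processor_tag }}}'
        {{{/_ingest.on_failure_processor_tag}}}failed with message '{{{ _ingest.on_failure_message }}}'
  - set:
      field: event.kind
      tag: set_pipeline_error_to_event_kind
      value: pipeline_error
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false
```

The matching test fixture would then be named after the same `<type>`, for example `test-<package>-<datastream>-traffic-sample.log`.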

Processor ordering and performance

- run cheap existence checks before expensive operations
- drop early if records are out of scope
- prefer `dissect` over `grok` for stable delimited formats
- never use a `script` processor when a built-in processor can do the job: `set`, `rename`, `remove`, `append`, `convert`, `dissect`, `grok`, `gsub`, `lowercase`, `uppercase`, and `trim` are all faster than Painless and easier to review. See the cost tiers in `references/processor-cookbook.md` → Processor performance guide.
- use enrichment processors (`geoip`, `user_agent`) only when needed
- always anchor `grok` patterns with `^` and `$`; without anchors the regex engine scans the entire input string looking for a partial match, which is slow and can produce incorrect results on noisy log lines
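The ordering guidance can be illustrated with a short sketch. The log layout, the comment-line convention, and the `_temp.*` field names are assumptions made up for this example.

```yaml
processors:
  # Cheap check first: drop out-of-scope records before any parsing work.
  - drop:
      if: ctx.event?.original != null && ctx.event.original.startsWith('#')
      tag: drop_comment_lines

  # Stable delimited header: dissect avoids regex matching entirely.
  - dissect:
      field: event.original
      pattern: '%{_temp.timestamp} %{_temp.level} %{_temp.payload}'
      tag: dissect_header

  # Variable payload: grok, anchored at both ends with ^ and $.
  - grok:
      field: _temp.payload
      patterns:
        - '^user=%{USERNAME:user.name} status=%{INT:http.response.status_code:long}$'
      ignore_missing: true
      tag: grok_payload
```

Because the `drop` runs first, out-of-scope lines never reach the parsers, and the anchored `grok` either matches the whole payload or fails fast.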

Mustache template syntax in processor values

Ingest pipeline processors use Mustache templates to reference field values in `value`, `message`, and similar string parameters. Use triple braces `{{{field}}}` with single quotes, never double braces or double quotes:

yaml

# CORRECT: triple braces, single quotes
- append:
    field: related.user
    value: '{{{user.target.email}}}'
    allow_duplicates: false
    if: ctx.user?.target?.email != null

# WRONG: double braces HTML-escape the value; double quotes
- append:
    field: related.user
    value: "{{user.target.email}}"
    allow_duplicates: false
    if: ctx.user?.target?.email != null

Why: Mustache double braces `{{...}}` HTML-encode the value (e.g., `&` becomes `&amp;`), which corrupts data in ingest pipelines. Triple braces `{{{...}}}` emit the raw value. Single quotes prevent YAML from interpreting braces.

**Exception:** `{{ IngestPipeline "..." }}` in `pipeline.name` is a Go template directive processed at build time, not a Mustache template; it correctly uses double braces.

Error handling essentials

Use pipeline-level `on_failure` as the main error reporting mechanism.

Recommended baseline (order matters):

- append contextual `error.message` first using `_ingest.on_failure_*` variables (full template in the standard opening example)
- set `event.kind: pipeline_error` (with a `tag` on the `set` processor)
- append `preserve_original_event` to `tags` when you need to retain the failed document for triage
- give every processor a `tag` (not only processors that can fail)

Use processor-level `on_failure` for local cleanup or fallback parsing, not as the primary global error message path.

See `references/error-handling-patterns.md` for full examples and tradeoffs (`ignore_failure`, `fail`, processor-level `on_failure`).
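As a minimal sketch of processor-level `on_failure` used for fallback parsing, the example below tries JSON first and falls back to an anchored `grok` for plain-text lines. The text layout, the `_temp.*` fields, and the tags are assumptions for illustration only.

```yaml
- json:
    field: event.original
    target_field: json
    tag: parse_json
    if: ctx.event?.original != null
    on_failure:
      # Local fallback: treat the line as "<timestamp> <free text>".
      - grok:
          field: event.original
          patterns:
            - '^%{TIMESTAMP_ISO8601:_temp.timestamp} %{GREEDYDATA:_temp.payload}$'
          tag: parse_text_fallback
```

If the fallback `grok` also fails, the failure propagates to the pipeline-level `on_failure`, which remains the single place where `error.message` and `event.kind: pipeline_error` are set.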

event.original handling (JSE00001)

The `elastic-package build` validator enforces that pipelines correctly handle the `message` → `event.original` rename. This check is known as JSE00001. New packages must comply; some legacy packages exclude it via `validation.yml`.

Required two-processor pattern

Every pipeline that consumes a `message` field must include both processors (typically after `ecs.version` and after any CEL-only `remove`/`terminate` steps when applicable):

yaml

- rename:
    field: message
    tag: rename_message_to_event_original
    target_field: event.original
    ignore_missing: true
    description: Renames the original `message` field to `event.original` to store a copy of the original message. The `event.original` field is not touched if the document already has one; it may happen when Logstash sends the document.
    if: ctx.event?.original == null
- remove:
    field: message
    tag: remove_message
    ignore_missing: true
    description: The `message` field is no longer required if the document has an `event.original` field.
    if: ctx.event?.original != null

Step 1 (`rename`): moves `message` into `event.original`, but only when `event.original` is not already populated (idempotent when a prior pipeline or Logstash has already set it).

Step 2 (`remove`): removes the redundant `message` field when `event.original` is present (after rename or from an upstream producer).

Do NOT add an `event.original` removal processor at the end of the pipeline

Some existing integrations contain a `remove` processor that deletes `event.original` at the end of the pipeline when `preserve_original_event` is not in `tags`. This pattern is deprecated and must not be used in new pipelines. The removal of `event.original` for storage optimization is now handled by a separate final pipeline outside the integration. Do not copy this pattern from reference integrations that still have it; it is legacy.
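To help recognize the deprecated pattern during review, it typically resembles the sketch below. The exact condition and options vary between integrations, and the tag shown here is an assumption; the block is included only so it can be spotted and left out of new pipelines.

```yaml
# LEGACY, do not copy into new pipelines:
# conditional removal of event.original at the end of the pipeline.
- remove:
    field: event.original
    if: ctx.tags == null || !(ctx.tags.contains('preserve_original_event'))
    ignore_failure: true
    ignore_missing: true
    tag: remove_original_event   # hypothetical tag
```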

Reference

The two-processor JSE00001 pattern (rename + remove of `message`) shown above is required and complete. Do not add any additional `event.original` processors beyond those two.

Timezone handling (`tz_offset`)

For data streams that include the `tz_offset` manifest var (syslog streams where messages lack a timezone), set `event.timezone` from `_conf.tz_offset` early in the pipeline, before any date parsing:

yaml

- set:
    field: event.timezone
    tag: set_event_timezone
    value: '{{{_conf.tz_offset}}}'
    if: ctx._conf?.tz_offset != null && ctx._conf.tz_offset != ''

This ensures date processors can apply the correct timezone when parsing timestamps that have no timezone component.
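A later `date` processor can then pick up the value through its `timezone` option, as in this sketch. The `_temp.timestamp` field and the syslog-style formats are assumptions for illustration; the second processor covers the case where no offset was configured.

```yaml
# Timestamp without a zone, offset configured: apply event.timezone.
- date:
    field: _temp.timestamp             # hypothetical field holding the raw timestamp
    target_field: '@timestamp'
    formats:
      - 'MMM  d HH:mm:ss'
      - 'MMM d HH:mm:ss'
    timezone: '{{{event.timezone}}}'
    tag: date_with_event_timezone
    if: ctx._temp?.timestamp != null && ctx.event?.timezone != null

# No offset configured: parse with the processor's default timezone.
- date:
    field: _temp.timestamp
    target_field: '@timestamp'
    formats:
      - 'MMM  d HH:mm:ss'
      - 'MMM d HH:mm:ss'
    tag: date_without_event_timezone
    if: ctx._temp?.timestamp != null && ctx.event?.timezone == null
```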

Syslog structured data (RFC 5424 SD-ELEMENT) parsing

For vendor `key=value` payloads and RFC 5424 SD-ELEMENT blocks, three strategies are available: KV with `trim_value` (simplest, Strategy 1), `SYSLOG5424SD` grok + KV with regex splits (Strategy 2), and Painless for edge cases with embedded equals or mixed quoting (Strategy 3).

Prefer Strategy 1 or 2; use Painless only when KV edge cases demand it.

See `references/grok-recipes.md` → Syslog structured data strategies for full code examples, key settings, and reference implementations.
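A rough sketch of Strategy 1 is shown below; the referenced recipes file remains the authority for the actual settings. The field `_temp.payload`, the target namespace `vendor.product`, and the split expression are assumptions for illustration.

```yaml
- kv:
    field: _temp.payload               # hypothetical field holding the key=value text
    # Split only on spaces that are followed by "key=", so values may contain spaces.
    field_split: ' (?=[A-Za-z0-9_]+=)'
    value_split: '='
    target_field: vendor.product       # hypothetical vendor namespace
    # Strip surrounding double quotes from extracted values.
    trim_value: '"'
    ignore_missing: true
    tag: kv_payload
```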

Keyword fields delivered as numbers

Fields that carry identifiers, protocol codes, or other opaque values must be declared as `keyword` in `fields.yml`, even when the source data delivers them as numbers. Common examples:

- network protocol numbers (`network.iana_number`)
- port numbers used as identifiers
- error codes, result codes, status codes
- SNMP OIDs, event IDs, object class codes

Do not add a `convert` processor to stringify these values. Elasticsearch silently coerces numbers into `keyword` strings at index time, so the pipeline can pass the raw numeric value through unchanged.

The field declaration in `fields.yml`:

yaml

- name: network.iana_number
  type: keyword
  description: IANA protocol number.

Because the test runner compares raw value types against declared field types, it will flag a numeric value (long) as a mismatch for `keyword`. Declare the field in `numeric_keyword_fields` in the pipeline test config so the runner accepts the numeric representation without requiring the fixture to artificially stringify the value. See `integration-testing/references/pipeline-testing.md` for the config syntax.
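As a rough sketch only, and assuming the option is a top-level list of field names in the data stream's pipeline test config (the file location in the comment is also an assumption; confirm both against the referenced testing guide):

```yaml
# Assumed location: _dev/test/pipeline/test-common-config.yml
numeric_keyword_fields:
  - network.iana_number
```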

Vendor field naming

厂商字段命名

Preserve vendor field names exactly as they appear in the source. Do not rename, reformat, or normalize vendor-specific field names. The only permitted renaming is mapping a vendor field to an ECS field (e.g. renaming `src_ip` to `source.ip`). When a vendor field has no ECS equivalent, keep it under a vendor-namespaced prefix (e.g. `vendor.product.field_name`) using the original name from the source.
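A minimal sketch of both cases follows. It assumes the parser placed the raw fields under a temporary `json` object and that `vendor.product` is the integration's namespace; `json.SessionID` is a hypothetical vendor field with no ECS equivalent.

```yaml
# Vendor field with an ECS equivalent: rename it to the ECS field.
- rename:
    field: json.src_ip
    target_field: source.ip
    tag: rename_src_ip_to_source_ip
    ignore_missing: true
# Vendor field without an ECS equivalent: keep the original name,
# including its casing, under the vendor namespace.
- rename:
    field: json.SessionID
    target_field: vendor.product.SessionID
    tag: rename_session_id_to_vendor_namespace
    ignore_missing: true
```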


related.ip population


Every IP address present in the document must be appended to `related.ip`. This includes source, destination, client, server, host, and any other IP fields, whatever applies to the event type.

Use one `append` processor per IP field, with an `if` guard so it is skipped when the field is absent. Place these processors after all IP fields have been set (for example after `geoip`, `convert`, and any ECS rename steps) and before the cleanup `remove` processors.

yaml

  - append:
      field: related.ip
      tag: append_source_ip_to_related
      value: '{{{source.ip}}}'
      allow_duplicates: false
      if: ctx.source?.ip != null
  - append:
      field: related.ip
      tag: append_destination_ip_to_related
      value: '{{{destination.ip}}}'
      allow_duplicates: false
      if: ctx.destination?.ip != null
  # repeat the same pattern for client.ip, server.ip, host.ip, and any other IP fields the pipeline sets

Rules:

- Use `allow_duplicates: false` on every append to avoid repeated values.
- Add an `if` guard on every processor so it skips fields absent in the event.
- Add one `append` per IP field the pipeline actually writes; do not add processors for fields the pipeline never sets.


Painless script best practices


Before writing any `script` processor, you MUST check whether a built-in processor can do the same job.

`script` is the slowest general-purpose processor (Painless compilation plus per-document execution). The following operations have dedicated processors that are cheaper and easier to review:

| If you need to … | Use this processor, not `script` |
| --- | --- |
| Copy, move, or rename a field | `rename` or `set` with `copy_from` |
| Set a constant or derived value | `set` |
| Add a value to a list | `append` |
| Change a field's type | `convert` |
| Extract a substring from a delimited string | `dissect` |
| Extract a substring with regex | `grok` |
| Replace characters in a string | `gsub` |
| Normalize case | `lowercase` / `uppercase` |

Only reach for `script` when no combination of built-in processors can express the logic, for example ECS categorization lookup tables with 5+ entries (Pattern A), complex conditional arithmetic, or edge-case string parsing that `dissect` and `grok` genuinely cannot handle.
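To make the checklist concrete, here is a sketch of a task that is sometimes written as a single script (move a value, lowercase it, replace whitespace) expressed with built-in processors only. The source field `vendor.product.action` and the tag names are hypothetical.

```yaml
# Move the vendor value to its ECS field.
- rename:
    field: vendor.product.action
    target_field: event.action
    tag: rename_vendor_action_to_event_action
    ignore_missing: true
# Normalize case without Painless.
- lowercase:
    field: event.action
    tag: lowercase_event_action
    ignore_missing: true
# Replace runs of whitespace with an underscore.
- gsub:
    field: event.action
    pattern: '\s+'
    replacement: '_'
    tag: gsub_event_action_whitespace
    ignore_missing: true
```

Each step carries its own `tag`, so a failure points at one small processor rather than at a line inside a script.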

Case-insensitive comparisons: use `equalsIgnoreCase()` when casing is unpredictable

Syslog and vendor devices are often inconsistent about casing, so Painless scripts comparing vendor-specific free-text fields should use `equalsIgnoreCase()` rather than `==`. However, apply this judgement contextually, not as a blanket rule:

- Use `equalsIgnoreCase()` when the vendor field value may vary in casing between devices, firmware versions, or log sources (e.g. action fields like `allow/Allow/ALLOW`, severity strings, free-text status fields).
- Use `==` when the API or spec defines a fixed lowercase enum and the values are always delivered as specified (e.g. ECS categorization fields, API response fields documented as lowercase-only enums). Adding `equalsIgnoreCase()` to fixed-enum fields adds noise without value.

painless

// Correct for unpredictable vendor casing (explicit null checks, as required in script bodies)
if (ctx.vendor != null && ctx.vendor.action != null && ctx.vendor.action.equalsIgnoreCase('allow')) { ... }

// Correct for a fixed lowercase API enum: == is appropriate here
if (ctx.json != null && ctx.json.event_type == 'login') { ... }

// Incorrect for unpredictable casing: breaks on "Allow", "ALLOW"
if (ctx.vendor != null && ctx.vendor.action == 'allow') { ... }

Access `ctx` directly in script bodies: no null-safe operators

In `script` processor `source` blocks, access `ctx` fields directly. Use explicit null checks instead of the null-safe `?.` operator.

painless

// Correct — direct access with explicit null check
if (ctx.source != null && ctx.source.ip != null) { ... }

// Incorrect — null-safe operator in a script body
if (ctx.source?.ip != null) { ... }

Note: null-safe `?.` is acceptable in processor `if` conditions (YAML), which are a different Painless execution context:

yaml

- append:
    field: related.ip
    value: '{{{source.ip}}}'
    if: ctx.source?.ip != null

Other rules

- Every `script` processor must have a `tag` and a `description`.
- Keep scripts short and scoped: move complex logic into helper variables inside the script, not across multiple script processors.
- Do not use `script` when built-in processors suffice; see the mandatory checklist table at the top of this section.
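The rules in this section can be combined as in the sketch below, which uses conditional arithmetic as the justification for a script. It assumes `source.bytes` and `destination.bytes` were already converted to numbers earlier in the pipeline; the tag and description text are illustrative.

```yaml
- script:
    lang: painless
    tag: script_compute_network_bytes
    description: Sum source.bytes and destination.bytes into network.bytes.
    # Null-safe ?. is fine here: this is a processor condition, not the script body.
    if: ctx.source?.bytes != null && ctx.destination?.bytes != null
    source: |
      // Assumes both counters are numeric (converted upstream).
      long total = ((Number) ctx.source.bytes).longValue()
        + ((Number) ctx.destination.bytes).longValue();
      // Explicit null check instead of ?. inside the script body.
      if (ctx.network == null) {
        ctx.network = new HashMap();
      }
      ctx.network.bytes = total;
```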


ECS categorization mapping


When mapping source event types or actions to `event.category`, `event.type`, `event.outcome`, and `event.action`, use the patterns in `references/processor-cookbook.md` → ECS categorization mapping patterns:

- Pattern A (script with `params` lookup table): recommended for 5+ mappings. Mapping data in `params` enables Painless compilation caching and keeps the script body generic.
- Pattern B (`set` processors with conditionals): for fewer than 5 mappings where a script is overkill.
- Pattern C (sub-pipeline): for 100+ mappings, extract the categorization into a dedicated sub-pipeline file.

Do NOT use bulk `append` processors (2 per event type = 50+ processors for 25 types) or inline Painless `if`/`else` chains without `params` (defeats compilation caching). These are explicit anti-patterns; see the cookbook for details.
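For orientation, a minimal sketch of what Pattern A might look like is shown below. The cookbook is the authoritative version; the source field `vendor.product.event_type`, the five event type keys, and the tag are hypothetical choices for this illustration.

```yaml
- script:
    lang: painless
    tag: script_map_event_categorization
    description: Look up ECS categorization for the vendor event type.
    if: ctx.vendor?.product?.event_type != null
    params:
      login:
        category: [authentication]
        type: [start]
      logout:
        category: [authentication]
        type: [end]
      file_create:
        category: [file]
        type: [creation]
      file_delete:
        category: [file]
        type: [deletion]
      process_start:
        category: [process]
        type: [start]
    source: |
      // The body stays generic; only params changes between integrations.
      def entry = params.get(ctx.vendor.product.event_type);
      if (entry != null) {
        if (ctx.event == null) {
          ctx.event = new HashMap();
        }
        // Copy the lists so the document does not share objects with params.
        ctx.event.category = new ArrayList(entry.category);
        ctx.event.type = new ArrayList(entry.type);
      }
```

Unknown event types fall through without setting anything, which leaves room for a later default or an explicit failure, whichever the pipeline prefers.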


Grok best practices


- prefer `dissect` when structure is fixed
- use simpler grok patterns where possible
- always anchor grok patterns with `^` and `$`:

yaml

# Correct — anchored, fails fast on non-matching lines
patterns:
  - '^%{IPORHOST:source.ip} %{USER:user.name} %{DATA:message}$'

# Incorrect — unanchored, scans the whole string for a partial match
patterns:
  - '%{IPORHOST:source.ip} %{USER:user.name} %{DATA:message}'

- avoid unnecessary backtracking-heavy custom regex
- add a `tag` to every grok (and every other) processor

For grok syntax (three expression forms, inline regex, type coercion, `pattern_definitions`), syslog header splitting recipes, and common mistakes, see `references/grok-recipes.md`.
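The two processors below sketch the choice described above for the same space-delimited line. They are alternatives, not a sequence, and the tag names are hypothetical.

```yaml
# Option 1, fixed layout: dissect splits on literal delimiters, no regex involved.
- dissect:
    field: event.original
    tag: dissect_fixed_layout
    pattern: '%{source.ip} %{user.name} %{message}'

# Option 2, variable layout: grok, anchored at both ends and tagged.
- grok:
    field: event.original
    tag: grok_variable_layout
    patterns:
      - '^%{IPORHOST:source.ip} %{USER:user.name} %{DATA:message}$'
```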


Prohibited patterns


These patterns exist in many legacy integrations but must not be used in new or updated pipelines. Do not copy them from reference integrations.


Never set `event.ingested`

The `event.ingested` field is managed by Elasticsearch outside the integration pipeline. Do not add a `set` processor for `event.ingested` in any integration pipeline. This includes patterns like:

yaml

# PROHIBITED: do not use
- set:
    field: event.ingested
    value: '{{{_ingest.timestamp}}}'


The pipeline **should** set `@timestamp` from the original event's timestamp. When the source data contains multiple timestamps, map them as follows:

- **`@timestamp`**: the primary event timestamp parsed from the source data. This is required.
- **`event.created`**: when the event was first created or recorded by the source system (if different from `@timestamp`).
- **`event.start`**: when an activity or period began (e.g., session start, connection start).
- **`event.end`**: when an activity or period ended (e.g., session end, connection close).

If a source timestamp does not match the semantics of `event.created`, `event.start`, or `event.end`, map it to a custom field under the vendor namespace with `type: date` in `fields.yml` and use a `date` processor with the appropriate `target_field`.
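A sketch of this timestamp mapping is shown below. The three source fields under `vendor.product` are hypothetical, and ISO 8601 input is assumed; real sources will usually need their own `formats` entries.

```yaml
# Primary event time: target_field defaults to @timestamp.
- date:
    field: vendor.product.event_time
    tag: date_event_time_to_timestamp
    formats:
      - ISO8601
    if: ctx.vendor?.product?.event_time != null
# Start of the activity the event describes.
- date:
    field: vendor.product.session_start
    target_field: event.start
    tag: date_session_start_to_event_start
    formats:
      - ISO8601
    if: ctx.vendor?.product?.session_start != null
# No matching ECS semantics: keep it as a vendor field
# (declared with type: date in fields.yml) and parse it in place.
- date:
    field: vendor.product.policy_updated
    target_field: vendor.product.policy_updated
    tag: date_policy_updated
    formats:
      - ISO8601
    if: ctx.vendor?.product?.policy_updated != null
```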


Never use `preserve_duplicate_custom_fields`

The `preserve_duplicate_custom_fields` tag pattern, where source fields are copied to ECS fields using `set` with `copy_from` and the originals are conditionally retained, is a legacy anti-pattern. Do not use it in any new or updated pipeline. Do not add a `preserve_duplicate_custom_fields` manifest variable, tag, or conditional logic.

Instead, follow these field mapping rules:

- When a source field maps to an ECS field, use `rename` to move it directly. The source field is removed and no duplicate exists.
- When a type conversion is needed (e.g., string to date, string to long), use the appropriate processor (`date`, `convert`, `set` with `copy_from`) to populate the ECS target field, then `remove` the source field in the cleanup section at the end of the pipeline.
- Never design a pipeline that needs to preserve both the original vendor field and the ECS copy. The ECS field is the canonical location.

If you encounter this pattern in a reference integration, ignore it; it is legacy.
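The mapping rules above can be sketched as follows. The vendor fields `vendor.product.user` and `vendor.product.bytes_sent` are hypothetical, as are the tag names.

```yaml
# Direct mapping: rename moves the value, so no duplicate is left behind.
- rename:
    field: vendor.product.user
    target_field: user.name
    tag: rename_user_to_user_name
    ignore_missing: true
# Type conversion: populate the ECS field from the string source.
- convert:
    field: vendor.product.bytes_sent
    target_field: source.bytes
    type: long
    tag: convert_bytes_sent_to_source_bytes
    ignore_missing: true

# ... remaining processors ...

# Cleanup section at the end of the pipeline: drop the converted source field.
- remove:
    field: vendor.product.bytes_sent
    tag: remove_converted_source_fields
    ignore_missing: true
```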


Never add an `event.original` removal processor at the end

As documented in the JSE00001 section above: do not add a `remove` processor for `event.original` at the end of the pipeline. This is handled by a separate final pipeline.


References


- `references/processor-cookbook.md`: processor selection, parsing/normalization/enrichment examples, ECS categorization mapping patterns (Pattern A/B/C + anti-patterns)
- `references/branching-patterns.md`
- `references/error-handling-patterns.md`
- `references/grok-recipes.md`: grok syntax, type coercion, syslog header recipes, common mistakes, pattern library link
- `references/builder-subagent-guidance.md`: always-embedded subagent operating manual covering scope boundaries, skill-load sequence, input data paths (CEL-first vs Direct), 9-step pipeline build workflow, "review generated output, never hand-edit expected JSON", reporting contract

