fluentbit-generator


Fluent Bit Config Generator


Overview


This skill provides a comprehensive workflow for generating production-ready Fluent Bit configurations with best practices built-in. Generate complete pipelines or individual sections (SERVICE, INPUT, FILTER, OUTPUT, PARSER) with proper syntax, optimal performance settings, and automatic validation.
Fluent Bit is a fast and lightweight telemetry agent for logs, metrics, and traces. It's part of the CNCF (Cloud Native Computing Foundation) and is commonly used in Kubernetes environments for log aggregation, forwarding, and processing.

When to Use This Skill


Invoke this skill when:
  • Creating new Fluent Bit configurations from scratch
  • Implementing log collection pipelines (INPUT → FILTER → OUTPUT)
  • Configuring Kubernetes log collection with metadata enrichment
  • Setting up log forwarding to destinations (Elasticsearch, Loki, S3, Kafka, CloudWatch, etc.)
  • Building multi-line log parsing for stack traces
  • Converting existing logging configurations to Fluent Bit
  • Implementing custom parsers for structured logging
  • Working with Fluent Bit plugins that require documentation lookup
  • The user asks to "create", "generate", "build", or "configure" Fluent Bit configs
  • Setting up telemetry pipelines with filters and transformations

Configuration Generation Workflow


Follow this workflow when generating Fluent Bit configurations. Adapt based on user needs:

Stage 1: Understand Requirements


Gather information about the logging infrastructure needs:
  1. Use case identification:
    • Kubernetes log collection (DaemonSet deployment)
    • Application log forwarding
    • System log collection (syslog, systemd)
    • Multi-line log parsing (stack traces, JSON logs)
    • Log aggregation from multiple sources
    • Metrics collection and forwarding
  2. Input sources:
    • tail (file tailing)
    • systemd (systemd journal)
    • tcp/udp (network input)
    • forward (Fluent protocol)
    • http (HTTP endpoint)
    • kubernetes (K8s pod logs)
    • docker (Docker container logs)
    • syslog
    • exec (command execution)
  3. Processing requirements:
    • Parsing (JSON, regex, logfmt)
    • Multi-line handling (stack traces)
    • Filtering (grep, modify, lua)
    • Enrichment (Kubernetes metadata)
    • Transformation (nest, rewrite_tag)
    • Throttling (rate limiting)
  4. Output destinations:
    • Elasticsearch
    • Grafana Loki
    • AWS S3/CloudWatch
    • Kafka
    • HTTP endpoint
    • File
    • stdout (debugging)
    • forward (Fluent protocol)
    • Prometheus remote write
  5. Performance and reliability:
    • Buffer limits (memory constraints)
    • Flush intervals
    • Retry logic
    • TLS/SSL requirements
    • Worker threads (parallelism)
Use AskUserQuestion if information is missing or unclear.
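Once the requirements are clear, they map directly onto a pipeline skeleton. As an illustrative sketch only (the plugin names and paths below are placeholders to be replaced with the sources and destinations chosen in this stage):

```ini
# Hypothetical skeleton: one input, one filter, one output.
[SERVICE]
    Flush     5
    Log_Level info

[INPUT]
    Name  tail
    Tag   app.*
    Path  /var/log/app/*.log

[FILTER]
    Name   grep
    Match  app.*
    Regex  level (error|warn)

[OUTPUT]
    Name   stdout
    Match  app.*
```

Each answer gathered above fills in one of these sections: input sources become `[INPUT]` blocks, processing requirements become `[FILTER]` blocks, and destinations become `[OUTPUT]` blocks.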

Script vs Manual Generation


**Step 1: Always verify script capabilities first:**

```bash
# REQUIRED: Run --help to check if your use case is supported
python3 scripts/generate_config.py --help
```

**Step 2: For supported use cases, prefer using `generate_config.py`** for consistency and tested templates:

```bash
# Generate configuration for a supported use case
python3 scripts/generate_config.py --use-case kubernetes-elasticsearch --output fluent-bit.conf
python3 scripts/generate_config.py --use-case kubernetes-opentelemetry --cluster-name my-cluster --output fluent-bit.conf
```

**Supported use cases:** kubernetes-elasticsearch, kubernetes-loki, kubernetes-cloudwatch, kubernetes-opentelemetry, application-multiline, syslog-forward, file-tail-s3, http-kafka, multi-destination, prometheus-metrics, lua-filtering, stream-processor, custom.

**Step 3: Use manual generation (Stages 3-8)** when:
- The use case is not supported by the script (verified via `--help`)
- Custom plugins or complex filter chains are required (e.g., grep filtering for log levels)
- Non-standard configurations are needed
- The user explicitly requests manual configuration

**Document your decision:** When choosing manual generation, explicitly state why the script was not suitable (e.g., "Manual generation chosen because grep filter for log levels is not supported by the script").

Consulting Examples Before Manual Generation


**REQUIRED before writing any manual configuration:**
  1. Identify the closest matching example from the `examples/` directory:
    • For Kubernetes + Elasticsearch: Read `examples/kubernetes-elasticsearch.conf`
    • For Kubernetes + Loki: Read `examples/kubernetes-loki.conf`
    • For Kubernetes + OpenTelemetry: Read `examples/kubernetes-opentelemetry.conf`
    • For application logs with multiline: Read `examples/application-multiline.conf`
    • For syslog forwarding: Read `examples/syslog-forward.conf`
    • For S3 output: Read `examples/file-tail-s3.conf`
    • For Kafka output: Read `examples/http-input-kafka.conf`
    • For multi-destination: Read `examples/multi-destination.conf`
    • For Prometheus metrics: Read `examples/prometheus-metrics.conf`
    • For Lua filtering: Read `examples/lua-filtering.conf`
    • For stream processing: Read `examples/stream-processor.conf`
    • For production setup: Read `examples/full-production.conf`
  2. Read the example file using the Read tool to understand:
    • Section structure and ordering
    • Parameter values and best practices
    • Comments and documentation style
  3. Read `examples/parsers.conf` for parser definitions - reuse existing parsers rather than recreating them.
  4. Use examples as templates - copy relevant sections and customize for the user's requirements.

Stage 2: Plugin Documentation Lookup (if applicable)


If the configuration requires specific plugins or custom output destinations:
  1. Identify plugins needing documentation:
    • Custom output plugins (proprietary systems)
    • Less common input plugins
    • Complex filter configurations
    • Parser patterns for specific log formats
    • Cloud provider integrations (AWS, GCP, Azure)
  2. Try context7 MCP first (preferred):
    Use mcp__context7__resolve-library-id with "fluent-bit" or "fluent/fluent-bit"
    Then use mcp__context7__get-library-docs with:
    - context7CompatibleLibraryID: /fluent/fluent-bit-docs (or /fluent/fluent-bit)
    - topic: The plugin name and configuration (e.g., "elasticsearch output configuration")
    - page: 1 (fetch additional pages if needed)
  3. Fallback to WebSearch if context7 fails:
    Search query patterns:
    "fluent-bit" "<plugin-type>" "<plugin-name>" "configuration" "parameters" site:docs.fluentbit.io
    
    Examples:
    "fluent-bit" "output" "elasticsearch" "configuration" site:docs.fluentbit.io
    "fluent-bit" "filter" "kubernetes" "configuration" site:docs.fluentbit.io
    "fluent-bit" "parser" "multiline" "configuration" site:docs.fluentbit.io
  4. Extract key information:
    • Required parameters
    • Optional parameters and defaults
    • Configuration examples
    • Performance tuning options
    • Common pitfalls and best practices

Stage 3: SERVICE Section Configuration


ALWAYS start with the SERVICE section - this defines global behavior:

```ini
[SERVICE]
    # Flush interval in seconds - how often to flush data to outputs
    # Lower values = lower latency, higher CPU usage
    # Recommended: 1-5 seconds for most use cases
    Flush        1

    # Daemon mode - run as background process (Off in containers)
    Daemon       Off

    # Log level: off, error, warn, info, debug, trace
    # Recommended: info for production, debug for troubleshooting
    Log_Level    info

    # Optional: Write Fluent Bit's own logs to file
    # Log_File     /var/log/fluent-bit.log

    # Parser configuration file (if using custom parsers)
    Parsers_File parsers.conf

    # Enable built-in HTTP server for metrics and health checks
    # Recommended for Kubernetes liveness/readiness probes
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020

    # Enable storage metrics endpoint
    storage.metrics on

    # Number of worker threads (0 = auto-detect CPU cores)
    # Increase for high-volume environments
    # workers      0
```

Key SERVICE parameters:
  • `Flush` (1-5 sec): Lower for real-time, higher for batching efficiency
  • `Log_Level`: Use `info` in production, `debug` for troubleshooting
  • `HTTP_Server`: Enable for health checks and metrics
  • `Parsers_File`: Reference external parser definitions
  • `storage.metrics`: Enable for monitoring buffer/storage metrics
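If any output will use filesystem buffering (see Stage 6), the storage layer is configured globally in SERVICE. A minimal sketch, assuming the agent can write to `/var/log/flb-storage/` (a placeholder path):

```ini
[SERVICE]
    Flush                     1
    Log_Level                 info
    # Global filesystem buffering; individual outputs opt in
    # with storage.type filesystem
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.checksum          off
    # Memory cap for backlog chunks reloaded from disk at startup
    storage.backlog.mem_limit 5M
```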

Stage 4: INPUT Section Configuration


Create INPUT sections for data sources. Common patterns:

Kubernetes Pod Logs (DaemonSet)


```ini
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    # Exclude Fluent Bit's own logs to prevent loops
    Exclude_Path      /var/log/containers/*fluent-bit*.log
    Parser            docker
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     50MB
    Skip_Long_Lines   On
    Refresh_Interval  10
    Read_from_Head    Off
```

Key INPUT patterns:
  1. tail plugin (most common):
    • `Path`: File path or wildcard pattern
    • `Tag`: Routing tag for filters/outputs
    • `Parser`: Pre-parser for log format (docker, cri, json)
    • `DB`: Position database for crash recovery
    • `Mem_Buf_Limit`: Per-input memory limit (prevents OOM)
    • `Skip_Long_Lines`: Skip lines > 32KB (prevents hang)
    • `Read_from_Head`: Start from beginning of files (Off to collect new logs only)
  2. systemd plugin:

```ini
[INPUT]
    Name              systemd
    Tag               host.*
    Systemd_Filter    _SYSTEMD_UNIT=kubelet.service
    Read_From_Tail    On
```

  3. http plugin (webhook receiver):

```ini
[INPUT]
    Name          http
    Tag           app.logs
    Listen        0.0.0.0
    Port          9880
    Buffer_Size   32KB
```

  4. forward plugin (Fluent protocol):

```ini
[INPUT]
    Name          forward
    Tag           forward.*
    Listen        0.0.0.0
    Port          24224
```

Best practices for INPUT:
  • Always set `Mem_Buf_Limit` to prevent memory issues
  • Use `DB` for tail inputs to track file positions
  • Set appropriate `Tag` patterns for routing
  • Use `Exclude_Path` to prevent log loops
  • Enable `Skip_Long_Lines` for robustness
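On Kubernetes clusters running a CRI runtime (containerd, CRI-O) rather than Docker, the `Parser docker` line in the tail input is typically replaced with the built-in CRI multiline parser. A sketch assuming the standard kubelet log path:

```ini
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    # CRI log format: use the built-in multiline parser
    # instead of "Parser docker"
    multiline.parser  cri
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     50MB
    Skip_Long_Lines   On
```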

Stage 5: FILTER Section Configuration


Create FILTER sections for log processing and enrichment:

Kubernetes Metadata Enrichment


```ini
[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Kube_Tag_Prefix     kube.var.log.containers.
    Merge_Log           On
    Keep_Log            Off
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On
    Labels              On
    Annotations         Off
    Buffer_Size         0
```

Key FILTER patterns:
  1. kubernetes filter (metadata enrichment):
    • `Merge_Log`: Parse JSON logs into structured fields
    • `Keep_Log`: Keep original log field (Off saves space)
    • `K8S-Logging.Parser`: Honor pod parser annotations
    • `K8S-Logging.Exclude`: Honor pod exclude annotations
    • `Labels`: Include pod labels in output
    • `Annotations`: Include pod annotations (optional, increases size)
  2. parser filter (structured parsing):

```ini
[FILTER]
    Name          parser
    Match         *
    Key_Name      log
    Parser        json
    Reserve_Data  On
    Preserve_Key  Off
```

  3. grep filter (include/exclude):

```ini
[FILTER]
    Name          grep
    Match         *
    # Include only error logs
    Regex         level (error|fatal|critical)
    # Exclude health check logs
    Exclude       path /health
```

  4. modify filter (add/remove fields):

```ini
[FILTER]
    Name          modify
    Match         *
    Add           cluster_name production
    Add           environment prod
    Remove        _p
```

  5. nest filter (restructure):

```ini
[FILTER]
    Name          nest
    Match         *
    Operation     lift
    Nested_under  kubernetes
    Add_prefix    k8s_
```

  6. multiline filter (stack traces):

```ini
[FILTER]
    Name          multiline
    Match         *
    multiline.key_content log
    multiline.parser      java, python, go
```

  7. throttle filter (rate limiting):

```ini
[FILTER]
    Name          throttle
    Match         *
    Rate          1000
    Window        5
    Interval      1m
```

  8. lua filter (custom scripting):

```ini
[FILTER]
    Name    lua
    Match   *
    script  /fluent-bit/scripts/filter.lua
    call    process_record
```

Example Lua script (`/fluent-bit/scripts/filter.lua`):

```lua
function process_record(tag, timestamp, record)
    -- Add custom field
    record["custom_field"] = "custom_value"

    -- Transform existing field
    if record["level"] then
        record["severity"] = string.upper(record["level"])
    end

    -- Filter out specific records (return -1 to drop)
    if record["message"] and string.match(record["message"], "DEBUG") then
        return -1, timestamp, record
    end

    -- Return modified record
    return 1, timestamp, record
end
```

Best practices for FILTER:
  • Order matters: parsers before modifiers
  • Use the `kubernetes` filter in K8s environments for enrichment
  • Parse JSON logs early to enable field-based filtering
  • Add cluster/environment identifiers for multi-cluster setups
  • Use `grep` to reduce data volume early in the pipeline
  • Implement throttling to prevent downstream overload
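The `rewrite_tag` transformation mentioned in Stage 1 re-emits matching records under a new tag so that later filters and outputs can route them separately. A sketch, assuming Kubernetes metadata has already been attached by the kubernetes filter (`payments` is a hypothetical namespace name):

```ini
[FILTER]
    Name          rewrite_tag
    Match         kube.*
    # Rule syntax: $KEY  REGEX  NEW_TAG  KEEP
    # Records from the payments namespace are re-emitted as
    # payments.<original tag>; KEEP false drops the original record
    Rule          $kubernetes['namespace_name'] ^payments$ payments.$TAG false
    Emitter_Name  re_emitted
```

Downstream OUTPUT sections can then use `Match payments.*` to send these records to a dedicated destination.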

Stage 6: OUTPUT Section Configuration


Create OUTPUT sections for destination systems:

Elasticsearch


```ini
[OUTPUT]
    Name              es
    Match             *
    Host              elasticsearch.default.svc
    Port              9200
    # Index pattern with date
    Logstash_Format   On
    Logstash_Prefix   fluent-bit
    Retry_Limit       3
    # Buffer configuration
    storage.total_limit_size 5M
    # TLS configuration
    tls               On
    tls.verify        Off
    # Authentication
    HTTP_User         ${ES_USER}
    HTTP_Passwd       ${ES_PASSWORD}
    # Performance tuning
    Buffer_Size       False
    Type              _doc
```

Grafana Loki


```ini
[OUTPUT]
    Name              loki
    Match             *
    Host              loki.default.svc
    Port              3100
    # Label extraction from metadata
    labels            job=fluent-bit, namespace=$kubernetes['namespace_name'], pod=$kubernetes['pod_name'], container=$kubernetes['container_name']
    label_keys        $stream
    # Remove Kubernetes metadata to reduce payload size
    remove_keys       kubernetes,stream
    # Auto Kubernetes labels
    auto_kubernetes_labels on
    # Line format
    line_format       json
    # Retry configuration
    Retry_Limit       3
```

AWS S3


```ini
[OUTPUT]
    Name              s3
    Match             *
    bucket            my-logs-bucket
    region            us-east-1
    total_file_size   100M
    upload_timeout    10m
    use_put_object    Off
    # Compression
    compression       gzip
    # Path structure with time formatting
    s3_key_format     /fluent-bit-logs/%Y/%m/%d/$TAG[0]/%H-%M-%S-$UUID.gz
    # IAM role authentication (recommended):
    # AWS credentials loaded from environment or IAM role
    Retry_Limit       3
```

Kafka


```ini
[OUTPUT]
    Name              kafka
    Match             *
    Brokers           kafka-broker-1:9092,kafka-broker-2:9092
    Topics            logs
    # Message format
    Format            json
    # Timestamp key
    Timestamp_Key     @timestamp
    # Retry configuration
    Retry_Limit       3
    # Queue configuration
    rdkafka.queue.buffering.max.messages     100000
    rdkafka.request.required.acks            1
```

AWS CloudWatch Logs


```ini
[OUTPUT]
    Name              cloudwatch_logs
    Match             *
    region            us-east-1
    log_group_name    /aws/fluent-bit/logs
    log_stream_prefix from-fluent-bit-
    auto_create_group On
    Retry_Limit       3
```

OpenTelemetry (OTLP)


```ini
[OUTPUT]
    Name                 opentelemetry
    Match                *
    Host                 opentelemetry-collector.observability.svc
    Port                 4318
    # Use HTTP protocol for OTLP
    logs_uri             /v1/logs
    # Add resource attributes
    add_label            cluster my-cluster
    add_label            environment production
    # TLS configuration
    tls                  On
    tls.verify           Off
    # Retry configuration
    Retry_Limit          3
```

Prometheus Remote Write


```ini
[OUTPUT]
    Name              prometheus_remote_write
    Match             *
    Host              prometheus.monitoring.svc
    Port              9090
    Uri               /api/v1/write
    # Add labels to all metrics
    add_label         cluster my-cluster
    add_label         environment production
    # TLS configuration
    tls               On
    tls.verify        Off
    # Retry configuration
    Retry_Limit       3
    # Compression
    compression       snappy
```

HTTP Endpoint


```ini
[OUTPUT]
    Name              http
    Match             *
    Host              logs.example.com
    Port              443
    URI               /api/logs
    Format            json
    # TLS
    tls               On
    tls.verify        On
    # Authentication
    Header            Authorization Bearer ${API_TOKEN}
    # Compression
    Compress          gzip
    # Retry configuration
    Retry_Limit       3
```

stdout (debugging)


```ini
[OUTPUT]
    Name              stdout
    Match             *
    Format            json_lines
```

Key OUTPUT patterns:
  1. Common parameters:
    • `Name`: Output plugin name
    • `Match`: Tag pattern to match (supports wildcards)
    • `Retry_Limit`: Number of retries (`no_limits` or `False` = retry forever, `no_retries` = disable)
    • `storage.total_limit_size`: Disk buffer limit
  2. Buffer and retry configuration:

```ini
# Memory buffering (default)
storage.type      memory

# Filesystem buffering (for high reliability)
storage.type      filesystem
storage.path      /var/log/fluent-bit-buffer/
storage.total_limit_size 10G

# Retry configuration
Retry_Limit       5
```

  3. TLS configuration:

```ini
tls               On
tls.verify        On
tls.ca_file       /path/to/ca.crt
tls.crt_file      /path/to/client.crt
tls.key_file      /path/to/client.key
```

Best practices for OUTPUT:
  • Always set `Retry_Limit` (3-5 for most cases)
  • Use environment variables for credentials: `${ENV_VAR}`
  • Enable TLS for production
  • Set `storage.total_limit_size` to prevent disk exhaustion
  • Use compression when available (gzip)
  • For Kubernetes: use service DNS names
  • Add multiple outputs for redundancy if needed
ini
[OUTPUT]
    Name              stdout
    Match             *
    Format            json_lines
OUTPUT段常见模式
  1. 通用参数
    • Name
      :输出插件名称
    • Match
      :匹配的标签模式(支持通配符)
    • Retry_Limit
      :重试次数(0表示无限重试)
    • storage.total_limit_size
      :磁盘缓冲区限制
  2. 缓冲区与重试配置
    ini
    # 内存缓冲(默认)
    storage.type      memory
    
    # 文件系统缓冲(高可靠性场景)
    storage.type      filesystem
    storage.path      /var/log/fluent-bit-buffer/
    storage.total_limit_size 10G
    
    # 重试配置
    Retry_Limit       5
  3. TLS配置
    ini
    tls               On
    tls.verify        On
    tls.ca_file       /path/to/ca.crt
    tls.crt_file      /path/to/client.crt
    tls.key_file      /path/to/client.key
OUTPUT段最佳实践
  • 始终设置
    Retry_Limit
    (大多数场景设为3-5)
  • 使用环境变量存储凭证:
    ${ENV_VAR}
  • 生产环境启用TLS
  • 设置
    storage.total_limit_size
    防止磁盘耗尽
  • 支持时启用压缩(gzip)
  • Kubernetes环境中使用服务DNS名称
  • 若需要冗余,可配置多个输出端
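
Combining these practices, a hedged sketch of a hardened Elasticsearch output (the host, credentials, tag pattern, and index prefix are placeholder values, not from this skill's templates):
综合以上实践,下面给出一个加固后的Elasticsearch输出示例(主机、凭证、标签模式与索引前缀均为占位值):

```ini
[OUTPUT]
    Name              es
    Match             app.*
    Host              ${ES_HOST}
    Port              9200
    HTTP_User         ${ES_USER}
    HTTP_Passwd       ${ES_PASSWORD}
    Logstash_Format   On
    Logstash_Prefix   app
    tls               On
    tls.verify        On
    Compress          gzip
    Retry_Limit       5
    storage.total_limit_size 5G
```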

Stage 7: PARSER Section Configuration

阶段7:PARSER段配置

IMPORTANT: Always check
examples/parsers.conf
first
before creating custom parsers. The examples directory contains production-ready parser definitions for common use cases.
Step 1: Read the existing parsers file:
重要提示:创建自定义解析器前,务必先查阅
examples/parsers.conf
。示例目录包含了适用于常见场景的生产就绪型解析器定义。
步骤1:读取现有解析器文件

Read the examples/parsers.conf file to see available parsers

读取examples/parsers.conf文件,查看可用的解析器

Read examples/parsers.conf

**Step 2: Reuse existing parsers when possible.** The `examples/parsers.conf` includes:
- `docker` - Docker JSON log format
- `json` - Generic JSON logs
- `cri` - CRI container runtime format
- `syslog-rfc3164` - Syslog RFC 3164
- `syslog-rfc5424` - Syslog RFC 5424
- `nginx` - Nginx access logs
- `apache` - Apache access logs
- `apache_error` - Apache error logs
- `mongodb` - MongoDB logs
- `multiline-java` - Java stack traces
- `multiline-python` - Python tracebacks
- `multiline-go` - Go panic traces
- `multiline-ruby` - Ruby exceptions

**Step 3: Only create custom parsers** when the existing ones don't match your log format.

**Example custom parser definition (only if needed):**

Read examples/parsers.conf

**步骤2:尽可能复用现有解析器**。`examples/parsers.conf`包含以下解析器:
- `docker` - Docker JSON日志格式
- `json` - 通用JSON日志
- `cri` - CRI容器运行时格式
- `syslog-rfc3164` - Syslog RFC 3164
- `syslog-rfc5424` - Syslog RFC 5424
- `nginx` - Nginx访问日志
- `apache` - Apache访问日志
- `apache_error` - Apache错误日志
- `mongodb` - MongoDB日志
- `multiline-java` - Java堆栈跟踪
- `multiline-python` - Python堆栈跟踪
- `multiline-go` - Go panic跟踪
- `multiline-ruby` - Ruby异常

**步骤3:仅在现有解析器无法匹配日志格式时,创建自定义解析器**。

**自定义解析器示例(仅在必要时使用)**:

```ini
# parsers.conf - Add custom parsers alongside existing ones
# parsers.conf - 在现有解析器旁添加自定义解析器
[PARSER]
    Name        custom-app
    Format      regex
    Regex       ^(?<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<message>.*)$
    Time_Key    timestamp
    Time_Format %Y-%m-%d %H:%M:%S
```

**Parser types:**

1. **JSON**: For JSON-formatted logs
2. **Regex**: For custom log formats
3. **LTSV**: For LTSV (Labeled Tab-Separated Values)
4. **Logfmt**: For logfmt format
5. **MULTILINE_PARSER**: For multi-line logs (stack traces)

**Best practices for PARSER:**
- **Reuse `examples/parsers.conf`** - copy and extend rather than recreating from scratch
- Use built-in parsers when possible (docker, cri, json)
- Test regex patterns thoroughly
- Set `Time_Key` and `Time_Format` for proper timestamps
- Use `MULTILINE_PARSER` for stack traces
- Reference parsers file in SERVICE section
```ini
[PARSER]
    Name        custom-app
    Format      regex
    Regex       ^(?<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<message>.*)$
    Time_Key    timestamp
    Time_Format %Y-%m-%d %H:%M:%S
```

**解析器类型**:

1. **JSON**:用于JSON格式日志
2. **Regex**:用于自定义日志格式
3. **LTSV**:用于LTSV(标签分隔值)格式
4. **Logfmt**:用于logfmt格式
5. **MULTILINE_PARSER**:用于多行日志(堆栈跟踪)

**PARSER段最佳实践**:
- **复用`examples/parsers.conf`** - 复制并扩展,而非从零开始创建
- 尽可能使用内置解析器(docker、cri、json)
- 彻底测试正则表达式
- 设置`Time_Key`和`Time_Format`以保证时间戳正确
- 堆栈跟踪使用`MULTILINE_PARSER`
- 在SERVICE段中引用解析器文件
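
For stack traces, a minimal MULTILINE_PARSER sketch (the parser name and start-line regex are assumptions; adjust the regex to your log's timestamp format):
堆栈跟踪场景下的MULTILINE_PARSER最简示例(解析器名称与起始行正则均为假设值,需按实际时间戳格式调整):

```ini
[MULTILINE_PARSER]
    name          multiline-custom
    type          regex
    flush_timeout 1000
    # rules:  state name     pattern                    next state
    rule      "start_state"  "/^\d{4}-\d{2}-\d{2}/"     "cont"
    rule      "cont"         "/^\s+(at|Caused by)\s/"   "cont"
```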

Stage 8: Complete Configuration Structure

阶段8:完整配置结构

A production-ready Fluent Bit configuration follows this structure:
fluent-bit.conf          # Main configuration file
parsers.conf             # Custom parser definitions (optional)
Before writing a new configuration, consult the
examples/
directory
for production-ready templates:
  • Review
    examples/
    files that match your use case
  • Use them as starting points and customize as needed
  • Reference
    examples/parsers.conf
    for parser definitions
Example complete configuration (Kubernetes to Elasticsearch):
生产就绪型Fluent Bit配置遵循以下结构:
fluent-bit.conf          # 主配置文件
parsers.conf             # 自定义解析器定义(可选)
在编写新配置前,务必查阅
examples/
目录
获取生产就绪型模板:
  • 查看与你的用例匹配的
    examples/
    文件
  • 将其作为起点,根据需求进行定制
  • 参考
    examples/parsers.conf
    获取解析器定义
完整配置示例(Kubernetes到Elasticsearch)

fluent-bit.conf

fluent-bit.conf

```ini
[SERVICE]
    Flush             1
    Daemon            Off
    Log_Level         info
    Parsers_File      parsers.conf
    HTTP_Server       On
    HTTP_Listen       0.0.0.0
    HTTP_Port         2020
    storage.metrics   on

[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    Exclude_Path      /var/log/containers/fluent-bit.log
    Parser            docker
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     50MB
    Skip_Long_Lines   On
    Refresh_Interval  10

[FILTER]
    Name              kubernetes
    Match             kube.*
    Kube_URL          https://kubernetes.default.svc:443
    Kube_CA_File      /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File   /var/run/secrets/kubernetes.io/serviceaccount/token
    Kube_Tag_Prefix   kube.var.log.containers.
    Merge_Log         On
    Keep_Log          Off
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On
    Labels            On
    Annotations       Off

[FILTER]
    Name              modify
    Match             *
    Add               cluster_name my-cluster
    Add               environment production

[FILTER]
    Name              nest
    Match             *
    Operation         lift
    Nested_under      kubernetes

[OUTPUT]
    Name              es
    Match             *
    Host              elasticsearch.logging.svc
    Port              9200
    Logstash_Format   On
    Logstash_Prefix   k8s
    Retry_Limit       3
    storage.total_limit_size 5M
    tls               On
    tls.verify        Off  # Internal cluster with self-signed certs / 内部集群自签名证书
```

Stage 9: Best Practices and Optimization

阶段9:最佳实践与优化

Apply these best practices to all generated configurations:
所有生成的配置都应遵循以下最佳实践:

Performance Optimization

性能优化

  1. Buffer management:
    • Set
      Mem_Buf_Limit
      on inputs (default 32MB can cause OOM)
    • Use
      storage.type filesystem
      for high-reliability scenarios
    • Set
      storage.total_limit_size
      to prevent disk exhaustion
    • Recommended: 50-100MB per input, 5-10GB total disk buffer
  2. Flush and batching:
    • Flush 1-5
      : Balance between latency and efficiency
    • Lower flush = lower latency, higher CPU/network
    • Higher flush = better batching, higher memory usage
  3. Worker threads:
    • Default (0) auto-detects CPU cores
    • Increase for high-volume environments
    • Monitor CPU usage before adjusting
  4. Compression:
    • Enable compression for network outputs (gzip)
    • Reduces bandwidth by 70-90%
    • Slight CPU overhead
  1. 缓冲区管理
    • 为所有输入设置
      Mem_Buf_Limit
      (默认32MB可能导致OOM)
    • 高可靠性场景使用
      storage.type filesystem
    • 设置
      storage.total_limit_size
      防止磁盘耗尽
    • 推荐:每个输入50-100MB,总磁盘缓冲区5-10GB
  2. 刷新与批处理
    • Flush 1-5
      :平衡延迟与效率
    • 低刷新值对应低延迟,高CPU/网络占用;高刷新值对应更优的批处理,更高内存占用
  3. 工作线程
    • 默认值(0)自动检测CPU核心数
    • 高流量环境可增加此值
    • 调整前先监控CPU使用率
  4. 压缩
    • 网络输出启用压缩(gzip)
    • 可减少70-90%的带宽
    • 会带来轻微的CPU开销
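
The buffer and flush guidance above can be sketched as follows (paths, hosts, and limits are illustrative, not prescriptive):
上述缓冲与刷新建议可归纳为如下示例(路径、主机与限额仅作参考):

```ini
[SERVICE]
    Flush             5
    storage.path      /var/log/fluent-bit-buffer/
    storage.sync      normal
    storage.metrics   on

[INPUT]
    Name              tail
    Path              /var/log/app/*.log
    Mem_Buf_Limit     50MB
    storage.type      filesystem

[OUTPUT]
    Name              forward
    Match             *
    Host              ${FORWARD_HOST}
    Compress          gzip
    storage.total_limit_size 10G
```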

Reliability

可靠性

  1. Retry logic:
    • Set
      Retry_Limit 3-5
      on all outputs
    • Use filesystem buffering for critical logs
    • Consider multiple outputs for redundancy
  2. Health checks:
    • Enable the built-in HTTP server (
      HTTP_Server On
      , port 2020)
    • Probe
      /api/v1/health
      for liveness/readiness checks
  3. Database files:
    • Use
      DB
      parameter for tail inputs
    • Enables position tracking across restarts
    • Store in persistent volume in Kubernetes
  1. 重试逻辑
    • 所有输出端设置
      Retry_Limit 3-5
    • 关键日志使用文件系统缓冲
    • 可考虑配置多个输出端实现冗余
  2. 健康检查
    • 启用内置HTTP服务器(
      HTTP_Server On
      ,端口2020)
    • 通过
      /api/v1/health
      端点做存活/就绪探测
  3. 数据库文件
    • tail输入使用
      DB
      参数
    • 支持重启后跟踪文件位置
    • Kubernetes环境中存储在持久卷中
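
A restart-safe tail input following the points above might look like this (the DB path assumes a persistent volume is mounted at /var/lib/fluent-bit):
按上述要点配置的、支持重启后续读的tail输入示例(DB路径假设已在/var/lib/fluent-bit挂载持久卷):

```ini
[INPUT]
    Name           tail
    Path           /var/log/containers/*.log
    DB             /var/lib/fluent-bit/tail.db
    DB.sync        normal
    Mem_Buf_Limit  50MB
```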

Security

安全性

  1. TLS/SSL:
    • Always enable TLS in production (
      tls On
      )
    • Default to
      tls.verify On
      for production deployments
    • Use
      tls.verify Off
      ONLY in these scenarios:
      • Internal Kubernetes cluster traffic with self-signed certificates
      • Development/testing environments
      • When proper CA certificates are not available (add comment explaining why)
    • When using
      tls.verify Off
      , always add a comment explaining the reason:
      ini
      tls               On
      tls.verify        Off  # Internal cluster with self-signed certs
    • Use environment variables for credentials
  2. Credentials:
    • Never hardcode passwords
    • Use environment variables:
      ${VAR_NAME}
    • Or Kubernetes secrets mounted as env vars
  3. RBAC (Kubernetes):
    • Grant minimal permissions to ServiceAccount
    • Only needs read access to pods/namespaces
    • No write permissions required
  1. TLS/SSL
    • 生产环境始终启用TLS(
      tls On
      )
    • 生产部署默认设为
      tls.verify On
    • 仅在以下场景使用
      tls.verify Off
      • 内部Kubernetes集群使用自签名证书
      • 开发/测试环境
      • 无法获取有效CA证书(添加注释说明原因)
    • 使用
      tls.verify Off
      时,必须添加注释说明原因:
      ini
      tls               On
      tls.verify        Off  # 内部集群使用自签名证书
    • 使用环境变量存储凭证
  2. 凭证管理
    • 绝不硬编码密码
    • 使用环境变量:
      ${VAR_NAME}
    • 或使用挂载为环境变量的Kubernetes Secrets
  3. RBAC(Kubernetes)
    • 为ServiceAccount授予最小权限
    • 仅需要对Pods/Namespaces的读权限
    • 不需要写权限

Resource Limits

资源限制

  1. Memory:
    • Set per-input limits:
      Mem_Buf_Limit 50MB
    • Kubernetes limits: 200-500MB for typical DaemonSet
    • Monitor actual usage and adjust
  2. CPU:
    • Typically low CPU usage (5-50m per node)
    • Spikes during log bursts
    • Set requests/limits based on workload
  3. Disk:
    • For filesystem buffering only
    • Recommended: 5-10GB per node
    • Monitor with
      storage.metrics on
  1. 内存
    • 为每个输入设置限制:
      Mem_Buf_Limit 50MB
    • Kubernetes DaemonSet典型限制:200-500MB
    • 根据实际使用情况调整
  2. CPU
    • 通常CPU使用率较低(每个节点5-50m)
    • 日志突发时会出现峰值
    • 根据工作负载设置请求/限制
  3. 磁盘
    • 仅用于文件系统缓冲
    • 推荐:每个节点5-10GB
    • 启用
      storage.metrics on
      进行监控

Logging Best Practices

日志最佳实践

  1. Structured logging:
    • Prefer JSON logs in applications
    • Easier parsing and querying
    • Better performance than regex
  2. Log levels:
    • Use appropriate log levels in apps
    • Filter noisy logs with grep filter
    • Reduce volume = lower costs
  3. Avoid log loops:
    • Exclude Fluent Bit's own logs
    • Use
      Exclude_Path
      pattern
    • Tag filtering if needed
  1. 结构化日志
    • 应用优先使用JSON格式日志
    • 更易于解析和查询
    • 性能优于正则表达式解析
  2. 日志级别
    • 应用中使用合适的日志级别
    • 使用grep过滤器过滤冗余日志
    • 减少数据量可降低成本
  3. 避免日志循环
    • 排除Fluent Bit自身日志
    • 使用
      Exclude_Path
      模式
    • 必要时使用标签过滤
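
The noise-reduction and loop-avoidance points above, sketched with assumed tag and field names:
上述降噪与避免日志循环的要点示例(标签与字段名为假设值):

```ini
# Drop DEBUG records (assumes a parsed "level" field)
# 丢弃DEBUG级别记录(假设日志已解析出"level"字段)
[FILTER]
    Name      grep
    Match     app.*
    Exclude   level DEBUG

# Avoid ingesting Fluent Bit's own container logs
# 排除Fluent Bit自身的容器日志,避免日志循环
[INPUT]
    Name          tail
    Path          /var/log/containers/*.log
    Exclude_Path  /var/log/containers/*fluent-bit*.log
```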

Stage 10: Validate Generated Configuration

阶段10:验证生成的配置

ALWAYS validate the generated configuration using the devops-skills:fluentbit-validator skill:
Invoke the devops-skills:fluentbit-validator skill to validate the config:
1. Syntax validation (section format, key-value pairs)
2. Required field checks
3. Plugin parameter validation
4. Tag consistency checks
5. Parser reference validation
6. Security checks (plaintext passwords)
7. Best practice recommendations
8. Dry-run testing (if fluent-bit binary available)

Follow the devops-skills:fluentbit-validator workflow to identify and fix any issues.
Validation checklist:
  • Configuration syntax is correct (INI format)
  • All required parameters are present
  • Plugin names are valid
  • Tags are consistent across sections
  • Parser files and references exist
  • Buffer limits are set
  • Retry limits are configured
  • TLS is enabled for production
  • No hardcoded credentials
  • Memory limits are reasonable
If validation fails, fix issues and re-validate until all checks pass.
必须使用devops-skills:fluentbit-validator技能验证生成的配置
调用devops-skills:fluentbit-validator技能验证配置:
1. 语法验证(段格式、键值对)
2. 必填字段检查
3. 插件参数验证
4. 标签一致性检查
5. 解析器引用验证
6. 安全检查(明文密码)
7. 最佳实践建议
8. 试运行测试(若Fluent Bit二进制可用)

遵循devops-skills:fluentbit-validator工作流,识别并修复所有问题。
验证清单
  • 配置语法正确(INI格式)
  • 所有必填参数已存在
  • 插件名称有效
  • 各段标签一致
  • 解析器文件和引用存在
  • 已设置缓冲区限制
  • 已配置重试限制
  • 生产环境已启用TLS
  • 无硬编码凭证
  • 内存限制合理
若验证失败,修复问题后重新验证,直到所有检查通过。

Error Handling

错误处理

Common Issues and Solutions

常见问题与解决方案

  1. Configuration syntax errors:
    • Check section headers:
      [SECTION]
      format
    • Verify key-value indentation (spaces, not tabs)
    • Check for typos in plugin names
    • Use validator for syntax checking
  2. Memory issues (OOM):
    • Set
      Mem_Buf_Limit
      on all tail inputs
    • Reduce buffer limits if memory constrained
    • Enable filesystem buffering for overflow
    • Check Kubernetes memory limits
  3. Missing logs:
    • Verify file paths exist
    • Check file permissions (read access)
    • Verify tag matching in filters/outputs
    • Check
      DB
      file for position tracking
    • Review
      Exclude_Path
      patterns
  4. Parser failures:
    • Test regex patterns with sample logs
    • Verify parser file is referenced in SERVICE
    • Check Time_Format matches log timestamps
    • Enable debug logging to see parser errors
  5. Kubernetes metadata missing:
    • Verify RBAC permissions (ServiceAccount, ClusterRole)
    • Check Kube_URL is correct (usually https://kubernetes.default.svc:443)
    • Verify Kube_CA_File and Kube_Token_File paths
    • Check Kube_Tag_Prefix matches input tags
  6. Output connection failures:
    • Verify host and port are correct
    • Check network connectivity (DNS resolution)
    • Verify TLS configuration if enabled
    • Check authentication credentials
    • Review retry_limit settings
  7. High CPU usage:
    • Reduce flush frequency
    • Simplify regex parsers
    • Reduce filter complexity
    • Consider worker threads
  8. Disk full (buffering):
    • Set
      storage.total_limit_size
    • Monitor disk usage
    • Clean old buffer files
    • Adjust flush intervals
  1. 配置语法错误
    • 检查段头格式:
      [SECTION]
    • 验证键值对缩进(使用空格,而非制表符)
    • 检查插件名称拼写
    • 使用验证工具检查语法
  2. 内存问题(OOM)
    • 为所有tail输入设置
      Mem_Buf_Limit
    • 内存受限环境降低缓冲区限制
    • 启用文件系统缓冲处理溢出
    • 检查Kubernetes内存限制
  3. 日志缺失
    • 验证文件路径存在
    • 检查文件权限(读权限)
    • 验证过滤器/输出端的标签匹配
    • 检查
      DB
      文件的位置跟踪
    • 查看
      Exclude_Path
      模式
  4. 解析失败
    • 使用示例日志测试正则表达式
    • 验证SERVICE段中已引用解析器文件
    • 检查Time_Format与日志时间戳匹配
    • 启用调试日志查看解析错误
  5. Kubernetes元数据缺失
    • 验证RBAC权限(ServiceAccount、ClusterRole)
    • 检查Kube_URL是否正确(通常为https://kubernetes.default.svc:443)
    • 验证Kube_CA_File和Kube_Token_File路径
    • 检查Kube_Tag_Prefix与输入标签匹配
  6. 输出连接失败
    • 验证主机和端口正确
    • 检查网络连通性(DNS解析)
    • 启用TLS时验证TLS配置
    • 检查认证凭证
    • 查看retry_limit设置
  7. CPU使用率过高
    • 降低刷新频率
    • 简化正则解析器
    • 降低过滤复杂度
    • 考虑调整工作线程数
  8. 磁盘已满(缓冲)
    • 设置
      storage.total_limit_size
    • 监控磁盘使用率
    • 清理旧缓冲文件
    • 调整刷新间隔
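
For several of the issues above, a temporary debugging setup helps isolate the failing stage; remove it once the problem is resolved:
排查上述多类问题时,可临时加入以下调试配置以定位故障环节,问题解决后应移除:

```ini
[SERVICE]
    Log_Level    debug   # surfaces parser and connection errors / 输出解析与连接错误详情

[OUTPUT]
    Name         stdout
    Match        *
    Format       json_lines
```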

Communication Guidelines

沟通指南

When generating configurations:
  1. Explain structure - Describe the configuration sections and their purpose
  2. Document decisions - Explain why certain plugins or settings were chosen
  3. Highlight customization - Point out parameters that should be customized
  4. Provide examples - Show how to use the config with different scenarios
  5. Reference documentation - Link to relevant Fluent Bit docs when helpful
  6. Validate proactively - Always validate generated configs and fix issues
  7. Security reminders - Highlight credential and TLS requirements
  8. Performance notes - Explain buffer limits and flush intervals
生成配置时:
  1. 解释结构 - 描述配置段及其用途
  2. 记录决策 - 说明选择特定插件或设置的原因
  3. 突出可定制项 - 指出需要用户自定义的参数
  4. 提供示例 - 展示配置在不同场景下的使用方法
  5. 引用文档 - 必要时链接到相关Fluent Bit文档
  6. 主动验证 - 始终验证生成的配置并修复问题
  7. 安全提醒 - 强调凭证和TLS要求
  8. 性能说明 - 解释缓冲区限制和刷新间隔

Integration with devops-skills:fluentbit-validator

与devops-skills:fluentbit-validator集成

After generating any Fluent Bit configuration, automatically invoke the devops-skills:fluentbit-validator skill to ensure quality:
Steps:
1. Generate the Fluent Bit configuration
2. Invoke devops-skills:fluentbit-validator skill with the config file
3. Review validation results
4. Fix any issues identified
5. Re-validate until all checks pass
6. Provide summary of generated config and validation status
This ensures all generated configurations follow best practices and are production-ready.
生成任何Fluent Bit配置后,自动调用devops-skills:fluentbit-validator技能以保证配置质量:
步骤:
1. 生成Fluent Bit配置
2. 将配置文件传入devops-skills:fluentbit-validator技能
3. 查看验证结果
4. 修复识别出的问题
5. 重新验证直到所有检查通过
6. 提供生成的配置摘要和验证状态
这可确保所有生成的配置遵循最佳实践,且为生产就绪型。

Resources

资源

scripts/

scripts/

generate_config.py
  • Python script for generating Fluent Bit configurations
  • Template-based approach with common use cases
  • Supports 13 use cases:
    • kubernetes-elasticsearch
      - K8s logs to Elasticsearch
    • kubernetes-loki
      - K8s logs to Loki
    • kubernetes-cloudwatch
      - K8s logs to CloudWatch
    • kubernetes-opentelemetry
      - K8s logs to OpenTelemetry (NEW)
    • application-multiline
      - App logs with multiline parsing
    • syslog-forward
      - Syslog forwarding
    • file-tail-s3
      - File tailing to S3
    • http-kafka
      - HTTP webhook to Kafka
    • multi-destination
      - Multiple output destinations
    • prometheus-metrics
      - Prometheus metrics collection (NEW)
    • lua-filtering
      - Lua script filtering (NEW)
    • stream-processor
      - Stream processor for analytics (NEW)
    • custom
      - Minimal custom template
  • Usage:
    python3 scripts/generate_config.py --use-case kubernetes-elasticsearch --output fluent-bit.conf
generate_config.py
  • 用于生成Fluent Bit配置的Python脚本
  • 基于模板的方法,支持常见用例
  • 支持13种用例:
    • kubernetes-elasticsearch
      - K8s日志到Elasticsearch
    • kubernetes-loki
      - K8s日志到Loki
    • kubernetes-cloudwatch
      - K8s日志到CloudWatch
    • kubernetes-opentelemetry
      - K8s日志到OpenTelemetry(新增)
    • application-multiline
      - 带多行解析的应用日志
    • syslog-forward
      - Syslog转发
    • file-tail-s3
      - 文件尾追到S3
    • http-kafka
      - HTTP Webhook到Kafka
    • multi-destination
      - 多输出目标
    • prometheus-metrics
      - Prometheus指标收集(新增)
    • lua-filtering
      - Lua脚本过滤(新增)
    • stream-processor
      - 流处理分析(新增)
    • custom
      - 极简自定义模板
  • 使用方法:
    python3 scripts/generate_config.py --use-case kubernetes-elasticsearch --output fluent-bit.conf

examples/

examples/

Contains production-ready example configurations:
  • kubernetes-elasticsearch.conf
    - K8s logs to Elasticsearch with metadata enrichment
  • kubernetes-loki.conf
    - K8s logs to Loki with labels
  • kubernetes-opentelemetry.conf
    - K8s logs to OpenTelemetry Collector (OTLP/HTTP)
  • application-multiline.conf
    - App logs with stack trace parsing
  • syslog-forward.conf
    - Syslog collection and forwarding
  • file-tail-s3.conf
    - File tailing to S3 with compression
  • http-input-kafka.conf
    - HTTP webhook to Kafka
  • multi-destination.conf
    - Logs to multiple outputs (Elasticsearch + S3)
  • prometheus-metrics.conf
    - Metrics collection and Prometheus remote_write
  • lua-filtering.conf
    - Custom Lua script filtering and transformation
  • stream-processor.conf
    - SQL-like stream processing for analytics
  • parsers.conf
    - Custom parser examples (JSON, regex, multiline)
  • full-production.conf
    - Complete production setup
  • cloudwatch.conf
    - AWS CloudWatch integration
包含生产就绪型示例配置:
  • kubernetes-elasticsearch.conf
    - K8s日志到Elasticsearch,带元数据增强
  • kubernetes-loki.conf
    - K8s日志到Loki,带标签
  • kubernetes-opentelemetry.conf
    - K8s日志到OpenTelemetry Collector(OTLP/HTTP)
  • application-multiline.conf
    - 带堆栈跟踪解析的应用日志
  • syslog-forward.conf
    - Syslog收集与转发
  • file-tail-s3.conf
    - 文件尾追到S3,带压缩
  • http-input-kafka.conf
    - HTTP Webhook到Kafka
  • multi-destination.conf
    - 日志到多输出端(Elasticsearch + S3)
  • prometheus-metrics.conf
    - 指标收集与Prometheus远程写入
  • lua-filtering.conf
    - 自定义Lua脚本过滤与转换
  • stream-processor.conf
    - 类SQL流处理分析
  • parsers.conf
    - 自定义解析器示例(JSON、正则、多行)
  • full-production.conf
    - 完整生产环境配置
  • cloudwatch.conf
    - AWS CloudWatch集成

Documentation Sources

文档来源

基于以下来源的综合研究: