# Loki Configuration Generator

## Overview
Generate production-ready Grafana Loki server configurations with best practices. Supports monolithic, simple scalable, and microservices deployment modes with S3, GCS, Azure, or filesystem storage.

**Current Stable:** Loki 3.6.2 (November 2025). **Important:** Promtail is deprecated as of 3.4; use Grafana Alloy instead. See `examples/grafana-alloy.yaml` for log collection configuration.
## When to Use

Invoke when: deploying Loki, creating configs from scratch, migrating to Loki, implementing multi-tenant logging, configuring storage backends, or optimizing existing deployments.
## Generation Methods

### Method 1: Script Generation (Recommended)

Use `scripts/generate_config.py` for consistent, validated configurations:

```bash
# Simple Scalable with S3 (production)
python3 scripts/generate_config.py \
  --mode simple-scalable \
  --storage s3 \
  --bucket my-loki-bucket \
  --region us-east-1 \
  --retention-days 30 \
  --otlp-enabled \
  --output loki-config.yaml
```
```bash
# Monolithic with filesystem (development)
python3 scripts/generate_config.py \
  --mode monolithic \
  --storage filesystem \
  --no-auth-enabled \
  --output loki-dev.yaml
```
```bash
# Production with Thanos storage (Loki 3.4+)
python3 scripts/generate_config.py \
  --mode simple-scalable \
  --storage s3 \
  --thanos-storage \
  --otlp-enabled \
  --time-sharding \
  --output loki-thanos.yaml
```
**Script Options:**

| Option | Description |
|--------|-------------|
| `--mode` | monolithic, simple-scalable, microservices |
| `--storage` | filesystem, s3, gcs, azure |
| `--auth-enabled` / `--no-auth-enabled` | Explicitly enable/disable auth |
| `--otlp-enabled` | Enable OTLP ingestion configuration |
| `--thanos-storage` | Use Thanos object storage client (3.4+, cloud backends) |
| `--time-sharding` | Enable out-of-order ingestion (simple-scalable) |
| `--ruler` | Enable alerting/recording rules (not monolithic) |
| `--horizontal-compactor` | main/worker mode (simple-scalable, 3.6+) |
| `--zone-awareness` | Enable multi-AZ placement safeguards |
| `--limits-dry-run` | Log limit rejections without enforcing |
### Method 2: Manual Configuration

Follow the staged workflow below when script generation doesn't meet specific requirements or when learning the configuration structure.
## Output Formats

For Kubernetes deployments, generate BOTH formats:

- Native Loki config (`loki-config.yaml`) - for ConfigMap or direct use
- Helm values (`values.yaml`) - for Helm chart deployments

See `examples/kubernetes-helm-values.yaml` for the Helm format.

## Documentation Lookup
### When to Use Context7/Web Search

**REQUIRED** - Use Context7 MCP for:

- Configuring features from Loki 3.4+ (Thanos storage, time sharding)
- Configuring features from Loki 3.6+ (horizontal compactor, enforced labels)
- Bloom filter configuration (complex, experimental)
- Custom OTLP attribute mappings beyond standard patterns
- Troubleshooting configuration errors

**OPTIONAL** - Skip documentation lookup for:

- Standard deployment modes (monolithic, simple-scalable)
- Basic storage configuration (S3, GCS, Azure, filesystem)
- Default limits and component settings
- Configurations covered in the `references/` directory
### Context7 MCP (preferred)

```
resolve-library-id: "grafana loki"
get-library-docs: /websites/grafana_loki, topic: [component]
```

Example topics: `storage_config`, `limits_config`, `otlp`, `compactor`, `ruler`, `bloom`

### Web Search Fallback

Use when Context7 unavailable:

```
"Grafana Loki 3.6 [component] configuration documentation site:grafana.com"
```
## Configuration Workflow
### Stage 1: Gather Requirements

**Deployment Mode:**

| Mode | Scale | Use Case |
|---|---|---|
| Monolithic | <100GB/day | Testing, development |
| Simple Scalable | 100GB-1TB/day | Production |
| Microservices | >1TB/day | Large-scale, multi-tenant |

**Storage Backend:** S3, GCS, Azure Blob, Filesystem, MinIO

**Key Questions:** Expected log volume? Retention period? Multi-tenancy needed? High availability requirements? Kubernetes deployment?

Ask the user directly if required information is missing.
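The mode table above can be expressed as a small helper. This is an illustrative sketch only; the `recommend_mode` function is not part of the shipped script:

```python
# Illustrative helper (not part of scripts/generate_config.py): map the
# expected daily log volume from Stage 1 to the deployment-mode table above.
def recommend_mode(daily_gb: float) -> str:
    """Return the deployment mode suggested for a given daily volume in GB."""
    if daily_gb < 100:
        return "monolithic"        # testing, development
    if daily_gb <= 1000:
        return "simple-scalable"   # production
    return "microservices"         # large-scale, multi-tenant

print(recommend_mode(50))    # monolithic
print(recommend_mode(500))   # simple-scalable
print(recommend_mode(2000))  # microservices
```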
### Stage 2: Schema Configuration (CRITICAL)

For all new deployments (Loki 2.9+), use TSDB with the v13 schema:

```yaml
schema_config:
  configs:
    - from: "2025-01-01"   # Use deployment date
      store: tsdb
      object_store: s3     # s3, gcs, azure, filesystem
      schema: v13
      index:
        prefix: loki_index_
        period: 24h
```

**Key:** The schema cannot change after deployment without migration.
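When adding a new entry to an existing `schema_config` (for example, when migrating an older deployment to TSDB/v13), Loki's guidance is that the new period's `from` date lie in the future so all components cut over at the same moment. A minimal sketch of that pre-flight check, with a hypothetical helper name:

```python
# Hypothetical pre-flight check before appending a new schema_config entry.
# A newly added period's "from" date should be in the future, so ingesters
# and queriers all switch schemas at the same instant.
from datetime import date, timedelta

def valid_new_schema_from(from_date: date, today: date) -> bool:
    """A new schema entry should start no earlier than tomorrow."""
    return from_date >= today + timedelta(days=1)

today = date(2025, 1, 1)
print(valid_new_schema_from(date(2025, 1, 2), today))   # True: starts tomorrow
print(valid_new_schema_from(date(2025, 1, 1), today))   # False: too soon
```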
### Stage 3: Storage Configuration

**S3:**

```yaml
common:
  storage:
    s3:
      s3: s3://us-east-1/loki-bucket
      s3forcepathstyle: false
```

**GCS:** `gcs: { bucket_name: loki-bucket }`

**Azure:** `azure: { container_name: loki-container, account_name: ${AZURE_ACCOUNT_NAME} }`

**Filesystem:** `filesystem: { chunks_directory: /loki/chunks, rules_directory: /loki/rules }`
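In the spirit of `scripts/generate_config.py`, the backend choice above reduces to emitting a different `common.storage` block per backend. This mini-renderer is a hypothetical sketch, not the script's actual implementation:

```python
# Hypothetical mini-renderer illustrating the per-backend storage blocks above.
def render_storage(backend: str, **opts) -> str:
    """Emit a common.storage YAML snippet for the chosen backend (sketch)."""
    if backend == "s3":
        body = (f"    s3:\n"
                f"      s3: s3://{opts['region']}/{opts['bucket']}\n"
                f"      s3forcepathstyle: false")
    elif backend == "gcs":
        body = f"    gcs:\n      bucket_name: {opts['bucket']}"
    elif backend == "filesystem":
        body = ("    filesystem:\n"
                "      chunks_directory: /loki/chunks\n"
                "      rules_directory: /loki/rules")
    else:
        raise ValueError(f"unsupported backend: {backend}")
    return "common:\n  storage:\n" + body

print(render_storage("s3", region="us-east-1", bucket="loki-bucket"))
```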
### Stage 4: Component Configuration
**Ingester:**

```yaml
ingester:
  chunk_encoding: snappy
  chunk_idle_period: 30m
  max_chunk_age: 2h
  chunk_target_size: 1572864   # 1.5MB
  lifecycler:
    ring:
      replication_factor: 3    # 3 for production
```

**Querier:**

```yaml
querier:
  max_concurrent: 4
  query_timeout: 1m
```

**Compactor:**

```yaml
compactor:
  working_directory: /loki/compactor
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
```
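A quick sanity check on the ingester numbers above. This is back-of-envelope arithmetic, not a sizing formula; real memory use depends on compression, flush timing, and replication:

```python
# Sanity arithmetic for the ingester settings above (illustrative only).
# chunk_target_size is in bytes; 1572864 is exactly 1.5 MiB.
target = int(1.5 * 1024 * 1024)
print(target)  # 1572864

# Rough worst case: every active stream holding one full target-size chunk.
# 10,000 streams mirrors the max_streams_per_user default used later.
streams = 10_000
print(f"~{streams * target / 1024**3:.1f} GiB buffered at {streams} full chunks")
```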
### Stage 5: Limits Configuration
```yaml
limits_config:
  ingestion_rate_mb: 10
  ingestion_burst_size_mb: 20
  max_streams_per_user: 10000
  max_entries_limit_per_query: 5000
  max_query_length: 721h
  retention_period: 30d
  allow_structured_metadata: true
  volume_enabled: true
```
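To relate `ingestion_rate_mb` back to the Stage 1 volume question: a sustained per-tenant rate limit translates directly into daily capacity. The arithmetic below assumes decimal units (1000 MB = 1 GB) for a round figure:

```python
# Daily capacity implied by the per-tenant rate limit above (illustrative).
ingestion_rate_mb = 10          # MB/s sustained, from limits_config
seconds_per_day = 86_400
daily_gb = ingestion_rate_mb * seconds_per_day / 1000   # decimal GB
print(f"{daily_gb:.0f} GB/day per tenant")              # 864 GB/day per tenant
```

If the expected volume from Stage 1 exceeds this figure, raise `ingestion_rate_mb` (and the burst size) rather than letting pushes be rate-limited.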
### Stage 6: Server & Auth
```yaml
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  log_level: info
auth_enabled: true   # false for single-tenant
```
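With `auth_enabled: true`, every push and query must identify its tenant via the `X-Scope-OrgID` header. A standard-library sketch that builds (but does not send) a push request; the endpoint and tenant name are examples:

```python
# When auth_enabled: true, Loki requires an X-Scope-OrgID header on every
# request. Sketch only: the request is constructed, not sent.
import json
import urllib.request

payload = {"streams": [{"stream": {"service": "demo"},
                        "values": [["1735689600000000000", "hello"]]}]}
req = urllib.request.Request(
    "http://loki:3100/loki/api/v1/push",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json",
             "X-Scope-OrgID": "tenant-a"},   # required when auth is on
    method="POST",
)
print(req.get_header("X-scope-orgid"))  # tenant-a
```

Requests without the header are rejected with "no org id" when auth is enabled.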
### Stage 7: OTLP Ingestion (Loki 3.0+)
Native OpenTelemetry ingestion - use the `otlphttp` exporter (NOT the deprecated `lokiexporter`):

```yaml
limits_config:
  allow_structured_metadata: true
  otlp_config:
    resource_attributes:
      attributes_config:
        - action: index_label          # Low-cardinality only!
          attributes: [service.name, service.namespace, deployment.environment]
        - action: structured_metadata  # High-cardinality
          attributes: [k8s.pod.name, service.instance.id]
```

Actions: `index_label` (searchable, low-cardinality), `structured_metadata` (queryable), `drop`.

⚠️ NEVER use `k8s.pod.name` as an index_label - use structured_metadata instead.

**OTel Collector:**

```yaml
exporters:
  otlphttp:
    endpoint: http://loki:3100/otlp
```
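The routing rule above (index labels only for known low-cardinality resource attributes, everything else to structured metadata) can be stated as a one-line policy. The allow-list mirrors the YAML above; the helper itself is illustrative, not a Loki default:

```python
# Illustrative policy check for the OTLP mapping above: only a known
# low-cardinality allow-list becomes index labels.
LOW_CARDINALITY = {"service.name", "service.namespace", "deployment.environment"}

def otlp_action(attribute: str) -> str:
    """index_label for allow-listed attributes, structured_metadata otherwise."""
    return "index_label" if attribute in LOW_CARDINALITY else "structured_metadata"

print(otlp_action("service.name"))   # index_label
print(otlp_action("k8s.pod.name"))   # structured_metadata (never a label)
```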
### Stage 8: Caching
```yaml
chunk_store_config:
  chunk_cache_config:
    memcached_client:
      host: memcached-chunks
      timeout: 500ms
query_range:
  cache_results: true
  results_cache:
    cache:
      memcached_client:
        host: memcached-results
```
### Stage 9: Advanced Features
**Pattern Ingester (3.0+):**

```yaml
pattern_ingester:
  enabled: true
```

**Bloom Filters (Experimental, 3.3+):** Only for >75TB/month deployments. Works on structured metadata only. See `examples/` for config.

**Time Sharding (3.4+):** For out-of-order ingestion:

```yaml
limits_config:
  shard_streams:
    time_sharding_enabled: true
```

**Thanos Storage (3.4+):** New storage client; opt-in now, default later:

```yaml
storage_config:
  use_thanos_objstore: true
  object_store:
    s3:
      bucket_name: my-bucket
      endpoint: s3.us-west-2.amazonaws.com
```
### Stage 10: Ruler (Alerting)
```yaml
ruler:
  storage:
    type: s3
    s3: { bucket_name: loki-ruler }
  alertmanager_url: http://alertmanager:9093
  enable_api: true
  enable_sharding: true
```
### Stage 11: Loki 3.6 Features
- **Horizontally Scalable Compactor:** `horizontal_scaling_mode: main|worker`
- **Policy-Based Enforced Labels:** `enforced_labels: [service.name]`
- **FluentBit v4:** `structured_metadata` parameter support
### Stage 12: Validate Configuration (REQUIRED)

Always validate before deployment:

```bash
# Syntax and parameter validation
loki -config.file=loki-config.yaml -verify-config

# Print resolved configuration (shows defaults)
loki -config.file=loki-config.yaml -print-config-stderr 2>&1 | head -100

# Dry-run with Docker (if Loki not installed locally)
docker run --rm -v $(pwd)/loki-config.yaml:/etc/loki/config.yaml \
  grafana/loki:3.6.2 -config.file=/etc/loki/config.yaml -verify-config
```

**Validation Checklist:**

- [ ] No syntax errors from `-verify-config`
- [ ] Schema uses `tsdb` and `v13`
- [ ] `replication_factor: 3` for production
- [ ] `auth_enabled: true` if multi-tenant
- [ ] Storage credentials/IAM configured
- [ ] Retention period matches requirements
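The checklist lends itself to automation once the YAML is parsed into a dict. A minimal lint sketch (the `lint` helper is hypothetical; `loki -verify-config` remains the authoritative check):

```python
# Hypothetical pre-deployment lint over an already-parsed config dict,
# covering a few of the checklist items above. Pair with -verify-config.
def lint(config: dict, production: bool = True) -> list:
    problems = []
    schema = config.get("schema_config", {}).get("configs", [{}])[-1]
    if schema.get("store") != "tsdb" or schema.get("schema") != "v13":
        problems.append("schema should use tsdb + v13")
    rf = (config.get("common", {}).get("replication_factor")
          or config.get("ingester", {}).get("lifecycler", {})
                   .get("ring", {}).get("replication_factor"))
    if production and rf != 3:
        problems.append("replication_factor should be 3 in production")
    if not config.get("limits_config", {}).get("retention_period"):
        problems.append("retention_period not set")
    return problems

cfg = {"schema_config": {"configs": [{"store": "tsdb", "schema": "v13"}]},
       "common": {"replication_factor": 1},
       "limits_config": {"retention_period": "30d"}}
print(lint(cfg))  # ['replication_factor should be 3 in production']
```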
---

## Production Checklist

### High Availability Requirements
**Zone-Aware Replication** (CRITICAL for production multi-AZ deployments):

When using `replication_factor: 3`, ALWAYS enable zone-awareness for multi-AZ deployments:

```yaml
ingester:
  lifecycler:
    ring:
      replication_factor: 3
      zone_awareness_enabled: true   # CRITICAL for multi-AZ

# Set zone via environment variable or config.
# Each pod should set its zone based on node topology.
common:
  instance_availability_zone: ${AVAILABILITY_ZONE}
```

**Why:** Without zone-awareness, all 3 replicas may land in the same AZ. If that AZ fails, you lose data.

**Kubernetes Implementation:**

```yaml
# In Helm values or pod spec
env:
  - name: AVAILABILITY_ZONE
    valueFrom:
      fieldRef:
        fieldPath: metadata.labels['topology.kubernetes.io/zone']
```
### TLS Configuration (Production Required)

Enable TLS for all inter-component and client communication:

```yaml
server:
  http_tls_config:
    cert_file: /etc/loki/tls/tls.crt
    key_file: /etc/loki/tls/tls.key
    client_ca_file: /etc/loki/tls/ca.crt   # For mTLS
  grpc_tls_config:
    cert_file: /etc/loki/tls/tls.crt
    key_file: /etc/loki/tls/tls.key
    client_ca_file: /etc/loki/tls/ca.crt
```

See `examples/production-tls.yaml` for complete TLS configuration.

### Production Checklist Summary
| Requirement | Setting | Required For |
|---|---|---|
| `replication_factor: 3` | common block | All production |
| `zone_awareness_enabled: true` | ingester.lifecycler.ring | Multi-AZ |
| `auth_enabled: true` | root level | Multi-tenant |
| TLS enabled | server block | All production |
| IAM roles (not keys) | storage config | Cloud storage |
| Caching enabled | chunk_store_config, query_range | Performance |
| Pattern ingester | pattern_ingester.enabled | Observability |
| Retention configured | compactor + limits_config | Cost control |
## Monitoring Recommendations

### Key Metrics to Monitor
Configure Prometheus to scrape Loki metrics and alert on these critical indicators:

```yaml
# Prometheus scrape config
- job_name: 'loki'
  static_configs:
    - targets: ['loki:3100']
```
### Critical Alerts

```yaml
groups:
  - name: loki-critical
    rules:
      # Ingestion failures
      - alert: LokiIngestionFailures
        expr: sum(rate(loki_distributor_ingester_append_failures_total[5m])) > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Loki ingestion failures detected"

      # High stream cardinality (performance killer)
      - alert: LokiHighStreamCardinality
        expr: loki_ingester_memory_streams > 100000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High stream cardinality - review labels"

      # Compaction not running (retention broken)
      - alert: LokiCompactionStalled
        expr: time() - loki_compactor_last_successful_run_timestamp_seconds > 7200
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Loki compaction stalled - retention not enforced"

      # Query latency
      - alert: LokiSlowQueries
        expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket{route=~"loki_api_v1_query.*"}[5m])) by (le)) > 30
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Loki query P99 latency > 30s"

      # Ingester memory pressure
      - alert: LokiIngesterMemoryHigh
        expr: container_memory_usage_bytes{container="ingester"} / container_spec_memory_limit_bytes{container="ingester"} > 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Loki ingester memory usage > 80%"
```

### Key Metrics Reference
| Metric | Description | Action Threshold |
|---|---|---|
| `loki_ingester_memory_streams` | Active streams in memory | >100k: review cardinality |
| `loki_distributor_ingester_append_failures_total` | Ingestion failures | >0: investigate immediately |
| `loki_request_duration_seconds` | Query latency | P99 >30s: add caching/queriers |
| `loki_ingester_chunks_flushed_total` | Chunk flush rate | Low rate: check ingester health |
| `loki_compactor_last_successful_run_timestamp_seconds` | Last compaction | >2h ago: compaction broken |
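The `LokiCompactionStalled` rule above reduces to a single age comparison, which is worth internalizing since a stalled compactor silently disables retention:

```python
# The LokiCompactionStalled rule, restated as plain arithmetic (illustrative):
# alert when the last successful compactor run is more than 7200 s (2 h) old.
def compaction_stalled(now_s: float, last_success_s: float,
                       max_age_s: int = 7200) -> bool:
    return now_s - last_success_s > max_age_s

print(compaction_stalled(now_s=10_000, last_success_s=5_000))  # False: 5000 s old
print(compaction_stalled(now_s=10_000, last_success_s=2_000))  # True: 8000 s old
```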
### Grafana Dashboard

Import official Loki dashboards:

- Dashboard ID `13407` - Loki Logs
- Dashboard ID `14055` - Loki Operational
## Log Collection with Grafana Alloy

Promtail is deprecated (support ends Feb 2026). Use Grafana Alloy for new deployments.

### Basic Alloy Configuration

See `examples/grafana-alloy.yaml` for the complete configuration.

```alloy
// Kubernetes log discovery
discovery.kubernetes "pods" {
  role = "pod"
}

// Relabeling for Kubernetes metadata
discovery.relabel "pods" {
  targets = discovery.kubernetes.pods.targets
  rule {
    source_labels = ["__meta_kubernetes_namespace"]
    target_label  = "namespace"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_name"]
    target_label  = "pod"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_container_name"]
    target_label  = "container"
  }
}

// Log collection
loki.source.kubernetes "pods" {
  targets    = discovery.relabel.pods.output
  forward_to = [loki.write.default.receiver]
}

// Send to Loki
loki.write "default" {
  endpoint {
    url = "http://loki-gateway.loki.svc.cluster.local/loki/api/v1/push"
    // For multi-tenant
    tenant_id = "default"
  }
}
```

### Migration from Promtail
```bash
# Convert Promtail config to Alloy
alloy convert --source-format=promtail --output=alloy-config.alloy promtail.yaml
```

---
## Complete Examples

See the `examples/` directory for full configurations:

- `monolithic-filesystem.yaml` - Development/testing
- `simple-scalable-s3.yaml` - Production with S3
- `microservices-s3.yaml` - Large-scale distributed
- `multi-tenant.yaml` - Multi-tenant with per-tenant limits
- `production-tls.yaml` - TLS-enabled production config
- `grafana-alloy.yaml` - Log collection with Alloy
- `kubernetes-helm-values.yaml` - Helm chart values

**Minimal Monolithic:**
```yaml
auth_enabled: false
server:
  http_listen_port: 3100
common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2025-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: loki_index_
        period: 24h
limits_config:
  retention_period: 30d
  allow_structured_metadata: true
compactor:
  working_directory: /loki/compactor
  retention_enabled: true
```
## Helm Deployment
```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki -f values.yaml
```

Generate both the native config and Helm values for Kubernetes deployments.

**values.yaml:**
```yaml
deploymentMode: SimpleScalable
loki:
  schemaConfig:
    configs:
      - from: "2025-01-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  limits_config:
    retention_period: 30d
    allow_structured_metadata: true
  # Zone awareness for HA
  ingester:
    lifecycler:
      ring:
        zone_awareness_enabled: true
backend:
  replicas: 3
  # Spread across zones
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
read:
  replicas: 3
write:
  replicas: 3
```

---
## Best Practices
**Performance:**

- `chunk_encoding: snappy`, `chunk_target_size: 1572864`
- Enable caching (chunks, results)
- `parallelise_shardable_queries: true`

**Security:**

- `auth_enabled: true` with reverse proxy auth
- IAM roles for cloud storage (never hardcode keys)
- TLS for all communications (see Production Checklist)

**Reliability:**

- `replication_factor: 3` for production
- `zone_awareness_enabled: true` for multi-AZ (see Production Checklist)
- Persistent volumes for ingesters
- Monitor ingestion rate and query latency (see Monitoring section)

**Limits:** Set `ingestion_rate_mb` and `max_streams_per_user` to prevent overload.
## Common Issues
| Issue | Solution |
|---|---|
| High ingester memory | Reduce `max_chunk_age` / `chunk_idle_period` |
| Slow queries | Increase `querier.max_concurrent`, enable caching |
| Ingestion failures | Check `ingestion_rate_mb` limits |
| Storage growing fast | Enable retention, check compression, review cardinality |
| Data loss in AZ failure | Enable `zone_awareness_enabled` |
| Config validation fails | Run `loki -config.file=... -verify-config` |
## Deprecated (Migrate Away)

- `boltdb-shipper` → `tsdb`
- `lokiexporter` → `otlphttp`
- Promtail → Grafana Alloy (support ends Feb 2026)
## Resources

- `scripts/generate_config.py` - Generate configs programmatically (RECOMMENDED)
- `examples/` - Complete configuration examples for all modes
- `references/` - Full parameter reference and best practices
## Related Skills

- `logql-generator` - LogQL query generation
- `fluentbit-generator` - Log collection to Loki