perses-query-builder

Perses Query Builder

Build and optimize queries for Perses dashboard panels.

Operator Context

This skill constructs, validates, and optimizes queries embedded in Perses panel definitions. It handles PromQL (Prometheus), LogQL (Loki), and TraceQL (Tempo) with correct variable interpolation and datasource binding.

Hardcoded Behaviors (Always Apply)

  • Variable-aware: Always use Perses variable syntax (`$var` or `${var:format}`); never hardcode label values that should come from variables
  • Datasource-scoped: Every query MUST reference its datasource by both the `kind` and `name` fields
  • Interpolation-correct: Use `${var:regex}` for `=~` matchers and `${var:csv}` or `${var:pipe}` for multi-select labels; never use bare `$var` with regex operators
  • Rate interval alignment: Use `$__rate_interval` when the platform provides it; otherwise set rate intervals >= 4x the scrape interval

Default Behaviors (ON unless disabled)

  • PromQL default: Default to PrometheusTimeSeriesQuery if query type not specified
  • Optimization suggestions: Flag recording rule candidates for expensive aggregations over high-cardinality metrics
  • Label matcher validation: Warn when queries lack a narrowing label matcher (e.g., selecting all series for a metric)
  • Multi-value detection: When a variable is marked `allowMultiple`, automatically apply the correct interpolation format
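
The multi-value rule above could be sketched as a small lookup. This is only an illustration: `choose_format` and its arguments are hypothetical names, not part of Perses, and defaulting to `csv` for multi-select equality matchers is an assumption (`pipe` may suit some consumers better).

```python
def choose_format(name: str, operator: str, allow_multiple: bool) -> str:
    """Pick a variable reference for a label matcher (hypothetical helper)."""
    if operator in ("=~", "!~"):
        # Regex matchers always get the regex format, even for single-select,
        # so the query keeps working if the variable becomes multi-select later.
        return f"${{{name}:regex}}"
    if allow_multiple:
        # Assumption: csv is the default for multi-select with equality matchers.
        return f"${{{name}:csv}}"
    # Single-select with an equality matcher: the bare form is enough.
    return f"${name}"
```

For example, `choose_format("pod", "=~", True)` yields `${pod:regex}`, while a single-select `job` variable with `=` stays as `$job`.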

Optional Behaviors (OFF unless enabled)

  • Recording rule generation: Produce `recording_rules.yaml` for identified candidates
  • TraceQL exemplar linking: Add exemplar query alongside PromQL for trace correlation
  • Query explain mode: Annotate each query clause with comments explaining what it selects

What This Skill CAN Do

  • Build PromQL, LogQL, and TraceQL queries for Perses panel specs
  • Apply correct Perses variable interpolation formats (`${var:regex}`, `${var:csv}`, etc.)
  • Validate query syntax and flag common PromQL/LogQL/TraceQL errors
  • Suggest query optimizations (recording rules, label narrowing, rate intervals)
  • Wire queries to the correct datasource kind and name

What This Skill CANNOT Do

  • Create or configure datasources (use perses-datasource-manage)
  • Build full dashboards or panel layouts (use perses-dashboard-create)
  • Deploy Perses server instances (use perses-deploy)
  • Develop custom Perses plugins (use perses-plugin-create)

Error Handling

PromQL Syntax Errors

Symptom: Query fails validation (missing closing bracket, invalid function name, bad label matcher syntax).
Detection: Look for unbalanced `()`, `{}`, `[]`; unknown function names; `=~` with unescaped special chars.
Resolution: Fix the syntax. Common fixes:
  • Add missing closing `}` or `)`
  • Replace `=~` value with a valid RE2 regex (no lookaheads)
  • Use correct function name (e.g., `rate()` not `Rate()`, `histogram_quantile()` not `histogram_percentile()`)
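
The bracket check described under Detection can be approximated with a stack. The sketch below is a hypothetical helper (`find_unbalanced` is not a Perses or Prometheus API) and it is not a PromQL parser: it only skips over string literals and reports brackets that never close or close out of order.

```python
PAIRS = {")": "(", "}": "{", "]": "["}

def find_unbalanced(query: str) -> list[str]:
    """Return bracket problems found in a query; an empty list means balanced."""
    problems, stack, in_string = [], [], None
    i = 0
    while i < len(query):
        ch = query[i]
        if in_string:
            if ch == "\\" and in_string != "`":
                i += 1  # skip the escaped character (backtick strings are raw)
            elif ch == in_string:
                in_string = None
        elif ch in "\"'`":
            in_string = ch
        elif ch in PAIRS.values():
            stack.append((ch, i))
        elif ch in PAIRS:
            if stack and stack[-1][0] == PAIRS[ch]:
                stack.pop()
            else:
                problems.append(f"unexpected '{ch}' at offset {i}")
        i += 1
    problems += [f"unclosed '{ch}' opened at offset {pos}" for ch, pos in stack]
    return problems
```

Brackets inside quoted label values, including the braces of `${var:format}` references, are ignored because they sit inside a string literal.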

Variable Interpolation Format Mismatch

Symptom: Dashboard renders wrong results or query errors when a multi-value variable is selected.
Detection: `$var` or `${var}` used with `=~` matcher; `${var:csv}` used with `=~` (needs regex format).
Resolution:
  • For `=~` matchers: use `${var:regex}` (produces `val1|val2|val3`)
  • For `=` with multi-select: use `${var:csv}` or `${var:pipe}` depending on downstream expectation
  • For JSON API params: use `${var:json}`
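
One way to detect the mismatch is a regular expression over label matchers whose value is a single variable reference. The helper name `find_format_mismatches` and the warning text are made up for this sketch, and matchers that mix literals with variables are deliberately out of scope.

```python
import re

# Matches a label matcher whose value is exactly one variable reference,
# e.g. pod=~"$pod" or pod=~"${pod:csv}".
MATCHER = re.compile(
    r'(?P<label>\w+)\s*(?P<op>=~|!~|!=|=)\s*'
    r'"\$(?:\{(?P<braced>\w+)(?::(?P<fmt>\w+))?\}|(?P<bare>\w+))"'
)

def find_format_mismatches(query: str) -> list[str]:
    """Warn about regex matchers that do not use the regex format."""
    warnings = []
    for m in MATCHER.finditer(query):
        name = m["braced"] or m["bare"]
        if m["op"] in ("=~", "!~") and m["fmt"] != "regex":
            warnings.append(f'{m["label"]}: use ${{{name}:regex}} with {m["op"]}')
    return warnings
```

Both `pod=~"$pod"` and `pod=~"${pod:csv}"` produce a warning, while `pod=~"${pod:regex}"` passes.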

Datasource Kind Mismatch

Symptom: Query silently returns no data or errors at runtime with "unsupported query type".
Detection: Query plugin `kind` does not match datasource `kind` (e.g., `PrometheusTimeSeriesQuery` referencing a `TempoDatasource`).
Resolution: Align the query plugin kind with the datasource kind:
  • `PrometheusTimeSeriesQuery` → `PrometheusDatasource`
  • `TempoTraceQuery` → `TempoDatasource`
  • `LokiLogQuery` → `LokiDatasource`
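
The pairing above lends itself to a table-driven check. The sketch below assumes the query plugin has already been loaded into a plain dict shaped like the YAML in this document; `check_datasource_kind` is a hypothetical helper, and only the three pairs listed here are known to it.

```python
from typing import Optional

# Pairs taken from the list above; any other plugin kind is treated as unknown.
EXPECTED_DATASOURCE = {
    "PrometheusTimeSeriesQuery": "PrometheusDatasource",
    "TempoTraceQuery": "TempoDatasource",
    "LokiLogQuery": "LokiDatasource",
}

def check_datasource_kind(plugin: dict) -> Optional[str]:
    """Return an error message, or None when plugin and datasource kinds align."""
    expected = EXPECTED_DATASOURCE.get(plugin.get("kind"))
    datasource = plugin.get("spec", {}).get("datasource", {})
    if expected is None:
        return f"unknown query plugin kind: {plugin.get('kind')}"
    if "kind" not in datasource or "name" not in datasource:
        return "datasource must set both kind and name"
    if datasource["kind"] != expected:
        return f"{plugin['kind']} expects {expected}, got {datasource['kind']}"
    return None
```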

High-Cardinality Query Warnings

Symptom: Query is slow, times out, or overwhelms Prometheus.
Detection: No label matchers narrowing selection; `rate()` missing or with no interval; aggregation over unbounded label set.
Resolution:
  • Add label matchers to reduce selected series (at minimum `job` or `namespace`)
  • Wrap counters in `rate()` or `increase()` with an appropriate interval
  • Consider a recording rule for expensive `histogram_quantile()` or multi-level aggregations
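
A rough heuristic for the "no narrowing matcher" case is sketched below. It is an assumption-laden illustration, not a PromQL parser: `has_narrowing_matcher` is a hypothetical name, it only looks at label names inside `{...}` selectors, and it does not judge whether the matcher value is actually selective (for example `job=~".*"` would still pass).

```python
import re

VARIABLE = re.compile(r"\$\{[^}]*\}")   # ${var:format} references
SELECTOR = re.compile(r"\{([^{}]*)\}")  # label selector bodies

def has_narrowing_matcher(query: str, labels=("job", "namespace")) -> bool:
    """True if any selector constrains one of the given labels."""
    # Collapse ${...} references first so their braces are not read as selectors.
    flat = VARIABLE.sub("$v", query)
    for body in SELECTOR.findall(flat):
        for matcher in body.split(","):
            label = re.split(r"=~|!~|!=|=", matcher.strip(), maxsplit=1)[0].strip()
            if label in labels:
                return True
    return False
```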

Anti-Patterns

Hardcoding Label Values

Wrong: `http_requests_total{namespace="production"}` in a panel query.
Right: `http_requests_total{namespace="$namespace"}` using a dashboard variable.
Why: Hardcoded values break reusability across environments and defeat the purpose of dashboard variables.

Bare `$var` with Multi-Value or Regex

Wrong: `http_requests_total{pod=~"$pod"}` when `pod` is a multi-select variable.
Right: `http_requests_total{pod=~"${pod:regex}"}`.
Why: Without the `:regex` format, multi-select values are not joined with `|`, so the query matches only the first selected value or produces a syntax error.

Missing Datasource Spec in Query

Wrong: Omitting the `datasource` block or specifying only `name` without `kind`.
Right:
```yaml
datasource:
  kind: PrometheusDatasource
  name: prometheus
```
Why: Perses needs both `kind` and `name` to resolve the datasource. Omitting `kind` causes runtime resolution failures.

Using `rate()` Without Meaningful Interval

Wrong: `rate(http_requests_total{job="api"}[1s])`.
Right: `rate(http_requests_total{job="api"}[$__rate_interval])` or `[5m]` aligned with scrape interval.
Why: `rate()` needs at least two samples inside its window, so intervals shorter than the scrape interval produce empty results; `$__rate_interval` auto-adapts.
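
When `$__rate_interval` is not available, the 4x rule stated under Hardcoded Behaviors can be checked with simple arithmetic. The sketch below uses hypothetical helper names and handles only single-unit durations such as `30s` or `5m`, not compound ones like `1h30m`.

```python
import re

UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_duration(text: str) -> float:
    """Parse a single-unit duration such as '30s' or '5m' into seconds."""
    m = re.fullmatch(r"(\d+)(ms|s|m|h|d)", text)
    if not m:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(m[1]) * UNIT_SECONDS[m[2]]

def window_is_safe(window: str, scrape_interval: str, factor: int = 4) -> bool:
    """True when the rate window is at least `factor` times the scrape interval."""
    return parse_duration(window) >= factor * parse_duration(scrape_interval)
```

With a 30s scrape interval, `window_is_safe("5m", "30s")` holds (300s against a 120s minimum), while `window_is_safe("1s", "15s")` does not.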

Anti-Rationalization

| Rationalization | Reality | Required Action |
| --- | --- | --- |
| "Bare `$var` works fine for single-select" | Variables can be changed to multi-select later, breaking the query | Always use explicit format when combined with `=~` |
| "Datasource kind is obvious from context" | Perses resolves datasources by kind+name pair at runtime | Always specify both `kind` and `name` |
| "This query is simple enough to skip validation" | Simple queries with typos still fail silently | Validate every query against syntax rules |
| "Recording rules are premature optimization" | `histogram_quantile` over thousands of series will time out in production | Flag recording rule candidates for expensive aggregations |

FORBIDDEN Patterns

  • NEVER use `${var:regex}` with `=` (equality) matchers; regex format with `=` causes silent mismatches
  • NEVER omit `kind` from the datasource reference; Perses cannot resolve by name alone
  • NEVER mix query plugin types within a single panel query list (e.g., PromQL and TraceQL in the same `queries[]` array)
  • NEVER use Grafana-style `$__interval` or `${__rate_interval}`; Perses uses `$__rate_interval` (no braces, double underscores)
  • NEVER assume a variable supports multi-select; check the variable definition's `allowMultiple` field
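
Several of these rules are mechanical enough to lint. The following is a minimal sketch under stated assumptions: `lint_query` and `has_mixed_plugins` are hypothetical helpers, the patterns cover only the literal forms named in the list, and the `queries` argument is assumed to be a list of dicts shaped like the panel YAML in this document.

```python
import re

FORBIDDEN = [
    # "=" (but not "!=" or "=~") followed by a regex-formatted variable.
    (re.compile(r'(?<!!)=\s*"\$\{\w+:regex\}"'),
     "regex format used with an equality matcher"),
    (re.compile(r"\$\{__rate_interval\}"),
     "use $__rate_interval without braces"),
    (re.compile(r"\$__interval\b"),
     "Grafana-style $__interval is not allowed here"),
]

def lint_query(query: str) -> list[str]:
    """Return the message of every forbidden pattern found in the query."""
    return [message for pattern, message in FORBIDDEN if pattern.search(query)]

def has_mixed_plugins(queries: list[dict]) -> bool:
    """True when a panel's query list uses more than one plugin kind."""
    kinds = {q["spec"]["plugin"]["kind"] for q in queries}
    return len(kinds) > 1
```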

Blocker Criteria

Do NOT proceed past the BUILD phase if any of these are true:
  1. Datasource unknown: The target datasource name and kind have not been confirmed — query cannot be validated
  2. Variable definitions missing: Query references `$var` but no matching variable exists in the dashboard spec
  3. Query type ambiguous: Cannot determine whether PromQL, LogQL, or TraceQL is needed from user request
  4. Metric name unverified: The metric name referenced does not exist in the target Prometheus/Loki/Tempo instance and the user has not confirmed it
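
Blocker 2 can be checked by comparing the variables a query references with the names defined in the dashboard. This sketch uses a hypothetical helper, `undefined_variables`, and assumes that names starting with `__` (such as `$__rate_interval`) are platform built-ins that need no definition.

```python
import re

# Captures either ${name} / ${name:format} or the bare $name form.
VAR_REF = re.compile(r"\$(?:\{(\w+)(?::\w+)?\}|(\w+))")

def undefined_variables(query: str, defined: set[str]) -> set[str]:
    """Return referenced variable names that the dashboard does not define."""
    referenced = {braced or bare for braced, bare in VAR_REF.findall(query)}
    # Assumption: names starting with "__" are built-ins supplied by the platform.
    return {name for name in referenced if not name.startswith("__")} - defined
```

A non-empty result means the query should not move forward until the missing variables are defined or the references are corrected.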

Instructions

Phase 1: IDENTIFY

Goal: Determine query type, datasource, and variable context.
  1. Query type: Identify which query language is needed:
    • PrometheusTimeSeriesQuery (PromQL) — metrics, counters, histograms
    • TempoTraceQuery (TraceQL) — distributed traces
    • LokiLogQuery (LogQL) — log streams
  2. Datasource: Confirm the datasource `name` and `kind` from the dashboard or project context
  3. Variables: Identify which dashboard variables the query should reference and their `allowMultiple` setting
Gate: Query type, datasource, and variable context confirmed. Proceed to Phase 2.

Phase 2: BUILD

Goal: Construct the query with proper variable templating and datasource binding.
```yaml
queries:
  - kind: TimeSeriesQuery
    spec:
      plugin:
        kind: PrometheusTimeSeriesQuery
        spec:
          query: "rate(http_requests_total{job=\"$job\", instance=~\"${instance:regex}\"}[$__rate_interval])"
          datasource:
            kind: PrometheusDatasource
            name: prometheus
```
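
If queries are assembled programmatically before being serialized to YAML, the same structure can be produced from a small builder. This is a sketch only: `build_prometheus_query` is a hypothetical helper that mirrors the fields shown in the example above and is not a Perses SDK function.

```python
def build_prometheus_query(expr: str, datasource_name: str) -> dict:
    """Build one entry of a panel's `queries` list for a PromQL expression."""
    if not expr.strip() or not datasource_name:
        raise ValueError("both a query expression and a datasource name are required")
    return {
        "kind": "TimeSeriesQuery",
        "spec": {
            "plugin": {
                "kind": "PrometheusTimeSeriesQuery",
                "spec": {
                    "query": expr,
                    # Always emit kind and name together so the reference resolves.
                    "datasource": {
                        "kind": "PrometheusDatasource",
                        "name": datasource_name,
                    },
                },
            }
        },
    }
```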
Variable interpolation reference:

| Format | Output | Use With |
| --- | --- | --- |
| `${var:regex}` | `val1\|val2\|val3` | `=~` matchers |
| `${var:csv}` | `val1,val2,val3` | API params, `in()` |
| `${var:pipe}` | `val1\|val2\|val3` | Custom pipe-delimited contexts |
| `${var:json}` | `["val1","val2"]` | JSON payloads |
| `${var:doublequote}` | `"val1","val2"` | Quoted lists |
| `${var:singlequote}` | `'val1','val2'` | Quoted lists |
| `${var:glob}` | `{val1,val2}` | Glob patterns |
| `${var:lucene}` | `("val1" OR "val2")` | Lucene queries |
| `${var:raw}` | `val1` (first only) | Single-value forced |
| `${var:values}` | `val1+val2` | URL-encoded params |
| `${var:singlevariablevalue}` | `val1` | Force single value |

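
To make the reference concrete, the outputs listed for each format can be reproduced with a few string joins. This is an illustration of the listed outputs only, not Perses's implementation: `interpolate` is a hypothetical function, and it does not escape special characters, which a real regex or JSON formatter would need to handle.

```python
import json

def interpolate(values: list[str], fmt: str) -> str:
    """Render selected values the way the reference above describes."""
    renderers = {
        "regex": lambda v: "|".join(v),
        "csv": lambda v: ",".join(v),
        "pipe": lambda v: "|".join(v),
        "json": lambda v: json.dumps(v, separators=(",", ":")),
        "doublequote": lambda v: ",".join(f'"{x}"' for x in v),
        "singlequote": lambda v: ",".join(f"'{x}'" for x in v),
        "glob": lambda v: "{" + ",".join(v) + "}",
        "lucene": lambda v: "(" + " OR ".join(f'"{x}"' for x in v) + ")",
        "values": lambda v: "+".join(v),
        "raw": lambda v: v[0],                  # first value only
        "singlevariablevalue": lambda v: v[0],  # first value only
    }
    return renderers[fmt](values)
```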
Gate: Query built with correct interpolation and datasource. Proceed to Phase 3.

Phase 3: OPTIMIZE

Goal: Review the query for performance and correctness.
  1. Label narrowing: Ensure at least one selective label matcher is present (e.g., `job`, `namespace`)
  2. Rate intervals: Confirm `rate()` / `increase()` intervals align with scrape interval or use `$__rate_interval`
  3. Recording rule candidates: Flag `histogram_quantile()` over high-cardinality metrics, multi-level `sum(rate(...))` aggregations, or any query aggregating over > 1000 estimated series
  4. Variable format audit: Verify every `$var` reference uses the correct interpolation format for its operator context
  5. Datasource alignment: Confirm query plugin kind matches datasource kind
Gate: Query optimized and validated. Task complete.
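
The recording rule criteria in step 3 might be approximated as follows. The helper name, the reason strings, and the way nesting is detected (counting aggregation operators) are all assumptions made for this sketch; it flags every `histogram_quantile()` call and leaves the cardinality judgement to the reviewer, and the series estimate has to come from elsewhere.

```python
import re

AGGREGATION = re.compile(r"\b(?:sum|avg|min|max|count)\b")

def recording_rule_reasons(query: str, estimated_series: int,
                           threshold: int = 1000) -> list[str]:
    """List reasons a query may deserve a recording rule (empty if none)."""
    reasons = []
    if estimated_series > threshold:
        reasons.append(f"aggregates over about {estimated_series} series")
    if "histogram_quantile(" in query:
        reasons.append("uses histogram_quantile()")
    # Heuristic: two or more aggregation operators suggests multi-level aggregation.
    if len(AGGREGATION.findall(query)) >= 2:
        reasons.append("nests multiple aggregations")
    return reasons
```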


References
