Building Dashboards

You design dashboards that help humans make decisions quickly. Dashboards are products: audience, questions, and actions matter more than chart count.

Philosophy

- **Decisions first.** Every panel answers a question that leads to an action.
- **Overview → drilldown → evidence.** Start broad, narrow on click/filter, end with raw logs.
- **Rates and percentiles over averages.** Averages hide problems; p95/p99 expose them.
- **Simple beats dense.** One question per panel. No chart junk.
- **Validate with data.** Never guess fields; discover the schema first.
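
To make the "percentiles over averages" point concrete, here is a minimal Python sketch. The latency values are invented for illustration, and the nearest-rank `percentile` helper is a hypothetical stand-in, not how Axiom computes percentiles.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms: 90 fast requests and 10 very slow ones.
latencies = [100] * 90 + [5000] * 10

mean = statistics.mean(latencies)   # 590: looks tolerable on a single-number panel
p50 = percentile(latencies, 50)     # 100: the typical request is fast
p95 = percentile(latencies, 95)     # 5000: one request in ten is painfully slow
p99 = percentile(latencies, 99)     # 5000
```

The mean blends the two populations into a number that describes neither; the p50/p95 pair shows both the typical case and the slow tail.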

Entry Points

Choose your starting point:

Starting from	Workflow
Vague description	Intake → design blueprint → APL per panel → deploy
Template	Pick template → customize dataset/service/env → deploy
Splunk dashboard	Extract SPL → translate via spl-to-apl → map to chart types → deploy
Exploration	Use axiom-sre to discover schema/signals → productize into panels

Intake: What to Ask First

Before designing, clarify:

Audience & decision
- Oncall triage? (fast refresh, error-focused)
- Team health? (daily trends, SLO tracking)
- Exec reporting? (weekly summaries, high-level)
Scope
- Service, environment, region, cluster, endpoint?
- Single service or cross-service view?
Datasets
- Which Axiom datasets contain the data?
- Run `getschema` to discover fields; never guess:
```apl
['dataset'] | where _time between (ago(1h) .. now()) | getschema
```
Golden signals
- Traffic: requests/sec, events/min
- Errors: error rate, 5xx count
- Latency: p50, p95, p99 duration
- Saturation: CPU, memory, queue depth, connections
Drilldown dimensions
- What do users filter/group by? (service, route, status, pod, customer_id)

Dashboard Blueprint

Use this 4-section structure as the default:

1. At-a-Glance (Statistic panels)

Single numbers that answer "is it broken right now?"

Error rate (last 5m)
p95 latency (last 5m)
Request rate (last 5m)
Active alerts (if applicable)

2. Trends (TimeSeries panels)

Time-based patterns that answer "what changed?"

Traffic over time
Error rate over time
Latency percentiles over time
Stacked by status/service for comparison

3. Breakdowns (Table/Pie panels)

Top-N analysis that answers "where should I look?"

Top 10 failing routes
Top 10 error messages
Worst pods by error rate
Request distribution by status

4. Evidence (LogStream + SmartFilter)

Raw events that answer "what exactly happened?"

LogStream filtered to errors
SmartFilter for service/env/route
Key fields projected for readability

Chart Types

Note: Dashboard queries inherit time from the UI picker, so no explicit `_time` filter is needed.

Validation: TimeSeries, Statistic, Table, Pie, LogStream, Note, and MonitorList are fully validated by `dashboard-validate`. Heatmap, ScatterPlot, and SmartFilter work but may trigger warnings.

Statistic

When: Single KPI, current value, threshold comparison.

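```apl
['logs']
| where service == "api"
| summarize
    total = count(),
    errors = countif(status >= 500)
// Guard against an empty window so the panel shows 0 instead of dividing by zero
| extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0)
| project error_rate
```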

Pitfalls: Don't use for time series; ensure query returns single row.

TimeSeries

When: Trends over time, before/after comparison, rate changes.

```apl
// Single metric - use bin_auto for automatic sizing
['logs']
| summarize ['req/min'] = count() by bin_auto(_time)

// Latency percentiles - use percentiles_array for proper overlay
['logs']
| summarize percentiles_array(duration_ms, 50, 95, 99) by bin_auto(_time)
```

Best practices:

- Use `bin_auto(_time)` instead of fixed `bin(_time, 1m)`: it auto-adjusts to the time window
- Use `percentiles_array()` instead of multiple `percentile()` calls: it renders as one chart
- Too many series = unreadable; use `top N` or filter

Table

When: Top-N lists, detailed breakdowns, exportable data.

```apl
['logs']
| where status >= 500
| summarize errors = count() by route, error_message
| top 10 by errors
| project route, error_message, errors
```

Pitfalls:

- Always use `top N` to prevent unbounded results
- Use `project` to control column order and names

Pie

When: Share-of-total for LOW cardinality dimensions (≤6 slices).

```apl
['logs']
| summarize count() by status_class = case(
    status < 300, "2xx",
    status < 400, "3xx",
    status < 500, "4xx",
    "5xx"
  )
```

Pitfalls:

- Never use for high cardinality (routes, user IDs)
- Prefer tables for >6 categories
- Always aggregate to reduce slices
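
The "aggregate to reduce slices" rule can be sketched outside APL as a top-N-plus-other fold. This is a minimal Python illustration only: `collapse_slices`, the `other` label, and the region counts are all hypothetical, and in a dashboard the same grouping would be expressed in the panel query itself.

```python
from collections import Counter


def collapse_slices(counts, max_slices=6, other_label="other"):
    """Keep the largest (max_slices - 1) categories and fold the rest into one bucket."""
    if len(counts) <= max_slices:
        return dict(counts)
    ranked = Counter(counts).most_common()
    kept = dict(ranked[: max_slices - 1])
    kept[other_label] = sum(value for _, value in ranked[max_slices - 1:])
    return kept


# Hypothetical request counts per region: eight categories, too many for a pie.
region_counts = {
    "us-east": 500, "us-west": 300, "eu-west": 120, "eu-central": 40,
    "ap-south": 20, "ap-east": 10, "sa-east": 6, "af-south": 4,
}
slices = collapse_slices(region_counts)
```

The total is preserved, so shares of the whole stay correct while the chart stays at six slices.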

LogStream

When: Raw event inspection, debugging, evidence gathering.

```apl
['logs']
| where service == "api" and status >= 500
| project-keep _time, trace_id, route, status, error_message, duration_ms
| take 100
```

Pitfalls:

- Always include `take N` (100-500 max)
- Use `project-keep` to show relevant fields only
- Filter aggressively: raw logs are expensive

Heatmap

When: Distribution visualization, latency patterns, density analysis.

```apl
['logs']
| summarize histogram(duration_ms, 15) by bin_auto(_time)
```

Best for: Latency distributions, response time patterns, identifying outliers.

Scatter Plot

When: Correlation between two metrics, identifying patterns.

```apl
['logs']
| summarize avg(duration_ms), avg(resp_size_bytes) by route
```

Best for: Response size vs latency correlation, resource usage patterns.

SmartFilter (Filter Bar)

When: Interactive filtering for the entire dashboard.

SmartFilter is a chart type that creates dropdown/search filters. Requires:

- A `SmartFilter` chart with filter definitions
- `declare query_parameters` in each panel query

Filter types:

- `selectType: "apl"` gives a dynamic dropdown populated by an APL query
- `selectType: "list"` gives a static dropdown with predefined options
- `type: "search"` gives a free-text input

Panel query pattern:

```apl
declare query_parameters (country_filter:string = "");
['logs'] | where isempty(country_filter) or ['geo.country'] == country_filter
```

See `reference/smartfilter.md` for full JSON structure and cascading filter examples.
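
The panel query pattern above relies on an empty default meaning "no filter selected". A small Python sketch of that predicate, with a hypothetical `matches` helper and made-up events, shows the intended behaviour; it models the logic only and is not how Axiom evaluates the query.

```python
def matches(event, country_filter=""):
    """Mirror of `isempty(country_filter) or ['geo.country'] == country_filter`."""
    return country_filter == "" or event.get("geo.country") == country_filter


# Hypothetical events
events = [
    {"geo.country": "DE", "status": 200},
    {"geo.country": "US", "status": 500},
    {"geo.country": "US", "status": 200},
]

unfiltered = [e for e in events if matches(e)]        # empty filter keeps every event
us_only = [e for e in events if matches(e, "US")]     # a selected value narrows the panel
```

Without the `isempty` branch, a panel would show no data until the user picked a value in the filter bar.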

Monitor List

When: Display monitor status on operational dashboards.

No APL needed—select monitors from the UI. Shows:

Monitor status (normal/triggered/off)
Run history (green/red squares)
Dataset, type, notifiers

Note

When: Context, instructions, section headers.

Use GitHub Flavored Markdown for:

Dashboard purpose and audience
Runbook links
Section dividers
On-call instructions

Chart Configuration

Charts support JSON configuration options beyond the query. See `reference/chart-config.md` for full details.

Quick reference:

Chart Type	Key Options
Statistic	`colorScheme`, `customUnits`, `unit`, `showChart` (sparkline), `errorThreshold` / `warningThreshold`
TimeSeries	`aggChartOpts`: `variant` (line/area/bars), `scaleDistr` (linear/log), `displayNull`
LogStream/Table	`tableSettings`: `columns`, `fontSize`, `highlightSeverity`, `wrapLines`
Pie	`hideHeader`
Note	`text` (markdown), `variant`

Common options (all charts):

- `overrideDashboardTimeRange`: boolean
- `overrideDashboardCompareAgainst`: boolean
- `hideHeader`: boolean

APL Patterns

Time Filtering in Dashboards vs Ad-hoc Queries

Dashboard panel queries do NOT need explicit time filters. The dashboard UI time picker automatically scopes all queries to the selected time window.

```apl
// DASHBOARD QUERY: no time filter needed
['logs']
| where service == "api"
| summarize count() by bin_auto(_time)
```

Ad-hoc queries (Axiom Query tab, axiom-sre exploration) MUST have explicit time filters:

```apl
// AD-HOC QUERY: always include time filter
['logs']
| where _time between (ago(1h) .. now())
| where service == "api"
| summarize count() by bin_auto(_time)
```

Bin Size Selection

Prefer `bin_auto(_time)`: it automatically adjusts to the dashboard time window.

Manual bin sizes (only when auto doesn't fit your needs):

Time window	Bin size
15m	10s–30s
1h	1m
6h	5m
24h	15m–1h
7d	1h–6h
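
For cases where a manual bin is needed, the table can be read as a lookup from window length to bin size. The Python sketch below is illustrative only: `manual_bin` and `points_per_series` are hypothetical helpers, the finer end of each range is picked arbitrarily, and the behaviour past 7d is an assumption the table does not cover.

```python
from datetime import timedelta

# (largest window the row covers, bin size); the finer end of each range in the table
BIN_TABLE = [
    (timedelta(minutes=15), timedelta(seconds=10)),
    (timedelta(hours=1), timedelta(minutes=1)),
    (timedelta(hours=6), timedelta(minutes=5)),
    (timedelta(hours=24), timedelta(minutes=15)),
    (timedelta(days=7), timedelta(hours=1)),
]


def manual_bin(window):
    """Return the bin size of the first table row whose window covers the request."""
    for upper, bin_size in BIN_TABLE:
        if window <= upper:
            return bin_size
    # Assumption: beyond 7d, reuse the coarsest listed bin.
    return BIN_TABLE[-1][1]


def points_per_series(window):
    """How many data points one series gets at the chosen bin size."""
    return window // manual_bin(window)
```

With these choices every row lands between 60 and 168 points per series, which suggests the table is aiming for a roughly constant chart resolution rather than a fixed bin.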

Cardinality Guardrails

Prevent query explosion:

```apl
// GOOD: bounded
| summarize count() by route | top 10 by count_

// BAD: unbounded high-cardinality grouping
| summarize count() by user_id  // millions of rows
```

Field Escaping

Fields with dots need bracket notation:

```apl
| where ['kubernetes.pod.name'] == "frontend"
```

Fields with dots IN the name (not hierarchy) need escaping:

```apl
| where ['kubernetes.labels.app\\.kubernetes\\.io/name'] == "frontend"
```

Golden Signal Queries

Traffic:

```apl
| summarize requests = count() by bin_auto(_time)
```

Errors (as rate %):

```apl
| summarize total = count(), errors = countif(status >= 500) by bin_auto(_time)
| extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0)
| project _time, error_rate
```

Latency (use percentiles_array for proper chart overlay):

```apl
| summarize percentiles_array(duration_ms, 50, 95, 99) by bin_auto(_time)
```

Layout Composition

Grid Principles

Dashboard width = 12 units
Typical panel: w=3 (quarter), w=4 (third), w=6 (half), w=12 (full)
Stats row: 4 panels × w=3, h=2
TimeSeries row: 2 panels × w=6, h=4
Tables: w=6 or w=12, h=4–6
LogStream: w=12, h=6–8

Section Layout Pattern

Row 0-1:  [Stat w=3] [Stat w=3] [Stat w=3] [Stat w=3]
Row 2-5:  [TimeSeries w=6, h=4] [TimeSeries w=6, h=4]
Row 6-9:  [Table w=6, h=4] [Pie w=6, h=4]
Row 10+:  [LogStream w=12, h=6]
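
The row pattern above follows mechanically from the widths and heights in the grid principles. As a rough Python sketch, a layout can be derived by packing each row left to right and stacking rows by height. The `place_rows` helper, the panel names, and the `(x, y, w, h)` tuple shape are assumptions for illustration; the actual dashboard JSON layout fields are not shown in this document.

```python
GRID_WIDTH = 12  # dashboard width in grid units


def place_rows(rows):
    """rows: list of rows, each a list of (name, w, h). Returns {name: (x, y, w, h)}."""
    placed = {}
    y = 0
    for row in rows:
        total_width = sum(w for _, w, _ in row)
        if total_width > GRID_WIDTH:
            raise ValueError(f"row at y={y} is {total_width} units wide, max is {GRID_WIDTH}")
        x = 0
        for name, w, h in row:
            placed[name] = (x, y, w, h)
            x += w
        # The next row starts below the tallest panel of this row.
        y += max(h for _, _, h in row)
    return placed


# Hypothetical panel names arranged in the default four-section structure
ROWS = [
    [("stat_1", 3, 2), ("stat_2", 3, 2), ("stat_3", 3, 2), ("stat_4", 3, 2)],
    [("traffic", 6, 4), ("latency", 6, 4)],
    [("top_routes", 6, 4), ("status_pie", 6, 4)],
    [("logs", 12, 6)],
]
layout = place_rows(ROWS)
```

Here the statistics row occupies y=0 to 1, the time series start at y=2, the breakdowns at y=6, and the log stream at y=10, matching the pattern above.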

Naming Conventions

Use titles that state the question a panel answers: "Error rate by route", not "Errors"
Prefix with context if multi-service: "[API] Error rate"
Include units: "Latency (ms)", "Traffic (req/s)"

Dashboard Settings

Refresh Rate

The dashboard auto-refreshes at the configured interval. Options: 15s, 30s, 1m, 5m, etc.

⚠️ Query cost warning: Short refresh (15s) + long time range (90d) = expensive queries running constantly.

Recommendations:

Use case	Refresh rate
Oncall/real-time	15s–30s
Team health	1m–5m
Executive/weekly	5m–15m
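
A back-of-the-envelope calculation shows why the refresh rate matters. This Python sketch assumes that each panel issues one query per refresh, which is a simplification and not a statement about how Axiom schedules or caches queries; `queries_per_hour` is a hypothetical helper.

```python
def queries_per_hour(panel_count, refresh_seconds):
    """Queries issued per hour, assuming one query per panel per refresh."""
    return panel_count * (3600 // refresh_seconds)


# A dashboard with 9 panels (4 stats, 2 time series, table, pie, log stream)
oncall_load = queries_per_hour(9, 15)     # 15s refresh
weekly_load = queries_per_hour(9, 300)    # 5m refresh
```

Under that assumption the 15s dashboard runs 2160 queries per hour against 108 for the 5m one, a 20x difference before the time range is even considered.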

Sharing

Just Me: Private, only you can access
Group: Specific team/group in your org
Everyone: All users in your Axiom org

Data visibility is still governed by dataset permissions—users only see data from datasets they can access.

URL Time Range Parameters

`?t_qr=24h` (quick range), `?t_ts=...&t_te=...` (custom), `?t_against=-1d` (comparison)
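
These parameters are meant to be appended to a link you already have, not used to build a dashboard URL from scratch. The Python sketch below shows one way to do that with the standard library. The `with_time_range` helper is hypothetical and the example link is a placeholder; in practice the link should come from `scripts/dashboard-link`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_time_range(link, **params):
    """Return link with the given query parameters added, keeping any existing ones."""
    parts = urlsplit(link)
    query = dict(parse_qsl(parts.query))
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder only: a real link must come from `scripts/dashboard-link <deploy> <id>`.
link = "https://example.invalid/dashboards/abc123"

last_day = with_time_range(link, t_qr="24h")
day_over_day = with_time_range(link, t_against="-1d")
```

Parsing and re-encoding the query string avoids the usual mistakes of hand-concatenation, such as a second `?` when the link already carries parameters.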

Setup

Run `scripts/setup` to check requirements (curl, jq, ~/.axiom.toml).

Config in `~/.axiom.toml` (shared with axiom-sre):

```toml
[deployments.prod]
url = "https://api.axiom.co"
token = "xaat-your-token"
org_id = "your-org-id"
```

Deployment

Scripts

Script	Usage
`scripts/get-user-id <deploy>`	Get your user ID for `owner` field
`scripts/dashboard-list <deploy>`	List all dashboards
`scripts/dashboard-get <deploy> <id>`	Fetch dashboard JSON
`scripts/dashboard-validate <file>`	Validate JSON structure
`scripts/dashboard-create <deploy> <file>`	Create dashboard
`scripts/dashboard-update <deploy> <id> <file>`	Update (needs version)
`scripts/dashboard-copy <deploy> <id>`	Clone dashboard
`scripts/dashboard-link <deploy> <id>`	Get shareable URL
`scripts/dashboard-delete <deploy> <id>`	Delete (with confirm)
`scripts/axiom-api <deploy> <method> <path>`	Low-level API calls

Workflow

⚠️ CRITICAL: Always validate queries BEFORE deploying.

1. Design dashboard (sections + panels)
2. Write APL for each panel
3. Build JSON (from template or manually)
4. Validate queries using axiom-sre with explicit time filter
5. Run `dashboard-validate` to check structure
6. Run `dashboard-create` or `dashboard-update` to deploy
7. Run `dashboard-link` to get the URL. NEVER construct Axiom URLs manually (org IDs and base URLs vary per deployment)
8. Share link with user

Sibling Skill Integration

spl-to-apl: Translate Splunk SPL → APL. Map `timechart` → TimeSeries, `stats` → Statistic/Table. See `reference/splunk-migration.md`.

axiom-sre: Discover schema with `getschema`, explore baselines, identify dimensions, then productize into panels.

Templates

Pre-built templates in `reference/templates/`:

Template	Use case
`service-overview.json`	Single service oncall dashboard with Heatmap
`service-overview-with-filters.json`	Same with SmartFilter (route/status dropdowns)
`api-health.json`	HTTP API with traffic/errors/latency
`blank.json`	Minimal skeleton

Placeholders: `{{owner_id}}`, `{{service}}`, `{{dataset}}`

Usage:

```bash
USER_ID=$(scripts/get-user-id prod)
scripts/dashboard-from-template service-overview "my-service" "$USER_ID" "my-dataset" ./dashboard.json
scripts/dashboard-validate ./dashboard.json
scripts/dashboard-create prod ./dashboard.json
```

⚠️ Templates assume field names (`service`, `status`, `route`, `duration_ms`). Discover your schema first and use `sed` to fix mismatches.
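
Conceptually, filling a template means replacing each `{{name}}` placeholder and checking that none is left behind. The Python sketch below illustrates that idea only; it is not the implementation of `scripts/dashboard-from-template`, and both the `fill_template` helper and the tiny template shown are hypothetical rather than the real template structure.

```python
import json
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_template(text, values):
    """Replace every {{name}} placeholder; fail loudly on names without a value."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return values[name]
    return PLACEHOLDER.sub(replace, text)


# Hypothetical miniature template; real templates live in reference/templates/.
TEMPLATE = """{
  "owner": "{{owner_id}}",
  "name": "{{service}} overview",
  "query": "['{{dataset}}'] | summarize count() by bin_auto(_time)"
}"""

filled = fill_template(
    TEMPLATE,
    {"owner_id": "user-123", "service": "my-service", "dataset": "my-dataset"},
)
dashboard = json.loads(filled)  # confirms the result is still valid JSON
```

Values are inserted as-is, so a value containing a double quote or backslash would need JSON escaping first; parsing the result, as done here, catches that case.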

Common Pitfalls

Problem	Cause	Solution
"unable to find dataset" errors	Dataset name doesn't exist in your org	Check available datasets in Axiom UI
"creating dashboards for other users" 403	Owner ID doesn't match your token	Use `scripts/get-user-id prod` to get your UUID
All panels show errors	Field names don't match your schema	Discover schema first, use sed to fix field names
Dashboard shows no data	Service filter too restrictive	Remove or adjust `where service == 'x'` filters
Queries time out	Missing time filter or too broad	Dashboard inherits time from picker; ad-hoc queries need explicit time filter
Wrong org in dashboard URL	Manually constructed URL	Always use `dashboard-link <deploy> <id>` — never guess org IDs or base URLs

Reference

- `reference/chart-config.md`: All chart configuration options (JSON)
- `reference/smartfilter.md`: SmartFilter/FilterBar full configuration
- `reference/chart-cookbook.md`: APL patterns per chart type
- `reference/layout-recipes.md`: Grid layouts and section blueprints
- `reference/splunk-migration.md`: Splunk panel → Axiom mapping
- `reference/design-playbook.md`: Decision-first design principles
- `reference/templates/`: Ready-to-use dashboard JSON files

For APL syntax: https://axiom.co/docs/apl/introduction
