database-observability
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseGrafana Cloud Database Observability
Grafana Cloud 数据库可观测性
Provides query-level insights (RED metrics, query samples, explain plans) for MySQL and PostgreSQL
without application code changes. Generally Available as of April 2026.
无需修改应用代码即可提供针对 MySQL 和 PostgreSQL 的查询级洞察(RED 指标、查询样本、执行计划)。自 2026 年 4 月起正式可用。
Supported Databases
支持的数据库
| Database | Variants |
|---|---|
| MySQL | Self-managed, RDS MySQL, Aurora MySQL, Cloud SQL MySQL, Azure Database for MySQL |
| PostgreSQL | Self-managed, RDS PostgreSQL, Aurora PostgreSQL, Cloud SQL PostgreSQL, Azure Database for PostgreSQL |
| 数据库 | 变体类型 |
|---|---|
| MySQL | 自托管、RDS MySQL、Aurora MySQL、Cloud SQL MySQL、Azure Database for MySQL |
| PostgreSQL | 自托管、RDS PostgreSQL、Aurora PostgreSQL、Cloud SQL PostgreSQL、Azure Database for PostgreSQL |
Prerequisites
前提条件
PostgreSQL
PostgreSQL
sql
-- 1. Enable pg_stat_statements in postgresql.conf
-- shared_preload_libraries = 'pg_stat_statements'
-- Then restart PostgreSQL
-- 2. Create monitoring user
CREATE USER grafana_monitoring WITH PASSWORD 'secret';
GRANT pg_monitor TO grafana_monitoring;
GRANT CONNECT ON DATABASE mydb TO grafana_monitoring;
-- 3. Enable the extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;sql
-- 1. 在postgresql.conf中启用pg_stat_statements
-- shared_preload_libraries = 'pg_stat_statements'
-- 然后重启PostgreSQL
-- 2. 创建监控用户
CREATE USER grafana_monitoring WITH PASSWORD 'secret';
GRANT pg_monitor TO grafana_monitoring;
GRANT CONNECT ON DATABASE mydb TO grafana_monitoring;
-- 3. 启用扩展
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;MySQL
MySQL
sql
-- Create monitoring user with least-privilege permissions
CREATE USER 'grafana_monitoring'@'%' IDENTIFIED BY 'secret';
GRANT SELECT, PROCESS, REPLICATION CLIENT ON *.* TO 'grafana_monitoring'@'%';
GRANT SELECT ON performance_schema.* TO 'grafana_monitoring'@'%';
FLUSH PRIVILEGES;sql
-- 创建具有最小权限的监控用户
CREATE USER 'grafana_monitoring'@'%' IDENTIFIED BY 'secret';
GRANT SELECT, PROCESS, REPLICATION CLIENT ON *.* TO 'grafana_monitoring'@'%';
GRANT SELECT ON performance_schema.* TO 'grafana_monitoring'@'%';
FLUSH PRIVILEGES;Alloy Configuration
Alloy 配置
PostgreSQL
PostgreSQL
alloy
database_observability.postgres "mydb" {
data_source_name = "postgresql://grafana_monitoring:secret@localhost:5432/mydb?sslmode=disable"
enable_collectors = ["pg_stat_statements", "query_samples", "schema_details"]
forward_metrics_to = [prometheus.remote_write.cloud.receiver]
forward_logs_to = [loki.write.cloud.receiver]
}
prometheus.remote_write "cloud" {
endpoint {
url = sys.env("PROMETHEUS_URL")
basic_auth {
username = sys.env("PROMETHEUS_USER")
password = sys.env("GRAFANA_CLOUD_API_KEY")
}
}
}
loki.write "cloud" {
endpoint {
url = sys.env("LOKI_URL")
basic_auth {
username = sys.env("LOKI_USER")
password = sys.env("GRAFANA_CLOUD_API_KEY")
}
}
}alloy
database_observability.postgres "mydb" {
data_source_name = "postgresql://grafana_monitoring:secret@localhost:5432/mydb?sslmode=disable"
enable_collectors = ["pg_stat_statements", "query_samples", "schema_details"]
forward_metrics_to = [prometheus.remote_write.cloud.receiver]
forward_logs_to = [loki.write.cloud.receiver]
}
prometheus.remote_write "cloud" {
endpoint {
url = sys.env("PROMETHEUS_URL")
basic_auth {
username = sys.env("PROMETHEUS_USER")
password = sys.env("GRAFANA_CLOUD_API_KEY")
}
}
}
loki.write "cloud" {
endpoint {
url = sys.env("LOKI_URL")
basic_auth {
username = sys.env("LOKI_USER")
password = sys.env("GRAFANA_CLOUD_API_KEY")
}
}
}MySQL
MySQL
alloy
database_observability.mysql "mydb" {
data_source_name = "grafana_monitoring:secret@tcp(localhost:3306)/mydb"
enable_collectors = ["query_samples", "explain_plans", "schema_details"]
forward_metrics_to = [prometheus.remote_write.cloud.receiver]
forward_logs_to = [loki.write.cloud.receiver]
}alloy
database_observability.mysql "mydb" {
data_source_name = "grafana_monitoring:secret@tcp(localhost:3306)/mydb"
enable_collectors = ["query_samples", "explain_plans", "schema_details"]
forward_metrics_to = [prometheus.remote_write.cloud.receiver]
forward_logs_to = [loki.write.cloud.receiver]
}Key Metrics
关键指标
promql
undefinedpromql
undefinedQuery rate by database
按数据库统计的查询速率
rate(db_query_total{db_instance="mydb"}[5m])
rate(db_query_total{db_instance="mydb"}[5m])
P95 query latency
P95 查询延迟
histogram_quantile(0.95, rate(db_query_duration_seconds_bucket[5m]))
histogram_quantile(0.95, rate(db_query_duration_seconds_bucket[5m]))
Error rate
错误率
rate(db_query_errors_total[5m]) / rate(db_query_total[5m])
rate(db_query_errors_total[5m]) / rate(db_query_total[5m])
Slow queries (over 1 second)
慢查询(超过1秒)
count(db_query_duration_seconds > 1) by (db_query_digest)
count(db_query_duration_seconds > 1) by (db_query_digest)
Active connections
活跃连接数
db_connections_active{db_instance="mydb"}
undefineddb_connections_active{db_instance="mydb"}
undefinedWhat You Get
您将获得的内容
Query Performance Dashboard (auto-provisioned):
- Top queries by total time, call count, mean latency
- Query samples with actual parameters and timing
- Visual explain plans showing index usage, scan types, costs
- RED metrics per query digest
Correlation with APM:
- Link slow DB queries to the application traces that triggered them
- ,
db.statement,db.systemOTel attributes connect DB spans to query samplesdb.name - Drill from service latency spike → specific slow SQL query → explain plan
查询性能仪表板(自动配置):
- 按总耗时、调用次数、平均延迟排序的Top查询
- 包含实际参数和计时信息的查询样本
- 展示索引使用情况、扫描类型、成本的可视化执行计划
- 按查询摘要统计的RED指标
与APM的关联:
- 将慢数据库查询与触发它们的应用追踪关联起来
- ,
db.statement,db.system这些OTel属性将数据库跨度与查询样本关联db.name - 从服务延迟峰值 → 特定慢SQL查询 → 执行计划进行深入排查
Alert Rules
告警规则
yaml
groups:
- name: database-observability
rules:
- alert: SlowQueryDetected
expr: histogram_quantile(0.95, rate(db_query_duration_seconds_bucket[5m])) > 1
for: 5m
labels:
severity: warning
annotations:
summary: "P95 query latency > 1s on {{ $labels.db_instance }}"
- alert: HighDBErrorRate
expr: rate(db_query_errors_total[5m]) / rate(db_query_total[5m]) > 0.05
for: 5m
labels:
severity: critical
annotations:
summary: "DB error rate > 5% on {{ $labels.db_instance }}"
- alert: TooManyConnections
expr: db_connections_active / db_connections_max > 0.8
for: 5m
labels:
severity: warning
annotations:
summary: "DB connection pool >80% on {{ $labels.db_instance }}"yaml
groups:
- name: database-observability
rules:
- alert: SlowQueryDetected
expr: histogram_quantile(0.95, rate(db_query_duration_seconds_bucket[5m])) > 1
for: 5m
labels:
severity: warning
annotations:
summary: "P95 query latency > 1s on {{ $labels.db_instance }}"
- alert: HighDBErrorRate
expr: rate(db_query_errors_total[5m]) / rate(db_query_total[5m]) > 0.05
for: 5m
labels:
severity: critical
annotations:
summary: "DB error rate > 5% on {{ $labels.db_instance }}"
- alert: TooManyConnections
expr: db_connections_active / db_connections_max > 0.8
for: 5m
labels:
severity: warning
annotations:
summary: "DB connection pool >80% on {{ $labels.db_instance }}"Setup Checklist
设置清单
- Enable (PostgreSQL) or Performance Schema (MySQL)
pg_stat_statements - Create least-privilege monitoring user
- Add block to Alloy config
database_observability.* - Verify metrics appear in Grafana Cloud → Database Observability
- Set up alerting on slow queries and error rates
- Enable trace correlation by ensuring app uses span attributes
db.statement
- 启用 (PostgreSQL)或 Performance Schema(MySQL)
pg_stat_statements - 创建具有最小权限的监控用户
- 在Alloy配置中添加 块
database_observability.* - 在Grafana Cloud → 数据库可观测性中验证指标是否显示
- 设置慢查询和错误率的告警规则
- 通过确保应用使用 跨度属性来启用追踪关联
db.statement