altinity-expert-clickhouse-metrics
Real-Time Metrics Monitoring
Real-time monitoring of ClickHouse metrics, events, and asynchronous metrics.
Diagnostics
Run all queries from the file checks.sql and analyze the results.
Ad-Hoc Query Guidelines
Key Tables
- `system.metrics` - Current gauge values
- `system.events` - Cumulative counters since restart
- `system.asynchronous_metrics` - System-level metrics
- `system.metric_log` - Historical metrics
- `system.asynchronous_metric_log` - Historical async metrics
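To make the gauge / counter / history distinction concrete, here is a minimal sketch. The metric names `Query` and `LoadAverage1` are picked only for illustration, the `*_log` tables exist only when logging is enabled in the server config, and the column names (`metric`, `event_time`) are those of recent ClickHouse releases, so older versions may differ.

```sql
-- Gauge: number of queries executing right now
select metric, value from system.metrics where metric = 'Query';

-- Counter: queries counted since the server was last restarted
select event, value from system.events where event = 'Query';

-- History: per-minute peak of an asynchronous metric over the last hour
select
    toStartOfMinute(event_time) as minute,
    max(value) as peak
from system.asynchronous_metric_log
where metric = 'LoadAverage1'
  and event_time > now() - interval 1 hour
group by minute
order by minute;
```

A gauge can go down as well as up, while a counter only grows until the next restart, so counters are usually compared as differences between two points in time.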
Useful Patterns
```sql
-- Find metrics by pattern
select * from system.metrics where metric like '%pattern%';
select * from system.asynchronous_metrics where metric like '%pattern%';
select * from system.events where event like '%pattern%';
```
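When it is not clear which table holds a metric, the three lookups can be combined into one result set. This is a sketch rather than a canonical query: the search term `memory` and the aliases `source`, `name`, and `val` are illustrative, and `ilike` is used so that the match is case-insensitive.

```sql
-- Search all three tables at once. The value columns have different types
-- (Int64, Float64, UInt64), so they are cast to Float64 for the union.
select *
from
(
    select 'metrics' as source, metric as name, toFloat64(value) as val
    from system.metrics
    where metric ilike '%memory%'
    union all
    select 'asynchronous_metrics' as source, metric as name, value as val
    from system.asynchronous_metrics
    where metric ilike '%memory%'
    union all
    select 'events' as source, event as name, toFloat64(value) as val
    from system.events
    where event ilike '%memory%'
)
order by source, name;
```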
Cross-Module Triggers
| Finding | Load Module | Reason |
|---|---|---|
| High memory metrics | | Memory analysis |
| High replica delay | | Replication issues |
| High parts count | | Merge backlog |
| High load average | | Query analysis |
| High connections | | Connection analysis |
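
A single snapshot can help decide which of these findings applies. The sketch below assumes the findings map to the metrics named in the query; that mapping is a guess for illustration, not something this document specifies, and some metrics (for example `LoadAverage1`) may be absent on older versions or non-Linux hosts.

```sql
-- Current gauges: tracked memory and open client connections
select metric, toFloat64(value) as val
from system.metrics
where metric in ('MemoryTracking', 'TCPConnection', 'HTTPConnection')
union all
-- Periodically recalculated values: replica delay, parts count, load average
select metric, value as val
from system.asynchronous_metrics
where metric in ('ReplicasMaxAbsoluteDelay', 'MaxPartCountForPartition', 'LoadAverage1');
```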
Monitoring Recommendations
Key Metrics to Alert On
| Metric | Warning | Critical |
|---|---|---|
| | - | > 0 |
| | > 75% max | > 90% max |
| | > 80% RAM | > 90% RAM |
| | > parts_to_delay | > parts_to_throw |
| | > 5 min | > 1 hour |
| | > CPU count | > 2x CPU count |
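
Two of these thresholds can be evaluated directly in SQL. The sketch assumes that the RAM thresholds refer to `MemoryTracking` relative to `OSMemoryTotal`, and that the parts thresholds refer to `MaxPartCountForPartition` compared with the server-wide `parts_to_delay_insert` and `parts_to_throw_insert` settings. Those pairings are assumptions, and table-level setting overrides are not taken into account.

```sql
-- Memory tracked by the server as a share of physical RAM
select
    (select value from system.metrics where metric = 'MemoryTracking') as used_bytes,
    (select value from system.asynchronous_metrics where metric = 'OSMemoryTotal') as total_bytes,
    round(100 * used_bytes / total_bytes, 1) as used_pct,
    multiIf(used_pct > 90, 'critical', used_pct > 80, 'warning', 'ok') as level;

-- Largest per-partition parts count against the insert delay/throw settings
select
    (select value from system.asynchronous_metrics
     where metric = 'MaxPartCountForPartition') as max_parts,
    (select toUInt64(value) from system.merge_tree_settings
     where name = 'parts_to_delay_insert') as delay_at,
    (select toUInt64(value) from system.merge_tree_settings
     where name = 'parts_to_throw_insert') as throw_at,
    multiIf(max_parts > throw_at, 'critical', max_parts > delay_at, 'warning', 'ok') as level;
```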
Prometheus/Grafana Export
ClickHouse exposes metrics at `:9363/metrics` in Prometheus format when enabled.
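
A minimal sketch of the server configuration that turns the endpoint on, assuming the stock `prometheus` section placed in a hypothetical drop-in file such as `config.d/prometheus.xml`. The port and path match the `:9363/metrics` address, and the three boolean flags select which of the metric families are exported.

```xml
<clickhouse>
    <prometheus>
        <!-- HTTP path and port that Prometheus will scrape -->
        <endpoint>/metrics</endpoint>
        <port>9363</port>
        <!-- system.metrics, system.events, system.asynchronous_metrics -->
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
</clickhouse>
```

Older releases use `<yandex>` as the root element instead of `<clickhouse>`, so check which one the existing configuration files on the server use.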