altinity-expert-clickhouse-mutations
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseMutation Tracking and Analysis
Mutation跟踪与分析
Track and diagnose ALTER UPDATE, ALTER DELETE, and other mutation operations.
跟踪并诊断ALTER UPDATE、ALTER DELETE及其他mutation操作。
Diagnostics
诊断
Run all queries from the file checks.sql and analyze the results.
运行checks.sql文件中的所有查询并分析结果。
Problem Investigation
问题排查
Why Is Mutation Stuck?
为什么Mutation会卡住?
Check for competing operations using these ad-hoc queries:
sql
-- Active merges on same table
select
database,
table,
is_mutation,
elapsed,
progress,
num_parts
from system.merges
where database = '{database}' and table = '{table}'sql
-- Replication queue for same table
select
type,
create_time,
is_currently_executing,
num_tries,
last_exception
from system.replication_queue
where database = '{database}' and table = '{table}'
order by create_time
limit 20sql
-- Part mutations status
select
name,
active,
mutation_version,
modification_time
from system.parts
where database = '{database}' and table = '{table}'
order by mutation_version desc
limit 30使用以下临时查询检查是否存在竞争操作:
sql
-- 同一张表上的活跃合并操作
select
database,
table,
is_mutation,
elapsed,
progress,
num_parts
from system.merges
where database = '{database}' and table = '{table}'sql
-- 同一张表的复制队列
select
type,
create_time,
is_currently_executing,
num_tries,
last_exception
from system.replication_queue
where database = '{database}' and table = '{table}'
order by create_time
limit 20sql
-- 分区mutation状态
select
name,
active,
mutation_version,
modification_time
from system.parts
where database = '{database}' and table = '{table}'
order by mutation_version desc
limit 30Canceling Mutations
取消Mutation操作
To kill a stuck mutation:
sql
-- Find mutation_id first
select mutation_id, command from system.mutations
where database = '{database}' and table = '{table}' and not is_done;
-- Then kill it
-- KILL MUTATION WHERE database = '{database}' AND table = '{table}' AND mutation_id = '{mutation_id}';Warning: Killed mutations leave table in partially-mutated state.
要终止卡住的mutation:
sql
-- 先找到mutation_id
select mutation_id, command from system.mutations
where database = '{database}' and table = '{table}' and not is_done;
-- 然后终止它
-- KILL MUTATION WHERE database = '{database}' AND table = '{table}' AND mutation_id = '{mutation_id}';警告: 终止的mutation会使表处于部分修改状态。
Best Practices
最佳实践
Mutation Anti-Patterns
Mutation反模式
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Frequent small UPDATEs | Creates many mutations | Batch updates together |
| DELETE without WHERE | Full table rewrite | Use TTL instead |
| UPDATE on high-cardinality column | Slow, lots of IO | Restructure data model |
| Many concurrent mutations | Queue builds up | Serialize mutations |
| 反模式 | 问题 | 解决方案 |
|---|---|---|
| 频繁的小批量UPDATE | 产生大量mutation | 批量执行更新 |
| 不带WHERE条件的DELETE | 全表重写 | 使用TTL替代 |
| 在高基数列上执行UPDATE | 速度慢、IO量大 | 重构数据模型 |
| 大量并发mutation | 队列堆积 | 序列化mutation执行 |
Monitoring Mutations
监控Mutation
Set alerts for:
- Mutations pending > 10
- Mutation age > 1 hour
- not empty
latest_fail_reason
设置以下告警:
- 待处理Mutation数量>10
- Mutation存在时长>1小时
- 不为空
latest_fail_reason
Cross-Module Triggers
跨模块触发
| Finding | Load Module | Reason |
|---|---|---|
| Mutation blocked by merge | | Merge backlog |
| Mutation OOM | | Memory limits |
| Mutation slow due to disk | | IO bottleneck |
| Replicated mutation stuck | | Replication issues |
| 发现问题 | 加载模块 | 原因 |
|---|---|---|
| Mutation被合并操作阻塞 | | 合并操作积压 |
| Mutation出现OOM | | 内存限制 |
| Mutation因磁盘问题变慢 | | IO瓶颈 |
| 复制型Mutation卡住 | | 复制问题 |
Settings Reference
设置参考
| Setting | Notes |
|---|---|
| 0=async, 1=wait current replica, 2=wait all |
| Max concurrent mutations |
| Delay INSERTs threshold |
| Throw error threshold |
| 设置项 | 说明 |
|---|---|
| 0=异步,1=等待当前副本完成,2=等待所有副本完成 |
| 最大并发mutation数 |
| 延迟INSERT的阈值 |
| 抛出错误的阈值 |