clickhouse-query

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

ClickHouse Query Testing

ClickHouse 查询测试

Use this skill to run ad-hoc ClickHouse queries for analytics, metrics analysis, and debugging.
你可以使用该技能运行即席ClickHouse查询,用于数据分析、指标分析和调试工作。

Running Queries

运行查询

Use the included query script:
bash
node .claude/skills/clickhouse-query/query.mjs "SELECT count() FROM views"
使用自带的查询脚本:
bash
node .claude/skills/clickhouse-query/query.mjs "SELECT count() FROM views"

Options

可用选项

FlagDescription
--explain
Show query execution plan
--writable
Allow write operations (requires user permission)
--timeout <s>
,
-t
Query timeout in seconds (default: 30)
--file
,
-f
Read query from a file
--json
Output results as JSON
--quiet
,
-q
Minimal output, only results
Flag说明
--explain
展示查询执行计划
--writable
允许写入操作(需要用户权限)
--timeout <s>
,
-t
查询超时时间,单位为秒(默认值:30)
--file
,
-f
从文件读取查询语句
--json
以JSON格式输出结果
--quiet
,
-q
最小化输出,仅返回查询结果

Examples

示例

bash
undefined
bash
undefined

Count rows in a table

统计表中行数

node .claude/skills/clickhouse-query/query.mjs "SELECT count() FROM views"
node .claude/skills/clickhouse-query/query.mjs "SELECT count() FROM views"

Query with filters

带筛选条件的查询

node .claude/skills/clickhouse-query/query.mjs "SELECT * FROM modelEvents WHERE modelId = 123 LIMIT 10"
node .claude/skills/clickhouse-query/query.mjs "SELECT * FROM modelEvents WHERE modelId = 123 LIMIT 10"

Check query execution plan

查看查询执行计划

node .claude/skills/clickhouse-query/query.mjs --explain "SELECT * FROM views WHERE userId = 1"
node .claude/skills/clickhouse-query/query.mjs --explain "SELECT * FROM views WHERE userId = 1"

Override default 30s timeout for longer queries

对于运行时间更长的查询,覆盖默认30秒超时设置

node .claude/skills/clickhouse-query/query.mjs --timeout 60 "SELECT ... (complex aggregation)"
node .claude/skills/clickhouse-query/query.mjs --timeout 60 "SELECT ... (complex aggregation)"

Query from file

从文件读取查询语句执行

node .claude/skills/clickhouse-query/query.mjs -f my-query.sql
node .claude/skills/clickhouse-query/query.mjs -f my-query.sql

JSON output for processing

输出JSON格式结果便于后续处理

node .claude/skills/clickhouse-query/query.mjs --json "SELECT type, count() FROM modelEvents GROUP BY type"
undefined
node .claude/skills/clickhouse-query/query.mjs --json "SELECT type, count() FROM modelEvents GROUP BY type"
undefined

Safety Features

安全特性

  1. Read-only by default: Blocks INSERT/ALTER/DROP unless
    --writable
    flag is used
  2. 30 second timeout: Prevents runaway queries (override with
    --timeout
    )
  3. Explicit permission required: Before using
    --writable
    , you MUST ask the user for permission
  1. 默认只读:除非使用
    --writable
    参数,否则会拦截INSERT/ALTER/DROP等写入操作
  2. 30秒超时:避免查询长时间运行占用资源(可通过
    --timeout
    参数覆盖默认值)
  3. 需要显式授权:使用
    --writable
    之前,必须先向用户申请权限

When to Use --writable

何时使用 --writable

Only use the
--writable
flag when:
  • The user explicitly requests write access
  • You need to insert test data
  • You're running maintenance operations
IMPORTANT: Always ask the user for permission before running with
--writable
.
仅在以下场景中使用
--writable
参数:
  • 用户明确申请写入权限
  • 你需要插入测试数据
  • 你正在执行维护操作
重要提示:使用
--writable
运行查询前,务必先征得用户同意。

Common Tables

常用表

TableDescription
views
Page/entity view events
modelEvents
Model create/publish/update events
modelVersionEvents
Model version events including downloads
userActivities
User registration, login, subscription events
images
Image upload/delete events
reactions
Like/dislike events
reports
Content report events
entityMetricEvents
Aggregated metric events
表名说明
views
页面/实体访问事件
modelEvents
模型创建/发布/更新事件
modelVersionEvents
模型版本相关事件,包含下载事件
userActivities
用户注册、登录、订阅事件
images
图片上传/删除事件
reactions
点赞/点踩事件
reports
内容举报事件
entityMetricEvents
聚合指标事件

Querying Replica Clusters

查询副本集群

IMPORTANT: Production uses a ClickHouse replica cluster. When querying system tables (logs, metrics, etc.), you must use
clusterAllReplicas()
to get data from all nodes.
重要提示:生产环境使用ClickHouse副本集群。查询系统表(日志、指标等)时,必须使用
clusterAllReplicas()
获取所有节点的数据。

System Tables on Replica Clusters

副本集群上的系统表

sql
-- WRONG: Only queries the node you're connected to
SELECT * FROM system.query_log WHERE event_time > now() - INTERVAL 1 HOUR

-- CORRECT: Queries all replicas in the cluster
SELECT * FROM clusterAllReplicas(default, system.query_log)
WHERE event_time > now() - INTERVAL 1 HOUR
sql
-- 错误写法:仅查询你当前连接的节点
SELECT * FROM system.query_log WHERE event_time > now() - INTERVAL 1 HOUR

-- 正确写法:查询集群中所有副本
SELECT * FROM clusterAllReplicas(default, system.query_log)
WHERE event_time > now() - INTERVAL 1 HOUR

Common System Table Queries

常用系统表查询

sql
-- Find recent queries across all nodes
SELECT
    hostname(),
    event_time,
    query_duration_ms,
    formatReadableSize(memory_usage) AS memory,
    query
FROM clusterAllReplicas(default, system.query_log)
WHERE type = 'QueryFinish'
    AND event_time > now() - INTERVAL 5 MINUTE
ORDER BY event_time DESC
LIMIT 20

-- Find expensive queries by memory usage (last 24 hours)
SELECT
    count() as query_count,
    user,
    sum(memory_usage) AS total_memory,
    normalized_query_hash
FROM clusterAllReplicas(default, system.query_log)
WHERE event_time > now() - INTERVAL 1 DAY
    AND query_kind = 'Select'
    AND type = 'QueryFinish'
GROUP BY normalized_query_hash, user
ORDER BY total_memory DESC
LIMIT 10

-- Search query logs by pattern
SELECT event_time, query_id, query, type
FROM clusterAllReplicas(default, merge('system', '^query_log*'))
WHERE query ILIKE '%some_table%'
    AND event_time > now() - INTERVAL 5 MINUTE

-- Debug a specific query across all nodes
SELECT hostname(), message
FROM clusterAllReplicas(default, system.text_log)
WHERE query_id = 'your-query-id-here'
ORDER BY event_time_microseconds ASC
sql
-- 查询所有节点的近期查询记录
SELECT
    hostname(),
    event_time,
    query_duration_ms,
    formatReadableSize(memory_usage) AS memory,
    query
FROM clusterAllReplicas(default, system.query_log)
WHERE type = 'QueryFinish'
    AND event_time > now() - INTERVAL 5 MINUTE
ORDER BY event_time DESC
LIMIT 20

-- 按内存占用排序,找出过去24小时资源消耗高的查询
SELECT
    count() as query_count,
    user,
    sum(memory_usage) AS total_memory,
    normalized_query_hash
FROM clusterAllReplicas(default, system.query_log)
WHERE event_time > now() - INTERVAL 1 DAY
    AND query_kind = 'Select'
    AND type = 'QueryFinish'
GROUP BY normalized_query_hash, user
ORDER BY total_memory DESC
LIMIT 10

-- 按模式匹配搜索查询日志
SELECT event_time, query_id, query, type
FROM clusterAllReplicas(default, merge('system', '^query_log*'))
WHERE query ILIKE '%some_table%'
    AND event_time > now() - INTERVAL 5 MINUTE

-- 跨所有节点调试特定查询
SELECT hostname(), message
FROM clusterAllReplicas(default, system.text_log)
WHERE query_id = 'your-query-id-here'
ORDER BY event_time_microseconds ASC

When to Use clusterAllReplicas()

何时使用 clusterAllReplicas()

Use CaseFunction
System tables (query_log, text_log, etc.)
clusterAllReplicas(default, system.table_name)
Application tables (views, modelEvents, etc.)Direct query (already distributed)
Search multiple system tables
clusterAllReplicas(default, merge('system', '^pattern*'))
使用场景使用方式
系统表(query_log、text_log等)
clusterAllReplicas(default, system.table_name)
应用表(views、modelEvents等)直接查询(本身已是分布式表)
搜索多个系统表
clusterAllReplicas(default, merge('system', '^pattern*'))

ClickHouse SQL Tips

ClickHouse SQL 使用贴士

sql
-- Use count() not COUNT(*)
SELECT count() FROM views

-- Date filtering with toDate()
SELECT * FROM views WHERE toDate(time) = today()

-- Last 7 days
SELECT * FROM modelEvents WHERE time > now() - INTERVAL 7 DAY

-- Aggregations
SELECT type, count() as cnt FROM modelEvents GROUP BY type ORDER BY cnt DESC
sql
-- 使用count()而非COUNT(*)
SELECT count() FROM views

-- 使用toDate()进行日期过滤
SELECT * FROM views WHERE toDate(time) = today()

-- 查询过去7天的数据
SELECT * FROM modelEvents WHERE time > now() - INTERVAL 7 DAY

-- 聚合查询
SELECT type, count() as cnt FROM modelEvents GROUP BY type ORDER BY cnt DESC