bigquery
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseBigQuery
BigQuery
Query Mozilla telemetry data directly using the CLI.
bq使用 CLI直接查询Mozilla遥测数据。
bqPrerequisites
前提条件
- and
gcloudCLI installed (bq)brew install google-cloud-sdk - Authenticated: with a Mozilla account
gcloud auth login - Billing project set: queries run against a project you have on
bigquery.jobs.create - (Optional but highly recommended) mozdata-claude-plugin — provides Glean Dictionary MCP for metric/ping discovery, making it much easier to find the right tables and columns
- 已安装和
gcloudCLI(执行bq安装)brew install google-cloud-sdk - 已完成认证:使用Mozilla账号执行
gcloud auth login - 已设置计费项目:查询需在你拥有权限的项目下运行
bigquery.jobs.create - (可选但强烈推荐)mozdata-claude-plugin — 提供用于指标/ ping 发现的Glean Dictionary MCP,可大幅简化表和列的查找工作
Authentication
认证步骤
bash
undefinedbash
undefinedCheck current account
检查当前账号
gcloud config get-value account
gcloud config get-value account
Re-authenticate if needed
如需重新认证
gcloud auth login
gcloud auth login
List available projects
列出可用项目
gcloud projects list --format="table(projectId,name)"
gcloud projects list --format="table(projectId,name)"
Set billing project (mozdata is the standard choice)
设置计费项目(mozdata是标准选择)
gcloud config set project mozdata
If queries fail with "Access Denied", the billing project likely lacks permissions. Try `--project_id=mozdata`.gcloud config set project mozdata
如果查询时提示“Access Denied”,可能是计费项目缺少权限,尝试添加`--project_id=mozdata`参数。Running Queries
运行查询
bash
undefinedbash
undefinedBasic query
基础查询
bq query --project_id=mozdata --use_legacy_sql=false --format=pretty "SELECT ..."
bq query --project_id=mozdata --use_legacy_sql=false --format=pretty "SELECT ..."
Dry run (check cost before executing)
试运行(执行前检查成本)
bq query --project_id=mozdata --use_legacy_sql=false --dry_run "SELECT ..."
Always use `--project_id=mozdata` and `--use_legacy_sql=false`.bq query --project_id=mozdata --use_legacy_sql=false --dry_run "SELECT ..."
请始终使用`--project_id=mozdata`和`--use_legacy_sql=false`参数。Table Selection
表选择
Choose the right table — this is the most important optimization:
| Query Type | Table | Why |
|---|---|---|
| Windows version distribution | | Pre-aggregated, instant |
| DAU/MAU by standard dimensions | | Pre-computed, 100x faster |
| DAU with custom dimensions | | One row per client per day |
| MAU/WAU/retention | | Bit patterns, scan 1 day not 28 |
| Event analysis | | Pre-unnested, clustered |
| Mobile search | | Pre-aggregated |
| Specific Glean metric | | Raw metrics ping |
All tables are in the project. Fully qualify as .
moz-fx-data-shared-prod`moz-fx-data-shared-prod.{dataset}.{table}`选择合适的表是最重要的优化手段:
| 查询类型 | 表 | 原因 |
|---|---|---|
| Windows版本分布 | | 预聚合,查询速度快 |
| 按标准维度统计DAU/MAU | | 预计算,速度提升100倍 |
| 按自定义维度统计DAU | | 每个客户端每天对应一行数据 |
| MAU/WAU/用户留存 | | 采用位模式,仅需扫描1天数据而非28天 |
| 事件分析 | | 已预展开,支持聚类 |
| 移动端搜索 | | 预聚合 |
| 特定Glean指标 | | 原始指标ping数据 |
所有表均位于项目下,需使用完整路径引用。
moz-fx-data-shared-prod`moz-fx-data-shared-prod.{dataset}.{table}`Critical Rules
重要规则
- Always use aggregate tables first — raw tables are 10-100x more expensive
- Always include partition filter — or
submission_dateDATE(submission_timestamp) - Use for development (1% sample) — remove for production
sample_id = 0 - Say "clients" not "users" — BigQuery tracks , not actual humans
client_id - Never join across products by client_id — each product has its own namespace
- Use for events — never raw
events_stream(requires UNNEST)events_v1 - Use for MAU — bit patterns, scan 1 day not 28
baseline_clients_last_seen
- 优先使用聚合表 —— 原始表的查询成本是聚合表的10-100倍
- 必须包含分区过滤条件 —— 使用或
submission_dateDATE(submission_timestamp) - 开发环境使用(1%样本数据)—— 生产环境移除该条件
sample_id = 0 - 使用“clients”而非“users” —— BigQuery跟踪的是,而非真实用户
client_id - 禁止跨产品通过client_id关联 —— 每个产品有独立的命名空间
- 事件分析使用—— 切勿使用原始
events_stream(需手动UNNEST)events_v1 - MAU统计使用—— 采用位模式,仅需扫描1天数据而非28天
baseline_clients_last_seen
References
参考资料
- — Detailed table schemas and common query patterns
references/tables.md - — Windows, macOS, and Linux version distribution queries with build number, Darwin, and kernel version mappings
references/os-versions.md
- —— 详细的表结构和常见查询模式
references/tables.md - —— 包含版本号、Darwin和内核版本映射的Windows、macOS及Linux版本分布查询
references/os-versions.md
Related Skills
相关技能
- redash — Web UI frontend to BigQuery with visualizations and sharing
- mozdata:query-writing — Guided query writing with Glean Dictionary MCP
- mozdata:probe-discovery — Find Glean metrics and telemetry probes
- redash —— BigQuery的Web UI前端,支持可视化和分享功能
- mozdata:query-writing —— 借助Glean Dictionary MCP进行引导式查询编写
- mozdata:probe-discovery —— 查找Glean指标和遥测探针