# spark-history-cli
Use this skill when the task is about exploring or debugging data exposed by a running Apache Spark History Server.

## Installation

```bash
pip install spark-history-cli
```

Or, if the tool is not on `PATH` after install:

```bash
python -m spark_history_cli --json apps
```

## Why use this skill

- It gives you a purpose-built CLI instead of scraping the Spark History Server web UI.
- It wraps the REST API cleanly and already handles attempt-ID resolution for multi-attempt apps.
- It supports `--json`, which makes downstream reasoning and comparisons much easier.

## Workflow

1. Prefer the CLI over raw REST calls.
2. Prefer `--json` unless the user explicitly wants a human-formatted table.
3. Use `--server <url>` or the `SPARK_HISTORY_SERVER` environment variable to point at the right SHS. If the user does not specify one, assume `http://localhost:18080`.
4. Start broad, then drill down:
   - list applications
   - choose the relevant app
   - inspect jobs, stages, executors, SQL executions, environment, or logs
5. If the user says "latest app", "recent run", or similar, list apps first and choose the most relevant recent application before continuing.
6. If the CLI is unavailable, install it with `python -m pip install spark-history-cli` (if tool permissions allow it).
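Step 5's "latest app" selection is easy to do over the `--json` output. A minimal sketch in Python; the payload shape below is an assumption modeled on the Spark History Server's `applications` REST response, not the CLI's documented schema:

```python
import json

# Hypothetical output of: spark-history-cli --json apps
# (shape assumed to mirror the SHS /api/v1/applications payload)
apps_json = """
[
  {"id": "app-20240101120000-0001", "name": "etl",
   "attempts": [{"endTime": "2024-01-01T12:30:00.000GMT", "completed": true}]},
  {"id": "app-20240102090000-0002", "name": "report",
   "attempts": [{"endTime": "2024-01-02T09:45:00.000GMT", "completed": true}]}
]
"""

def latest_app(apps):
    # Pick the app whose most recent attempt finished last;
    # the ISO-like timestamp strings sort correctly lexically.
    return max(apps, key=lambda a: max(at["endTime"] for at in a["attempts"]))

apps = json.loads(apps_json)
print(latest_app(apps)["id"])  # → app-20240102090000-0002
```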

## Command patterns

```bash
spark-history-cli --json --server http://localhost:18080 apps
spark-history-cli --json --server http://localhost:18080 app <app-id>
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> jobs
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> stages
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> executors --all
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql-plan <exec-id> --view final
spark-history-cli --server http://localhost:18080 --app-id <app-id> sql-plan <exec-id> --dot -o plan.dot
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql-jobs <exec-id>
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> summary
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> env
spark-history-cli --server http://localhost:18080 --app-id <app-id> logs output.zip
```

If `spark-history-cli` is not on `PATH`, use:

```bash
python -m spark_history_cli --json apps
```
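When scripting against these patterns, the common flags can be assembled once. A sketch under stated assumptions: the helper names and defaults below are mine, not part of the CLI; only the flags shown above are used:

```python
import json
import subprocess

def shs_cmd(subcommand, *args, server="http://localhost:18080",
            app_id=None, as_json=True):
    """Build the argv for a spark-history-cli invocation (hypothetical helper)."""
    cmd = ["spark-history-cli"]
    if as_json:
        cmd.append("--json")
    cmd += ["--server", server]
    if app_id:
        cmd += ["--app-id", app_id]
    cmd.append(subcommand)
    cmd += list(args)
    return cmd

def run_shs(subcommand, *args, **kwargs):
    """Run the CLI and parse its JSON output (requires a reachable SHS)."""
    out = subprocess.run(shs_cmd(subcommand, *args, **kwargs),
                         capture_output=True, text=True, check=True).stdout
    return json.loads(out)

# Example argv for listing one app's jobs:
print(shs_cmd("jobs", app_id="app-123"))
```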

## What to reach for

- `apps` for recent runs, durations, status, and picking candidates
- `app <id>` for high-level details about one run
- `attempts` for multi-attempt apps (list or show specific attempt details)
- `jobs`, `job <id>` for job-level failures or progress
- `job-stages <id>` for stages belonging to a job
- `stages`, `stage <id>` for task/stage bottlenecks
- `stage-summary <id>` for task metric quantiles (p5/p25/p50/p75/p95): duration, GC, memory, shuffle, I/O
- `stage-tasks <id>` for individual task details, sorted by runtime to find stragglers
- `executors --all` for executor churn or skew investigations
- `sql` for SQL execution history and plan graph data
- `sql-plan <id>` for SQL plan extraction:
  - `--view full` (default): full plan text
  - `--view initial`: only the Initial Plan (pre-AQE)
  - `--view final`: only the Final Plan (post-AQE)
  - `--dot`: Graphviz DOT output for visualizing the plan DAG
  - `--json` + `--view`: structured JSON with `isAdaptive`, `sectionCount`, `plan`, and `sections`
  - `-o <file>`: write output to a file instead of stdout
- `sql-jobs <id>` for jobs associated with a SQL execution (fetches all linked jobs by ID)
- `summary` for a concise application overview: app info, resource config (driver/executor/shuffle), and workload stats (jobs/stages/tasks/SQL)
- `env` for Spark config/runtime context
- `logs` only when the user explicitly wants the event log archive saved locally
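As an example of the downstream reasoning `--json` enables, `stage-summary` quantiles can be turned into a quick skew check. The field names in this sample payload are assumptions for illustration, not the CLI's documented schema, and the 5x threshold is a rule of thumb:

```python
# Hypothetical quantile payload from:
#   spark-history-cli --json ... stage-summary <id>
summary = {
    "quantiles": [0.05, 0.25, 0.5, 0.75, 0.95],
    "executorRunTime": [120, 450, 800, 1300, 9800],  # milliseconds per quantile
}

def skew_ratio(metric, quantiles):
    """Ratio of p95 to p50 for one task metric; a large ratio suggests stragglers."""
    p50 = metric[quantiles.index(0.5)]
    p95 = metric[quantiles.index(0.95)]
    return p95 / p50 if p50 else float("inf")

ratio = skew_ratio(summary["executorRunTime"], summary["quantiles"])
print(f"p95/p50 runtime ratio: {ratio:.2f}")  # → 12.25
if ratio > 5:
    print("likely stragglers; inspect stage-tasks sorted by runtime")
```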

## Practical guidance

  • Preserve the user's server URL if they gave one explicitly.
  • Summarize findings after retrieving JSON; do not dump raw JSON unless the user asked for it.
  • Treat event logs and benchmark history as potentially sensitive. Download them only when necessary and keep them local.
  • This CLI needs a running Spark History Server. It does not replace SHS and it does not parse raw event logs directly.

## Troubleshooting

| Issue | Solution |
| --- | --- |
| Connection refused | SHS not running; start it with `$SPARK_HOME/sbin/start-history-server.sh` |
| `404 Not Found` on app | App ID may include an attempt suffix; use `apps` to list valid IDs |
| No apps listed | Check that `spark.history.fs.logDirectory` points to the right event log path |
| `ModuleNotFoundError` | CLI not installed; run `pip install spark-history-cli` |
| Wrong server | Set the `SPARK_HISTORY_SERVER` env var or use `--server <url>` |
| Timeout on large apps | SHS may still be parsing event logs; wait and retry, or check the SHS logs |
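For the connection-refused and wrong-server cases, a quick probe of the History Server's REST endpoint (`/api/v1/applications` is the standard SHS REST path) tells you whether the server is up before reaching for the CLI. A minimal sketch:

```python
import urllib.error
import urllib.request

def shs_reachable(base="http://localhost:18080", timeout=3):
    """Return True if the Spark History Server REST API answers at `base`."""
    url = base.rstrip("/") + "/api/v1/applications"  # standard SHS REST endpoint
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if not shs_reachable():
    print("SHS not reachable; start it with "
          "$SPARK_HOME/sbin/start-history-server.sh")
```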