fixing-streamlit-ci

Fix CI Failures

Diagnose and fix failed GitHub Actions CI jobs for the current branch/PR using the `gh` CLI and `git` commands.

When to Use

  • CI checks have failed on a PR
  • You need to understand why a workflow failed
  • You want to apply fixes and verify locally

Workflow

Copy this checklist to track progress:
- [ ] Verify authentication
- [ ] Gather context & find failed jobs
- [ ] Download & analyze logs
- [ ] Present diagnosis to user
- [ ] Apply fix & verify locally
- [ ] Push & recheck CI

1. Verify Authentication

```bash
gh auth status
```

If authentication fails, prompt the user to run `gh auth login` with the appropriate scopes.
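The authentication check can be wrapped in a small guard so the rest of the workflow stops early. This is a minimal sketch; the function name `require_gh_auth` is hypothetical and not part of the `gh` CLI.

```bash
# Hypothetical helper: fail fast when the gh CLI is not authenticated.
require_gh_auth() {
  # gh auth status exits non-zero when no valid login is found.
  if gh auth status >/dev/null 2>&1; then
    return 0
  fi
  echo "gh is not authenticated. Run: gh auth login" >&2
  return 1
}
```

A caller would typically write `require_gh_auth || exit 1` before any other `gh` command.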

2. Gather PR Context

```bash
# Get PR for current branch
gh pr view --json number,title,url,headRefName

# Get PR description and metadata
gh pr view --json title,body,labels,author

# List changed files
gh pr diff --name-only

# Summary of changes (total additions, deletions, and changed files)
gh pr view --json additions,deletions,changedFiles
```
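To see quickly which part of the repository a PR touches, the file list can be grouped by top-level directory. A minimal sketch, assuming file paths arrive one per line on stdin (for example from `gh pr diff --name-only`); `changed_areas` is a hypothetical name.

```bash
# Hypothetical helper: reads file paths on stdin and prints
# "<count> <top-level directory>", most-changed directory first.
# Files in the repository root are counted under ".".
changed_areas() {
  awk -F/ 'NF { d = (NF > 1) ? $1 : "."; n[d]++ }
           END { for (d in n) print n[d], d }' \
    | LC_ALL=C sort -k1,1nr -k2,2
}
```

Usage: `gh pr diff --name-only | changed_areas`.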

3. Check CI Status

```bash
# List all checks (shows pass/fail status)
gh pr checks

# Get detailed check info
gh pr checks --json name,state,bucket,link,startedAt,completedAt

# List only failed runs
gh run list --branch "$(git branch --show-current)" --status failure --limit 10

# Check if CI is still running
gh run list --branch "$(git branch --show-current)" --status in_progress
```
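The overall CI state can also be read from the exit code of `gh pr checks`. The sketch below assumes the documented behavior that exit code 8 means checks are still pending and treats any other non-zero code as a failure or an error (for example, no PR for the branch); `ci_state` is a hypothetical name.

```bash
# Hypothetical helper: summarize CI state from the gh pr checks exit code.
ci_state() {
  local rc=0
  gh pr checks >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "passed" ;;
    8) echo "pending" ;;           # assumption: 8 = checks pending
    *) echo "failed-or-error" ;;   # failed checks, no PR, auth problems, ...
  esac
}
```

A caller can branch on the result, for example waiting with `gh pr checks --watch` when the state is `pending`.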

4. Find Failed Jobs

```bash
# View run details (get RUN_ID from the previous step)
gh run view {RUN_ID}

# List failed jobs with IDs
gh run view {RUN_ID} --json jobs --jq '.jobs[] | select(.conclusion == "failure") | {id: .databaseId, name: .name}'

# List failed jobs with their failed steps
gh run view {RUN_ID} --json jobs --jq '.jobs[] | select(.conclusion == "failure") | {name: .name, steps: [.steps[] | select(.conclusion == "failure") | .name]}'
```
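Rather than copying `RUN_ID` by hand, the most recent failed run for a branch can be looked up directly. A minimal sketch; `latest_failed_run_id` is a hypothetical name, and it assumes the newest failed run is the one of interest.

```bash
# Hypothetical helper: print the ID of the newest failed run on a branch.
# Defaults to the current branch; prints nothing when no failed run exists.
latest_failed_run_id() {
  local branch
  branch="${1:-$(git branch --show-current)}"
  gh run list --branch "$branch" --status failure --limit 1 \
    --json databaseId --jq '.[0].databaseId // empty'
}
```

Usage: `RUN_ID="$(latest_failed_run_id)"`, then `gh run view "$RUN_ID"`.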

5. Download & Analyze Logs

**Primary method:**

```bash
# Get failed logs (the last 250 lines usually contain the error)
gh run view {RUN_ID} --log-failed 2>&1 | tail -250

# Target a specific failed job by ID
gh run view {RUN_ID} --job {JOB_ID} --log-failed 2>&1 | tail -100
```

**Fallback for pending logs:**

```bash
REPO=$(gh repo view --json nameWithOwner --jq '.nameWithOwner')
gh api "/repos/${REPO}/actions/jobs/{JOB_ID}/logs"
```

**Smart log extraction (examples):**

```bash
# Context around failure markers
gh run view {RUN_ID} --log-failed 2>&1 | grep -B 5 -A 10 -iE "error|fail|exception|traceback|panic|fatal" | head -100

# Python tests - pytest summary
gh run view {RUN_ID} --log-failed 2>&1 | grep -E -A 50 "FAILED|ERROR|short test summary"

# TypeScript/ESLint errors
gh run view {RUN_ID} --log-failed 2>&1 | grep -E -B 2 -A 5 "error TS|error "

# E2E snapshot mismatches
gh run view {RUN_ID} --log-failed 2>&1 | grep -E -B 2 -A 5 "Missing snapshot for|Snapshot mismatch for"
```
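Because the first error is usually the root cause, it can help to print context around only the first failure marker instead of every match. A minimal sketch, assuming the log arrives on stdin and reusing the marker list from the `grep` example; `first_error_context` is a hypothetical name.

```bash
# Hypothetical helper: print BEFORE lines before and AFTER lines after the
# first line that matches a failure marker. Returns 1 when nothing matches.
# Usage: first_error_context [BEFORE] [AFTER] < logfile
first_error_context() {
  local before="${1:-5}" after="${2:-10}"
  awk -v b="$before" -v a="$after" '
    { lines[NR] = $0 }
    hit == 0 && tolower($0) ~ /error|fail|exception|traceback|panic|fatal/ { hit = NR }
    END {
      if (hit == 0) exit 1
      start = hit - b; if (start < 1) start = 1
      end = hit + a;   if (end > NR) end = NR
      for (i = start; i <= end; i++) print lines[i]
    }'
}
```

Usage: `gh run view {RUN_ID} --log-failed 2>&1 | first_error_context 5 10`. The whole log is buffered in memory, which is acceptable for typical CI logs but not for very large ones.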

6. Analyze Failure

Identify:
  • Error type: Lint, type check, test failure, build error
  • Root cause: First/primary error (not cascading failures)
  • Affected files: Which files need changes
  • Error message: Exact error text

Common CI failure categories:

| Category | Workflow | Make Command | Auto-fix |
|---|---|---|---|
| Python lint | `python-tests.yml` | `make python-lint` | `make autofix` |
| Python types | `python-tests.yml` | `make python-types` | ❌ Manual |
| Python tests | `python-tests.yml` | `make python-tests` | ❌ Manual |
| Frontend lint | `js-tests.yml` | `make frontend-lint` | `make autofix` |
| Frontend types | `js-tests.yml` | `make frontend-types` | ❌ Manual |
| Frontend tests | `js-tests.yml` | `make frontend-tests` | ❌ Manual |
| E2E tests | `playwright.yml` | `make run-e2e-test <file>` | ❌ Manual |
| E2E snapshots | `playwright.yml` | `make run-e2e-test <file>` | `make update-snapshots` |
| NOTICES | `js-tests.yml` | `make update-notices` | `make update-notices` |
| Min constraints | `python-tests.yml` | `make update-min-deps` | `make update-min-deps` |
| Pre-commit | `enforce-pre-commit.yml` | `uv run pre-commit run --all-files` | ✅ Mostly auto-fix |
| Relative imports | `ensure-relative-imports.yml` | Check script output | ❌ Manual |
| PR Labels | `require-labels.yml` | N/A | ⏭️ Ignore |

💡 Quick win: Run `make autofix` first for lint/formatting failures.
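The category table can be encoded as a lookup so a script can pick the right local command. This is a sketch only: the short category keys (`python-lint`, `e2e-snapshots`, and so on) are hypothetical labels, and treating the pre-commit command as its own auto-fix is an assumption based on the "Mostly auto-fix" entry.

```bash
# Hypothetical lookup: local check command for a failure category.
local_check_for() {
  case "$1" in
    python-lint)             echo "make python-lint" ;;
    python-types)            echo "make python-types" ;;
    python-tests)            echo "make python-tests" ;;
    frontend-lint)           echo "make frontend-lint" ;;
    frontend-types)          echo "make frontend-types" ;;
    frontend-tests)          echo "make frontend-tests" ;;
    e2e-tests|e2e-snapshots) echo "make run-e2e-test <file>" ;;
    notices)                 echo "make update-notices" ;;
    min-constraints)         echo "make update-min-deps" ;;
    pre-commit)              echo "uv run pre-commit run --all-files" ;;
    *) return 1 ;;           # unknown category or no local command
  esac
}

# Hypothetical lookup: auto-fix command, or exit status 1 when the
# category needs a manual fix.
autofix_for() {
  case "$1" in
    python-lint|frontend-lint) echo "make autofix" ;;
    e2e-snapshots)             echo "make update-snapshots" ;;
    notices)                   echo "make update-notices" ;;
    min-constraints)           echo "make update-min-deps" ;;
    pre-commit)                echo "uv run pre-commit run --all-files" ;;
    *) return 1 ;;
  esac
}
```

For example, `autofix_for python-lint` prints `make autofix`, while `autofix_for python-types` prints nothing and returns 1, signalling a manual fix.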

7. Present Diagnosis

For multiple failures, list all and let the user choose:

```
CI Failure Analysis for PR #{NUMBER}: {TITLE}
═══════════════════════════════════════════════════════════════

Found {N} failed jobs/checks:

─────────────────────────────────────────────────────────────────

1. [LINT] Python Unit Tests → Run Linters
   Workflow: python-tests.yml (GitHub Actions)
   Error:    Ruff formatting error in lib/streamlit/elements/foo.py
   Auto-fix: ✅ `make autofix`

2. [TYPE] Javascript Unit Tests → Run type checks
   Workflow: js-tests.yml (GitHub Actions)
   Error:    TS2322: Type 'string' is not assignable to type 'number'
   File:     frontend/lib/src/components/Bar.tsx:42
   Auto-fix: ❌ Manual fix required

─────────────────────────────────────────────────────────────────

Which failures would you like me to address?
Options: "1" | "1,2" | "1-2" | "all" | "only auto-fixable"
```

For a single failure, show a detailed analysis:

```
─────────────────────────────────────────────────────────────────
Analyzing: [TYPE] Javascript Unit Tests → Run type checks
─────────────────────────────────────────────────────────────────

Category: TYPE
Workflow: js-tests.yml
Job:      js-unit-tests (ID: 12345678)
Step:     Run type checks

Error snippet:
  frontend/lib/src/components/Bar.tsx:42:5
  error TS2322: Type 'string' is not assignable to type 'number'.

Proposed Fix:
  Change type annotation or fix the value type

─────────────────────────────────────────────────────────────────

Would you like me to:
  [1] Apply the fix automatically
  [2] Show the proposed changes first
  [3] Run local verification only
  [4] Skip this and move to next failure
```
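The selection options shown in the multi-failure prompt ("1", "1,2", "1-2", "all") can be expanded into a plain list of failure numbers. A minimal sketch; `expand_selection` is a hypothetical name, and the "only auto-fixable" option is left to the caller because it depends on the diagnosis data rather than on the numbering.

```bash
# Hypothetical helper: expand a selection string into one index per line.
# Usage: expand_selection "<selection>" <total-number-of-failures>
# Duplicates are not removed (for example "1,1-2" prints 1, 1, 2).
expand_selection() {
  local selection="${1// /}" total="$2" part lo hi v i
  local -a parts
  [ -n "$selection" ] || return 1
  if [ "$selection" = "all" ]; then
    for ((i = 1; i <= total; i++)); do echo "$i"; done
    return 0
  fi
  IFS=',' read -r -a parts <<< "$selection"
  for part in "${parts[@]}"; do
    case "$part" in
      *-*) lo="${part%-*}"; hi="${part#*-}" ;;   # range such as 1-2
      *)   lo="$part"; hi="$part" ;;             # single index
    esac
    for v in "$lo" "$hi"; do
      case "$v" in
        ''|*[!0-9]*) echo "invalid selection: $part" >&2; return 1 ;;
      esac
    done
    lo=$((10#$lo)); hi=$((10#$hi))               # force base 10
    if [ "$lo" -lt 1 ] || [ "$hi" -gt "$total" ] || [ "$lo" -gt "$hi" ]; then
      echo "out of range: $part" >&2
      return 1
    fi
    for ((i = lo; i <= hi; i++)); do echo "$i"; done
  done
}
```

For example, `expand_selection "1,3-4" 5` prints 1, 3, and 4 on separate lines, and a selection outside `1..total` is rejected with exit status 1.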

8. Apply Fix & Verify Locally

After user approval, apply the fix and run verification:

```bash
# Run all checks (lint, types, tests) on changed files
make check

# Python tests (specific)
uv run pytest lib/tests/path/to/test_file.py::test_name -v

# Frontend tests (specific)
cd frontend && yarn test path/to/test.test.tsx

# E2E tests
make run-e2e-test {test_file.py}

# E2E snapshots
make update-snapshots
```

9. Summary & Push

```bash
git status --short
git diff --stat
```

Report what failed, what changed, and the local verification result.

```bash
git add -A
git commit -m "fix: resolve CI failure in {workflow/step}"
git push
```

10. Recheck CI Status

```bash
gh pr checks --watch

# Or re-run failed jobs
gh run rerun {RUN_ID} --failed
```

Rules

  • Focus on root cause: First error, not cascading failures
  • Minimal fixes: Smallest change that fixes the issue
  • Don't skip tests: Never disable tests to "fix" CI
  • Verify locally: Always run appropriate local command
  • Preserve intent: Understand what code was trying to do

Error Handling

| Issue | Solution |
|---|---|
| Auth failed | `gh auth login` with workflow/repo scopes |
| No PR for branch | `gh run list` to check workflow runs |
| CI still running | `gh pr checks --watch` |
| Logs pending | Retry with job logs API |
| No failed checks | All passing ✅ |
| Rate limited | Wait and retry |
| Flaky test | Re-run: `gh run rerun {RUN_ID} --failed` |
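The "wait and retry" advice for rate limits and pending logs can be sketched as a generic retry wrapper with a doubling delay. The name `retry` and the backoff policy are assumptions for illustration, not something `gh` provides.

```bash
# Hypothetical helper: run a command up to ATTEMPTS times, doubling the
# delay (in whole seconds) after each failure.
# Usage: retry <attempts> <initial-delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    n=$((n + 1))
  done
}
```

Usage: `retry 3 5 gh api "/repos/${REPO}/actions/jobs/{JOB_ID}/logs"` tries up to three times, waiting 5 and then 10 seconds between attempts.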