github-upload-image-to-pr

Upload Image to PR

Upload local images to a GitHub PR and embed them in the description or comments using browser automation tools.

How It Works

Because the GitHub API does not support direct image uploads, this skill uses the PR comment textarea as a staging area for GitHub's image hosting: it uploads files there to obtain persistent `user-attachments/assets/` URLs, then updates the PR description or posts a comment via the `gh` CLI.

Step 0: Resolve PR context

If the user didn't specify a PR number or URL, auto-detect it:

```bash
# Get PR number from the current branch
gh pr view --json number,url -q '"\(.number) \(.url)"'
```

If multiple repos or branches are involved, confirm with the user which PR to target.

Also normalize the image paths to absolute paths. If a path contains special characters (e.g., Unicode narrow spaces from CleanShot X), copy the file to `/tmp/` first:

```bash
# e.g., to handle glob-matched paths with special chars
cp /path/to/CleanShot*keyword*.png /tmp/screenshot.png
```
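
The check for "special characters" can be automated. The following is a minimal sketch rather than part of the original skill: `needsStagingCopy` and `stagingPath` are hypothetical helper names, the sample path is made up, and treating every character outside printable ASCII as special is an assumption chosen to catch narrow spaces such as U+202F.

```javascript
// Hypothetical helpers (not part of the original skill).

// Returns true when a path contains anything other than printable ASCII,
// for example U+202F NARROW NO-BREAK SPACE.
function needsStagingCopy(filePath) {
  return /[^\x20-\x7E]/.test(filePath);
}

// Builds a simple /tmp destination name, keeping the original extension.
// Falling back to "png" when there is no extension is an assumption.
function stagingPath(filePath, index) {
  const match = /\.([A-Za-z0-9]+)$/.exec(filePath);
  const ext = match ? match[1].toLowerCase() : "png";
  return `/tmp/screenshot-${index}.${ext}`;
}

// Made-up example path containing U+202F before "AM".
const sample = "/Users/me/Desktop/CleanShot 2024-01-01 at 10.00.00\u202FAM.png";
console.log(needsStagingCopy(sample)); // true
console.log(stagingPath(sample, 1)); // /tmp/screenshot-1.png
```

A caller would copy the file to the returned path (as in the `cp` example above) only when `needsStagingCopy` returns true.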

Tool Detection and Selection

Priority Order

  1. Playwright MCP (MCP connection, `mcp__playwright__*`): connects to an existing browser, login state preserved
  2. Chrome DevTools MCP (MCP connection, `mcp__chrome-devtools__*`): connects to an existing browser, login state preserved
  3. agent-browser (CLI via Bash): fallback, login state preserved with `--profile`

MCP-based tools connect to an already-running browser instance, so GitHub login state is automatically preserved. agent-browser can persist login state using `--profile ~/.agent-browser-github`.

Detection

```
# 1. Search for MCP-based browser tools (preferred)
ToolSearch: "browser navigate upload"

# 2. Fall back to agent-browser only if no MCP tools found
Bash: agent-browser --version
```
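
The priority order can be expressed as a small selection function. This is an illustrative sketch, not the skill's own implementation: `pickBrowserTool` is a hypothetical name, and it assumes the caller already has a list of available tool names (for example from the tool search) and knows whether `agent-browser --version` succeeded.

```javascript
// Hypothetical selection helper (not part of the original skill).
// toolNames: names of the tools currently available to the agent.
// hasAgentBrowser: whether `agent-browser --version` succeeded.
function pickBrowserTool(toolNames, hasAgentBrowser) {
  // Prefixes are taken from the priority list; order encodes priority.
  const priority = [
    { name: "playwright", prefix: "mcp__playwright__" },
    { name: "chrome-devtools", prefix: "mcp__chrome-devtools__" },
  ];
  for (const { name, prefix } of priority) {
    if (toolNames.some((t) => t.startsWith(prefix))) return name;
  }
  // Fall back to the CLI only when no MCP tool matched.
  return hasAgentBrowser ? "agent-browser" : null;
}

console.log(pickBrowserTool(["mcp__chrome-devtools__navigate_page"], true)); // chrome-devtools
console.log(pickBrowserTool([], true)); // agent-browser
```

A `null` result corresponds to the "No browser tools found" case in Troubleshooting.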

Tool Compatibility Matrix

| Operation | Playwright MCP | Chrome DevTools MCP | agent-browser (CLI/Bash) |
|---|---|---|---|
| Navigate | `browser_navigate` | `navigate_page` | `agent-browser --headed open {url}` |
| Snapshot | `browser_snapshot` | `take_snapshot` | `agent-browser snapshot` |
| Screenshot | `browser_take_screenshot` | `take_screenshot` | `agent-browser screenshot {path}` |
| Click | `browser_click` (ref) | `click` (uid) | `agent-browser click {ref}` |
| File Upload | `browser_file_upload` (paths) | `upload_file` (uid, filePath) | `agent-browser upload {ref} {path}` |
| JS Eval | `browser_evaluate` (function) | `evaluate_script` (function) | `agent-browser eval '{js}'` |
| Login State | Preserved | Preserved | Preserved with `--profile` |

Steps

Step 1: Navigate to PR page and check login state

Navigate to the PR page and immediately take a snapshot to verify login state.

```javascript
// Playwright MCP
browser_navigate({ url: "https://github.com/{owner}/{repo}/pull/{number}" })

// Chrome DevTools MCP
navigate_page({ url: "https://github.com/{owner}/{repo}/pull/{number}", type: "url" })
```

```bash
# agent-browser (use --profile to persist login state)
agent-browser --headed --profile ~/.agent-browser-github open "https://github.com/{owner}/{repo}/pull/{number}"
```

If an SSO authentication screen appears: take a snapshot, locate the "Continue" button, and click it.

If NOT logged in (agent-browser only):

  1. Navigate to `https://github.com/login`.
  2. Ask the user to log in manually in the headed browser window.
  3. Wait for user confirmation, then navigate back to the PR page.
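
When the user supplies a PR URL instead of a number, the `{owner}`, `{repo}`, and `{number}` placeholders have to be filled from it. A minimal sketch follows, assuming github.com URLs only; `parsePrUrl` and `buildPrUrl` are hypothetical names, and the `octocat/hello-world` URL is a made-up example.

```javascript
// Hypothetical helpers (not part of the original skill).

// Extracts owner, repo, and PR number from a github.com pull request URL.
// Trailing segments such as /files are ignored. Returns null on no match.
function parsePrUrl(url) {
  const m = /^https:\/\/github\.com\/([^\/]+)\/([^\/]+)\/pull\/(\d+)/.exec(url);
  if (!m) return null;
  return { owner: m[1], repo: m[2], number: Number(m[3]) };
}

// Rebuilds the canonical PR page URL used for navigation.
function buildPrUrl({ owner, repo, number }) {
  return `https://github.com/${owner}/${repo}/pull/${number}`;
}

const pr = parsePrUrl("https://github.com/octocat/hello-world/pull/42/files");
console.log(pr); // { owner: 'octocat', repo: 'hello-world', number: 42 }
console.log(buildPrUrl(pr)); // https://github.com/octocat/hello-world/pull/42
```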

Step 2: Locate the file upload input

Take a snapshot/screenshot and scroll to the bottom to find the comment area.

GitHub renders a file upload input in the comment form. Try these selectors in order (GitHub's UI can change; if one fails, try the next):

```javascript
// Shared JS for MCP-based tools: tries multiple known selectors
() => {
  const selectors = [
    'input[type="file"][id*="comment"]',
    'input[type="file"][id="fc-new_comment_field"]',
    '#new_comment_field',
    'input[type="file"]'
  ];
  for (const sel of selectors) {
    const el = document.querySelector(sel);
    if (el) return { found: true, id: el.id, selector: sel };
  }
  return { found: false };
}
```

For Chrome DevTools MCP, you can also take a snapshot to find the `uid` of the file upload element directly.

Step 3: Upload images one by one

Upload each image file using the detected tool. Wait 2–3 seconds between uploads to allow GitHub to process each file.

For multiple images, upload them all to the same comment textarea before extracting URLs; this is more efficient than navigating between uploads.

```javascript
// Chrome DevTools MCP: upload_file requires the uid of the input element
// Playwright MCP: browser_file_upload takes the element ref and file path(s) array
// agent-browser: agent-browser upload {ref} {absolute_path}
```

Important: Always use absolute file paths.
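
The upload loop can be sketched independently of the browser tool. In the example below, `uploadSequentially` is a hypothetical name and `uploadOne` is a caller-supplied function standing in for whichever upload call the detected tool provides; the 2500 ms default reflects the 2–3 second wait, and the leading-slash test for absolute paths assumes a POSIX-style filesystem.

```javascript
// Hypothetical orchestration sketch (not part of the original skill).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// paths: absolute file paths to upload, in order.
// uploadOne: async function that uploads a single path with the detected tool.
// delayMs: pause after each upload so GitHub can process the file.
async function uploadSequentially(paths, uploadOne, delayMs = 2500) {
  // Reject relative paths up front, before anything is uploaded.
  for (const p of paths) {
    if (!p.startsWith("/")) {
      throw new Error(`Not an absolute path: ${p}`);
    }
  }
  const uploaded = [];
  for (const p of paths) {
    await uploadOne(p);
    uploaded.push(p);
    await sleep(delayMs);
  }
  return uploaded;
}

// Usage with a stand-in uploader that only logs (delay set to 0 for the demo).
uploadSequentially(
  ["/tmp/screenshot-1.png", "/tmp/screenshot-2.png"],
  async (p) => console.log("uploading", p),
  0
).then((done) => console.log(done.length, "files uploaded"));
```

Validating every path before the first upload keeps the textarea from ending up with a partial set of images.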

Step 4: Retrieve uploaded image URLs

Wait 3–5 seconds after the last upload, then read the textarea value. GitHub injects markdown image syntax like `![description](https://github.com/user-attachments/assets/...)` into the textarea:

```javascript
// Shared JS: tries the known textarea ID, then a broader selector
() => {
  const ta = document.getElementById('new_comment_field')
          || document.querySelector('textarea[id*="comment"]');
  return ta ? ta.value : 'textarea not found';
}
```

```bash
# agent-browser
agent-browser eval 'document.getElementById("new_comment_field")?.value || document.querySelector("textarea[id*=comment]")?.value || "not found"'
```

The response contains URLs in the format `![image](https://github.com/user-attachments/assets/...)`.

Extract all image URLs/markdown from the textarea value before clearing it.
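
Pulling the URLs out of the textarea value is a plain string operation. The sketch below is one possible approach, not the skill's own code: `extractAttachmentUrls` is a hypothetical name, the asset IDs in the sample are made-up placeholders, and matching on the URL prefix alone (rather than on the surrounding markdown) is a design assumption so that the wrapper syntax does not matter.

```javascript
// Hypothetical parser (not part of the original skill).
// Returns the unique user-attachments URLs found in `text`, in order.
function extractAttachmentUrls(text) {
  const pattern =
    /https:\/\/github\.com\/user-attachments\/assets\/[^\s)"'<>]+/g;
  return [...new Set(text.match(pattern) ?? [])];
}

// Made-up textarea value with placeholder asset IDs.
const value = [
  "![before](https://github.com/user-attachments/assets/00000000-aaaa-bbbb-cccc-111111111111)",
  "![after](https://github.com/user-attachments/assets/00000000-aaaa-bbbb-cccc-222222222222)",
].join("\n");

console.log(extractAttachmentUrls(value).length); // 2
console.log(extractAttachmentUrls("").length); // 0
```

An empty result suggests the uploads have not finished yet, in which case waiting and reading the textarea again is the natural next step.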

Step 5: Clear the textarea (do not submit the comment)

```javascript
// MCP-based tools
() => {
  const ta = document.getElementById('new_comment_field')
           || document.querySelector('textarea[id*="comment"]');
  if (ta) { ta.value = ""; return "cleared"; }
  return "textarea not found";
}
```

```bash
# agent-browser (reports "textarea not found" instead of a false "cleared")
agent-browser eval '(() => { const ta = document.getElementById("new_comment_field") || document.querySelector("textarea[id*=comment]"); if (!ta) return "textarea not found"; ta.value = ""; return "cleared"; })()'
```

Step 6: Embed images in the PR

Option A: Update the PR description (append images to the existing body):

```bash
EXISTING_BODY=$(gh pr view {PR_NUMBER} --json body -q .body)

gh pr edit {PR_NUMBER} --body "$(printf '%s\n\n## Screenshots\n\n%s' "$EXISTING_BODY" "![screenshot](https://github.com/user-attachments/assets/...)")"
```

Option B: Post as a new comment:

```bash
gh pr comment {PR_NUMBER} --body "## Screenshots

![screenshot](https://github.com/user-attachments/assets/...)"
```

Use Option A by default unless the user explicitly asks for a comment, or the PR description is already long and a comment would be cleaner.
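
With several images, assembling the new body by hand in a `printf` call gets awkward. The following sketch builds the same "existing body plus Screenshots section" text in JavaScript; `appendScreenshots` is a hypothetical helper, and the fixed `screenshot` alt text and `## Screenshots` heading simply mirror the commands above.

```javascript
// Hypothetical helper (not part of the original skill).
// existingBody: current PR description (may be empty or null).
// urls: user-attachments URLs collected in Step 4.
function appendScreenshots(existingBody, urls, heading = "## Screenshots") {
  const images = urls.map((url) => `![screenshot](${url})`).join("\n\n");
  const base = (existingBody ?? "").trimEnd();
  const section = `${heading}\n\n${images}`;
  // Avoid leading blank lines when the PR has no description yet.
  return base ? `${base}\n\n${section}` : section;
}

console.log(
  appendScreenshots("Fixes the header layout.", [
    "https://github.com/user-attachments/assets/...",
  ])
);
```

The resulting string can then be handed to `gh pr edit`, for example through its `--body-file` option reading from standard input, which avoids shell quoting issues with multi-line bodies.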

Step 7: Verify the result

Reload the page and take a screenshot to confirm the images are displayed correctly.

Tips

  • Image sizing: Control display size via HTML `<img>` tags: `<img width="800" alt="description" src="..." />`
  • Multiple images: Upload all images in one session to the same textarea; extract all URLs before clearing
  • Prefer MCP tools: Always prefer Playwright or Chrome DevTools MCP over agent-browser for simpler setup
  • agent-browser login persistence: Use `--profile ~/.agent-browser-github` to persist GitHub login across sessions
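
The image sizing tip can be turned into a small formatter. This is a sketch under stated assumptions: `toImgTag` and `escapeAttr` are hypothetical names, and the default width of 800 is taken from the example in the tip.

```javascript
// Hypothetical helpers (not part of the original skill).

// Escapes characters that would break out of a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Builds an <img> tag with an explicit display width.
function toImgTag(url, { alt = "screenshot", width = 800 } = {}) {
  return `<img width="${Number(width)}" alt="${escapeAttr(alt)}" src="${escapeAttr(url)}" />`;
}

console.log(
  toImgTag("https://github.com/user-attachments/assets/...", {
    alt: "Settings page",
    width: 600,
  })
);
// <img width="600" alt="Settings page" src="https://github.com/user-attachments/assets/..." />
```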

Troubleshooting

| Issue | Solution |
|---|---|
| Not logged in (MCP tools) | An SSO screen may appear: take a snapshot, find the "Continue" button, and click it |
| Not logged in (agent-browser) | Use `--headed` mode, navigate to the login page, and ask the user to log in manually |
| Browser window not visible | For agent-browser, ensure the `--headed` flag is used |
| File path with special characters (e.g., Unicode narrow spaces from CleanShot) | Copy the file to `/tmp/` with a simple name: `cp /path/CleanShot*keyword*.png /tmp/screenshot.png` |
| File upload fails | Ensure the file path is absolute |
| Textarea doesn't contain URLs yet | Wait 3–5 seconds after upload before running JS eval; retry once if needed |
| Textarea selector not found | GitHub's UI changes occasionally; use the multi-selector JS in Step 2 to find the current element |
| Chrome DevTools MCP disconnected | Reconnect via the `/mcp` command |
| agent-browser not found | `npm install -g agent-browser && agent-browser install` |
| No browser tools found | Use `ToolSearch` to search for available browser tools |
| PR not found / 404 | Private repos return 404 for unauthenticated users; check login state |

Notes

  • GitHub `user-attachments/assets/` URLs are persistent: images remain accessible even without submitting the comment
  • Editing the description directly in the browser UI is fragile due to GitHub UI structure changes; updating via `gh pr edit` is strongly preferred
  • Multiple images can be uploaded in a single session before extracting URLs
  • MCP-based tools connect to existing browser instances, preserving cookies and login sessions