github-upload-image-to-pr

Upload Image to PR

Upload local images to a GitHub PR and embed them in the description or comments using browser automation tools.

How It Works

Since the GitHub API does not support direct image uploads, this skill uses the PR comment textarea as a staging area for GitHub's image hosting: it uploads files there to obtain persistent `user-attachments/assets/` URLs, then updates the PR description or posts a comment via the `gh` CLI.

Step 0: Resolve PR context

If the user didn't specify a PR number or URL, auto-detect it:

```bash
# Get the PR number and URL from the current branch
gh pr view --json number,url -q '"\(.number) \(.url)"'
```

If multiple repos or branches are involved, confirm with the user which PR to target.

Also, normalize the image paths to absolute paths. If a path contains special characters (e.g., Unicode narrow spaces from CleanShot X), copy the file to `/tmp/` first:

```bash
# e.g., to handle glob-matched paths with special chars
cp /path/to/CleanShot*keyword*.png /tmp/screenshot.png
```
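The path check described above could be sketched as a pair of small helpers. This is only an illustration, not part of the skill: the names `needsSafeCopy` and `safeTempName` are hypothetical, and treating any whitespace or non-ASCII character as "special" is an assumption.

```javascript
// Hypothetical helpers; the names and the "special character" rule are assumptions.

// True when the path contains whitespace, control characters, or any
// non-ASCII character (for example U+202F, the narrow no-break space).
function needsSafeCopy(filePath) {
  return /[^\x21-\x7E]/.test(filePath);
}

// Builds a simple destination under /tmp, keeping only the file extension.
function safeTempName(filePath, index = 0) {
  const base = filePath.split("/").pop() || "";
  const dot = base.lastIndexOf(".");
  const ext = dot > 0 ? base.slice(dot).toLowerCase() : "";
  const suffix = index === 0 ? "" : `-${index}`;
  return `/tmp/screenshot${suffix}${ext}`;
}
```

When `needsSafeCopy` returns true, the file would be copied to the name returned by `safeTempName` before uploading; the `index` argument keeps several screenshots from overwriting each other.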

Tool Detection and Selection

Priority Order

1. Playwright MCP (MCP connection, `mcp__playwright__*`): connects to an existing browser, login state preserved
2. Chrome DevTools MCP (MCP connection, `mcp__chrome-devtools__*`): connects to an existing browser, login state preserved
3. agent-browser (CLI via Bash): fallback, login state preserved with `--profile`

MCP-based tools connect to an already-running browser instance, so GitHub login state is automatically preserved. agent-browser can persist login state using `--profile ~/.agent-browser-github`.

Detection

1. Search for MCP-based browser tools (preferred): `ToolSearch: "browser navigate upload"`
2. Fall back to agent-browser only if no MCP tools are found: `Bash: agent-browser --version`
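The priority order and the detection fallback can be expressed as a small selection function. This is a sketch under assumptions: `pickBrowserTool` and its return labels are hypothetical, and `toolNames` is assumed to be whatever list of tool names the tool search returned.

```javascript
// Hypothetical helper: picks a browser tool following the priority order
// Playwright MCP > Chrome DevTools MCP > agent-browser.
// `toolNames` is assumed to be the list of tool names visible to the agent.
function pickBrowserTool(toolNames, agentBrowserInstalled = false) {
  if (toolNames.some((name) => name.startsWith("mcp__playwright__"))) {
    return "playwright-mcp";
  }
  if (toolNames.some((name) => name.startsWith("mcp__chrome-devtools__"))) {
    return "chrome-devtools-mcp";
  }
  // Only fall back to the CLI when no MCP browser tool was found.
  return agentBrowserInstalled ? "agent-browser" : null;
}
```

A `null` result corresponds to the "No browser tools found" case, where nothing can be uploaded until a tool is installed or connected.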

Tool Compatibility Matrix

| Operation | Playwright MCP | Chrome DevTools MCP | agent-browser (CLI/Bash) |
| --- | --- | --- | --- |
| Navigate | `browser_navigate` | `navigate_page` | `agent-browser --headed open {url}` |
| Snapshot | `browser_snapshot` | `take_snapshot` | `agent-browser snapshot` |
| Screenshot | `browser_take_screenshot` | `take_screenshot` | `agent-browser screenshot {path}` |
| Click | `browser_click` (ref) | `click` (uid) | `agent-browser click {ref}` |
| File Upload | `browser_file_upload` (paths) | `upload_file` (uid, filePath) | `agent-browser upload {ref} {path}` |
| JS Eval | `browser_evaluate` (function) | `evaluate_script` (function) | `agent-browser eval '{js}'` |
| Login State | Preserved | Preserved | Preserved with `--profile` |

Steps

Step 1: Navigate to PR page and check login state

Navigate to the PR page and immediately take a snapshot to verify login state.

```javascript
// Playwright MCP
browser_navigate({ url: "https://github.com/{owner}/{repo}/pull/{number}" })

// Chrome DevTools MCP
navigate_page({ url: "https://github.com/{owner}/{repo}/pull/{number}", type: "url" })

// agent-browser (use --profile to persist login state)
agent-browser --headed --profile ~/.agent-browser-github open "https://github.com/{owner}/{repo}/pull/{number}"
```

If an SSO authentication screen appears: take a snapshot, locate the "Continue" button, and click it.

If NOT logged in (agent-browser only):

1. Navigate to `https://github.com/login`.
2. Ask the user to log in manually in the headed browser window.
3. Wait for user confirmation, then navigate back to the PR page.
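When the user supplies a full PR URL rather than a number, the `{owner}`, `{repo}`, and `{number}` placeholders can be filled by parsing it. The following is a minimal sketch: `parsePrUrl` is a hypothetical name, and it assumes `github.com` URLs only (not GitHub Enterprise hosts).

```javascript
// Hypothetical helper: splits a github.com PR URL into the parts used by the
// navigation URL template. Returns null when the input is not a PR URL.
function parsePrUrl(url) {
  const pattern = /^https:\/\/github\.com\/([^/\s]+)\/([^/\s]+)\/pull\/(\d+)(?:[/?#].*)?$/;
  const match = pattern.exec(url.trim());
  if (!match) return null;
  const [, owner, repo, number] = match;
  return {
    owner,
    repo,
    number: Number(number),
    // Canonical PR page URL, without trailing tabs such as /files.
    url: `https://github.com/${owner}/${repo}/pull/${number}`,
  };
}
```

Normalizing to the canonical PR page matters here because the comment form that the later steps rely on is on the main conversation tab.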

Step 2: Locate the file upload input

Take a snapshot or screenshot and scroll to the bottom to find the comment area.

GitHub renders a file upload input in the comment form. Try these selectors in order (GitHub's UI can change, so if one fails, try the next):

```javascript
// Shared JS for MCP-based tools: tries multiple known selectors
() => {
  const selectors = [
    'input[type="file"][id*="comment"]',
    'input[type="file"][id="fc-new_comment_field"]',
    '#new_comment_field',
    'input[type="file"]'
  ];
  for (const sel of selectors) {
    const el = document.querySelector(sel);
    if (el) return { found: true, id: el.id, selector: sel };
  }
  return { found: false };
}
```

For Chrome DevTools MCP, you can also take a snapshot to find the `uid` of the file upload element directly.

Step 3: Upload images one by one

Upload each image file using the detected tool. Wait 2–3 seconds between uploads to allow GitHub to process each file.

For multiple images, upload them all to the same comment textarea before extracting URLs; this is more efficient than navigating between uploads.

```javascript
// Chrome DevTools MCP: upload_file requires the uid of the input element
// Playwright MCP: browser_file_upload takes the element ref and file path(s) array
// agent-browser: agent-browser upload {ref} {absolute_path}
```

Important: Always use absolute file paths.
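The one-by-one upload with a pause between files might look like the loop below. It is a sketch, not the skill's implementation: `uploadOne` is a hypothetical callback standing in for whichever tool call performs a single upload, the 2500 ms default is simply the middle of the 2–3 second range, and the absolute-path check assumes POSIX-style paths.

```javascript
// Sketch only: `uploadOne` is a hypothetical callback that performs a single
// upload with whichever browser tool was selected.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function uploadSequentially(paths, uploadOne, delayMs = 2500) {
  const uploaded = [];
  for (const filePath of paths) {
    // Assumes POSIX-style paths, where absolute paths start with "/".
    if (!filePath.startsWith("/")) {
      throw new Error(`Expected an absolute path, got: ${filePath}`);
    }
    await uploadOne(filePath);
    uploaded.push(filePath);
    await sleep(delayMs); // give GitHub time to process the file
  }
  return uploaded;
}
```

Because the loop awaits each upload before starting the next, all files end up in the same textarea in the order given.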

Step 4: Retrieve uploaded image URLs

Wait 3–5 seconds after the last upload, then read the textarea value. GitHub injects markdown image syntax like `![description](https://github.com/user-attachments/assets/...)` into the textarea:

```javascript
// Shared JS: tries both known textarea IDs
() => {
  const ta = document.getElementById('new_comment_field')
          || document.querySelector('textarea[id*="comment"]');
  return ta ? ta.value : 'textarea not found';
}
```

```bash
# agent-browser
agent-browser eval 'document.getElementById("new_comment_field")?.value || document.querySelector("textarea[id*=comment]")?.value || "not found"'
```

The response contains URLs in the `user-attachments/assets/` format shown above. Extract all image URLs/markdown from the textarea value before clearing it.
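Pulling the URLs out of the textarea value can be done with a single regular expression. This is a sketch under assumptions: `extractAttachmentUrls` is a hypothetical name, and the character set used for the asset ID (`[\w-]+`) is a guess based on the URL shape shown above.

```javascript
// Hypothetical helper: returns every unique user-attachments URL found in
// the textarea text, in order of first appearance.
// The asset ID character set ([\w-]+) is an assumption.
function extractAttachmentUrls(textareaValue) {
  const pattern = /https:\/\/github\.com\/user-attachments\/assets\/[\w-]+/g;
  return [...new Set(textareaValue.match(pattern) || [])];
}
```

Matching the URL itself rather than the surrounding `![...](...)` wrapper means the same function also works if the textarea contains an `<img>` tag. An empty result most likely means the upload has not finished yet, in which case waiting and reading the textarea again is reasonable.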

Step 5: Clear the textarea (do not submit the comment)

```javascript
// MCP-based tools
() => {
  const ta = document.getElementById('new_comment_field')
           || document.querySelector('textarea[id*="comment"]');
  if (ta) { ta.value = ""; return "cleared"; }
  return "textarea not found";
}
```

```bash
# agent-browser
agent-browser eval 'const ta = document.getElementById("new_comment_field") || document.querySelector("textarea[id*=comment]"); if(ta){ta.value=""} "cleared"'
```

Step 6: Embed images in the PR

Option A: Update PR description (append images to existing body):

```bash
EXISTING_BODY=$(gh pr view {PR_NUMBER} --json body -q .body)

gh pr edit {PR_NUMBER} --body "$(printf '%s\n\n## Screenshots\n\n%s' "$EXISTING_BODY" "![screenshot](https://github.com/user-attachments/assets/...)")"
```

Option B: Post as a new comment:

```bash
gh pr comment {PR_NUMBER} --body "## Screenshots

![screenshot](https://github.com/user-attachments/assets/...)"
```

Use Option A by default unless the user explicitly asks for a comment, or if the PR description is already long and a comment would be cleaner.
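For several images, the new body for Option A can be assembled before calling `gh`. The helper below is a sketch mirroring the `printf` call above; `appendScreenshots` is a hypothetical name, and the handling of an empty description is an assumption.

```javascript
// Hypothetical helper mirroring the printf call in Option A: appends a
// "## Screenshots" section with one image per paragraph.
function appendScreenshots(existingBody, imageMarkdown, heading = "## Screenshots") {
  const images = imageMarkdown.join("\n\n");
  const body = existingBody.trimEnd();
  // Avoid leading blank lines when the PR has no description yet.
  return body ? `${body}\n\n${heading}\n\n${images}` : `${heading}\n\n${images}`;
}
```

The result can be written to a file and passed to `gh pr edit` with `--body-file`, which avoids shell-quoting issues with multi-line bodies.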

Step 7: Verify the result

Reload the page and take a screenshot to confirm the images are displayed correctly.

Tips

- Image sizing: Control display size via HTML `<img>` tags: `<img width="800" alt="description" src="..." />`
- Multiple images: Upload all images in one session to the same textarea; extract all URLs before clearing
- Prefer MCP tools: Always prefer Playwright or Chrome DevTools MCP over agent-browser for simpler setup
- agent-browser login persistence: Use `--profile ~/.agent-browser-github` to persist GitHub login across sessions
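The image sizing tip can be applied mechanically by rewriting the markdown that GitHub injected. This is a minimal sketch: `toSizedImg` is a hypothetical name, and it only handles the plain `![alt](url)` form.

```javascript
// Hypothetical helper: rewrites ![alt](url) as an <img> tag with a fixed width.
function toSizedImg(markdownImage, width = 800) {
  const match = /^!\[([^\]]*)\]\(([^)\s]+)\)$/.exec(markdownImage.trim());
  if (!match) return markdownImage; // leave anything unexpected untouched
  // Escape characters that would break the alt attribute.
  const alt = match[1].replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return `<img width="${width}" alt="${alt}" src="${match[2]}" />`;
}
```

Input that does not look like a single markdown image is returned unchanged, so the helper is safe to map over every line taken from the textarea.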

Troubleshooting

| Issue | Solution |
| --- | --- |
| Not logged in (MCP tools) | An SSO screen may appear: take a snapshot, find the "Continue" button, and click it |
| Not logged in (agent-browser) | Use `--headed` mode, navigate to the login page, and ask the user to log in manually |
| Browser window not visible | For agent-browser, ensure the `--headed` flag is used |
| File path with special characters (e.g., Unicode narrow spaces from CleanShot) | Copy the file to `/tmp/` with a simple name: `cp /path/CleanShot*keyword*.png /tmp/screenshot.png` |
| File upload fails | Ensure the file path is absolute |
| Textarea doesn't contain URLs yet | Wait 3–5 seconds after upload before running JS eval; retry once if needed |
| Textarea selector not found | GitHub's UI changes occasionally; use the multi-selector JS in Step 2 to find the current element |
| Chrome DevTools MCP disconnected | Reconnect via the `/mcp` command |
| agent-browser not found | `npm install -g agent-browser && agent-browser install` |
| No browser tools found | Use `ToolSearch` to search for available browser tools |
| PR not found / 404 | Private repos return 404 for unauthenticated users; check login state |

Notes

- GitHub `user-attachments/assets/` URLs are persistent: images remain accessible even without submitting the comment
- Editing the description directly in the browser UI is fragile due to GitHub UI structure changes; updating via `gh pr edit` is strongly preferred
- Multiple images can be uploaded in a single session before extracting URLs
- MCP-based tools connect to existing browser instances, preserving cookies and login sessions
