image-to-text
Image to Text
Extract all readable text from an image using OCR (Tesseract). Returns the full text content along with word-level bounding boxes and confidence scores.
When to Use
- Reading text content from a screenshot or design mockup
- Extracting UI copy (labels, buttons, headings) so you don't have to retype it
- Getting text positions and bounding boxes from a design image
How It Works
- The image is passed to Tesseract.js for optical character recognition
- Tesseract segments the image into lines and words
- Returns the full text plus word-level details (position, confidence)
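The relationship between the line and word segments can be illustrated with the sample values shown in the Output section. The following is a minimal sketch, not the script's own logic: it assumes each word's bounding box lies inside the bounding box of the line it belongs to, and the helper names `contains` and `group_words_by_line` are hypothetical.

```python
def contains(outer, inner):
    """Return True if bbox `inner` lies entirely within bbox `outer`."""
    return (
        outer["x0"] <= inner["x0"]
        and outer["y0"] <= inner["y0"]
        and outer["x1"] >= inner["x1"]
        and outer["y1"] >= inner["y1"]
    )


def group_words_by_line(result):
    """Pair each line's text with the words whose boxes fall inside it."""
    grouped = []
    for line in result["lines"]:
        members = [
            word["text"]
            for word in result["words"]
            if contains(line["bbox"], word["bbox"])
        ]
        grouped.append((line["text"], members))
    return grouped


# Values copied from the sample output in this document.
sample = {
    "words": [
        {"text": "Request", "confidence": 94.2,
         "bbox": {"x0": 142, "y0": 180, "x1": 268, "y1": 204}},
        {"text": "work", "confidence": 96.1,
         "bbox": {"x0": 274, "y0": 180, "x1": 332, "y1": 204}},
    ],
    "lines": [
        {"text": "Request work", "confidence": 95.1,
         "bbox": {"x0": 142, "y0": 180, "x1": 332, "y1": 204}},
    ],
}

print(group_words_by_line(sample))
```

Containment is only a heuristic here; skewed or overlapping text may need a looser test such as vertical overlap.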
Usage
bash
bash <skill-path>/scripts/image-to-text.sh <image-path> [language]

Arguments:
- image-path: Path to the image file (required)
- language: OCR language code (optional, defaults to eng). Common: eng, fra, deu, spa, chi_sim, jpn
Examples:
bash
# Extract text from a screenshot
bash <skill-path>/scripts/image-to-text.sh ./screenshot.png

# Extract French text
bash <skill-path>/scripts/image-to-text.sh ./mockup.png fra
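When the script is called from another program rather than a shell, a thin wrapper can build the same command line. This is a sketch under two assumptions that the document does not state explicitly: that the script prints its JSON result to stdout, and that it exits non-zero on failure. The function names `build_command` and `run_ocr` are hypothetical.

```python
import json
import subprocess


def build_command(script_path, image_path, language="eng"):
    """Build the argument list: bash <script> <image-path> [language]."""
    return ["bash", script_path, image_path, language]


def run_ocr(script_path, image_path, language="eng"):
    """Run the script and parse its output (assumed to be JSON on stdout)."""
    completed = subprocess.run(
        build_command(script_path, image_path, language),
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the image path contains spaces.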
Output
json
{
  "text": "Request work\nSuggestions\nPlumbing\nHVAC\nCleaning\nElectrical",
  "confidence": 87.4,
  "words": [
    {
      "text": "Request",
      "confidence": 94.2,
      "bbox": { "x0": 142, "y0": 180, "x1": 268, "y1": 204 }
    },
    {
      "text": "work",
      "confidence": 96.1,
      "bbox": { "x0": 274, "y0": 180, "x1": 332, "y1": 204 }
    }
  ],
  "lines": [
    {
      "text": "Request work",
      "confidence": 95.1,
      "bbox": { "x0": 142, "y0": 180, "x1": 332, "y1": 204 }
    }
  ]
}

| Field | Type | Description |
|---|---|---|
| text | String | Full extracted text, newline-separated |
| confidence | Number | Overall confidence score (0-100) |
| words | Array | Each word with text, confidence, and bounding box |
| lines | Array | Each line with text, confidence, and bounding box |
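The fields in the table can be loaded into typed structures for easier handling. The sketch below assumes that `x0`/`y0` mark the top-left corner and `x1`/`y1` the bottom-right corner of each box, in pixels; the class names `Box` and `Item` and the function `parse_result` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Box:
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self):
        return self.x1 - self.x0

    @property
    def height(self):
        return self.y1 - self.y0


@dataclass
class Item:
    """A word or a line: both share the same three fields."""
    text: str
    confidence: float
    bbox: Box


def parse_item(raw):
    return Item(raw["text"], raw["confidence"], Box(**raw["bbox"]))


def parse_result(raw_json):
    """Convert the script's JSON output into Item objects."""
    data = json.loads(raw_json)
    return {
        "text": data["text"],
        "confidence": data["confidence"],
        "words": [parse_item(word) for word in data["words"]],
        "lines": [parse_item(line) for line in data["lines"]],
    }


# Abbreviated from the sample output above.
RAW = """
{
  "text": "Request work",
  "confidence": 87.4,
  "words": [
    {"text": "Request", "confidence": 94.2,
     "bbox": {"x0": 142, "y0": 180, "x1": 268, "y1": 204}}
  ],
  "lines": [
    {"text": "Request work", "confidence": 95.1,
     "bbox": {"x0": 142, "y0": 180, "x1": 332, "y1": 204}}
  ]
}
"""

first_word = parse_result(RAW)["words"][0]
print(first_word.text, first_word.bbox.width, first_word.bbox.height)
```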
Present Results to User
After extracting text, present the content grouped by lines:
Extracted text (87.4% confidence):
Request work
Suggestions
Plumbing
HVAC
Cleaning
Electrical
Found 6 lines, 7 words.

Use the extracted text directly when implementing UI copy from a design.
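A summary in this shape could be produced from the parsed JSON roughly as follows. This is an illustrative sketch: `format_results` is a hypothetical helper, and it assumes the `words` and `lines` arrays in the real output are complete, so their lengths give the counts.

```python
def format_results(result):
    """Render the OCR result as a header, the text lines, and a count footer."""
    header = f"Extracted text ({result['confidence']}% confidence):"
    body = result["text"].splitlines()
    footer = f"Found {len(result['lines'])} lines, {len(result['words'])} words."
    return "\n".join([header, *body, footer])
```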
Troubleshooting
Low confidence / garbled text — Tesseract works best with clean, high-contrast text. Screenshots of rendered UI work well. Photos of text at angles or with noise may produce poor results.
Wrong language — Pass the correct language code as the second argument. Tesseract needs the right language model to recognize characters.
First run is slow — Tesseract downloads language data (~4MB for English) on the first run. Subsequent runs are faster.
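To spot garbled output early, the word-level confidence scores can be checked before the text is used. The sketch below is one possible approach; the helper name `low_confidence_words` and the default threshold of 60 are arbitrary assumptions, not values recommended by Tesseract.

```python
def low_confidence_words(result, threshold=60.0):
    """Return the text of every word whose confidence is below `threshold`."""
    return [
        word["text"]
        for word in result["words"]
        if word["confidence"] < threshold
    ]
```

If many words are flagged, a cleaner or higher-contrast image, or a different language code, may give better results.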