code-from-image

Code From Image

Overview

This skill provides guidance for extracting code or pseudocode from images and implementing it correctly. It covers OCR tool selection, handling ambiguous text extraction, and verification strategies to ensure accurate implementation.

Workflow

Step 1: Environment Preparation

Before attempting to read an image, check available tools and packages:

- Check what package managers are available (`pip`, `pip3`, `uv`, `conda`)
- Check what image processing tools are installed (`tesseract`, `pytesseract`, `PIL/Pillow`)
- Install missing dependencies before proceeding

This avoids wasted attempts with unavailable tools.
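
The checks above can be sketched in Python using only the standard library. The helper name `check_environment` and the two lists are illustrative assumptions, not part of any tool; the lists simply mirror the names mentioned above.

```python
import importlib.util
import shutil

# Assumed lists for illustration: the CLI tools and Python modules named above.
CLI_TOOLS = ["pip", "pip3", "uv", "conda", "tesseract"]
PY_MODULES = ["pytesseract", "PIL"]


def check_environment(cli_tools=CLI_TOOLS, py_modules=PY_MODULES):
    """Return {name: bool}: True if the CLI tool is on PATH or the module is importable."""
    status = {}
    for tool in cli_tools:
        status[tool] = shutil.which(tool) is not None
    for module in py_modules:
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, available in check_environment().items():
        print(f"{name:12} {'found' if available else 'MISSING'}")
```

Anything reported as missing would be installed before moving on to the image itself.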

Step 2: Image Analysis

Examine the image before OCR extraction:

- Use `file <image>` to verify the file type and ensure it's a valid image
- Open the image visually if possible to understand content structure
- Note the image quality, contrast, and text clarity
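
Where the `file` command is not available, a rough stand-in is to compare the file's leading bytes against well-known image signatures. This is a minimal sketch, not a replacement for `file`: the function names are hypothetical and only a handful of formats are covered.

```python
# Leading-byte signatures for a few common image formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
}


def sniff_image_type(header: bytes):
    """Return a format name if the header starts with a known signature, else None."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def sniff_file(path: str):
    """Read the first bytes of a file and guess its image format."""
    with open(path, "rb") as fh:
        return sniff_image_type(fh.read(16))
```

A `None` result suggests the file is not one of these formats (or is corrupt) and is worth inspecting before running OCR on it.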

Step 3: OCR Extraction with Multiple Attempts

OCR is inherently error-prone. To maximize accuracy:

1. First attempt: Use standard OCR (pytesseract with default settings)
2. If output is garbled: Apply image preprocessing:
   - Increase contrast
   - Convert to grayscale
   - Apply binarization (threshold)
   - Resize the image (2x or 3x upscaling can help)
3. Compare outputs: If multiple OCR attempts yield different results, cross-reference them

Example preprocessing with PIL:

```python
from PIL import Image, ImageEnhance, ImageFilter

img = Image.open("code.png")

# Convert to grayscale
img = img.convert("L")

# Increase contrast
enhancer = ImageEnhance.Contrast(img)
img = enhancer.enhance(2.0)

# Apply threshold for binarization
img = img.point(lambda x: 0 if x < 128 else 255, '1')
img.save("preprocessed.png")
```
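
To cross-reference several OCR attempts, one option is to run OCR on more than one version of the image and list only the lines where the outputs disagree. The sketch below assumes Pillow, `pytesseract`, and the tesseract binary are installed; the function names and the choice of a 2x grayscale variant are illustrative.

```python
import difflib


def ocr_variants(path: str) -> dict:
    """Run OCR on the raw image and on a 2x-upscaled grayscale copy.

    Imports are local so that differing_lines() below works without OCR installed.
    """
    from PIL import Image
    import pytesseract

    img = Image.open(path)
    gray = img.convert("L")
    upscaled = gray.resize((gray.width * 2, gray.height * 2))
    return {
        "default": pytesseract.image_to_string(img),
        "gray_2x": pytesseract.image_to_string(upscaled),
    }


def differing_lines(a: str, b: str) -> list:
    """Return (lines_from_a, lines_from_b) pairs where two OCR outputs disagree."""
    a_lines, b_lines = a.splitlines(), b.splitlines()
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            pairs.append(("\n".join(a_lines[i1:i2]), "\n".join(b_lines[j1:j2])))
    return pairs
```

Each returned pair marks a spot to check against the image by eye; identical outputs produce an empty list, which is reassuring but not proof that both attempts are right.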

Step 4: Interpreting OCR Output

OCR frequently produces character substitution errors. Document all interpretations explicitly:

Common OCR Misreadings:

- `0` (zero) vs `O` (letter O) vs `o` (lowercase o)
- `1` (one) vs `l` (lowercase L) vs `I` (uppercase i)
- `S` vs `5` vs `$`
- `G` vs `6`
- `B` vs `8`
- `:` vs `;`
- `sha256` may appear as `cha256` or `sha2S6`
- Variable names may have incorrect characters (e.g., `GALT` instead of `SALT`)
- Quote characters may be mangled (`6"` instead of `b"` for byte strings)
- Array slicing may be garbled (`h0[:10]` appearing as `hof:10]`)

Process for interpretation:

1. List each unclear portion of the OCR output
2. Document the most likely correct interpretation
3. Explain reasoning for each interpretation
4. Flag any interpretations with high uncertainty
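
One way to produce candidate interpretations is to fuzzy-match suspicious tokens against identifiers you expect to see in the snippet. The sketch below uses `difflib.get_close_matches` from the standard library; the vocabulary, the cutoff of 0.7, and the function name are assumptions chosen for illustration.

```python
import difflib

# Hypothetical vocabulary: identifiers you expect in the snippet being read.
KNOWN_TOKENS = ["sha256", "hashlib", "hexdigest", "encode", "SALT"]


def suggest_corrections(tokens, vocabulary=KNOWN_TOKENS, cutoff=0.7):
    """Map each unrecognized OCR token to its closest known identifier, if any."""
    suggestions = {}
    for token in tokens:
        if token in vocabulary:
            continue  # already a known identifier, nothing to suggest
        match = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        if match:
            suggestions[token] = match[0]
    return suggestions


if __name__ == "__main__":
    for seen, guess in suggest_corrections(["cha256", "sha2S6", "GALT"]).items():
        print(f"OCR shows {seen!r}; likely {guess!r} (confirm against the image)")
```

The output is a list of suggestions to document and justify, not an automatic fix: a close match on spelling can still be the wrong identifier.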

Step 5: Implementation

When implementing the extracted code:

- Preserve the algorithm structure: Follow the logic as written, don't optimize prematurely
- Handle encoding explicitly: For cryptographic operations, be explicit about string vs bytes encoding
- Add basic error handling: Include try/except for file operations and external calls
- Log intermediate values: Print or log intermediate results for debugging
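
As an illustration of these points, the sketch below hashes a salted string with explicit encoding, logs intermediate values, and wraps file access in error handling. The salted SHA-256 scheme is a made-up stand-in for whatever the image actually contains, and the function names are hypothetical.

```python
import hashlib
import logging

logging.basicConfig(level=logging.DEBUG, format="%(message)s")
log = logging.getLogger("code_from_image")


def salted_digest(text: str, salt: bytes) -> str:
    """Hash salt + text with SHA-256, encoding the str to bytes explicitly."""
    data = text.encode("utf-8")  # str -> bytes, stated rather than implied
    log.debug("salt=%r data=%r", salt, data)
    digest = hashlib.sha256(salt + data).hexdigest()
    log.debug("digest=%s", digest)
    return digest


def read_input(path: str) -> str:
    """Read a text file, turning low-level errors into an explicit message."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except OSError as exc:
        raise RuntimeError(f"cannot read {path}: {exc}") from exc
```

Logging the salt and the encoded data makes it easier to spot a misread character when the final digest does not match expectations.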

Step 6: Verification

Verify the implementation systematically:

- If a hint is provided (e.g., expected output prefix): Use it to validate, but don't rely on it exclusively
- Trace through the algorithm manually: Verify your understanding matches the implementation
- Test with known inputs: If possible, create test cases with predictable outputs
- Check edge cases: Empty inputs, special characters, boundary conditions

Warning: Using hints as the sole validation is brittle. A correct output prefix doesn't guarantee the algorithm is fully correct for all inputs.
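
A small harness can combine the hint check with known-input tests. The sketch below assumes the building block under test is SHA-256 and uses its standard test vectors for `b""` and `b"abc"`; the helper names are hypothetical.

```python
import hashlib


def matches_hint(output: str, expected_prefix: str) -> bool:
    """Check an output against a partial hint such as an expected prefix."""
    return output.startswith(expected_prefix)


def run_known_vectors(func, vectors) -> list:
    """Return the inputs for which func(input) differs from the expected output."""
    return [inp for inp, expected in vectors if func(inp) != expected]


# Standard SHA-256 test vectors for the empty message and b"abc".
SHA256_VECTORS = [
    (b"", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"),
    (b"abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"),
]


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    failures = run_known_vectors(sha256_hex, SHA256_VECTORS)
    print("known-vector failures:", failures)
    print("hint matches:", matches_hint(sha256_hex(b"abc"), "ba7816"))
```

An empty failure list together with a matching hint gives more assurance than the hint alone, since the vectors exercise the empty-input edge case as well.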

Common Pitfalls

OCR-Related

- Accepting first OCR output without verification: Always cross-check unclear characters
- Not documenting assumptions: When interpreting garbled text, explicitly state what you're assuming
- Skipping preprocessing: Image enhancement significantly improves OCR accuracy

Implementation-Related

- String vs bytes confusion: In Python, cryptographic functions often require bytes (`b"string"`), not strings
- Missing imports: Ensure all required modules are imported before running
- Silent failures: Add explicit error messages for file operations
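
The string-versus-bytes pitfall is easy to reproduce: `hashlib` raises `TypeError` when handed a `str`. The sketch below shows the failure and a small, hypothetical `to_bytes` helper that makes the conversion explicit.

```python
import hashlib


def to_bytes(value, encoding="utf-8") -> bytes:
    """Accept str or bytes and always return bytes; reject anything else loudly."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError(f"expected str or bytes, got {type(value).__name__}")


if __name__ == "__main__":
    try:
        hashlib.sha256("secret")  # a str is rejected
    except TypeError as exc:
        print(f"hashlib refused a str: {exc}")

    print(hashlib.sha256(to_bytes("secret")).hexdigest())
```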

Verification-Related

- Over-relying on partial hints: A matching prefix doesn't mean the full output is correct
- Not validating intermediate steps: Check values at each stage, not just the final output
- Assuming OCR was correct: If output doesn't match expectations, revisit OCR interpretation

Fallback Strategy

If the initial interpretation produces incorrect results:

1. Re-examine the original image, focusing on unclear characters
2. Try alternative OCR preprocessing techniques
3. List all ambiguous characters and test alternative interpretations systematically
4. If multiple interpretations exist, implement and test each one
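
Testing alternative interpretations systematically can be sketched as a small search: expand only the flagged character positions into their confusable alternatives and run a check on each candidate. The confusable groups below are a subset of the common misreadings this guide lists; the function names and the `check` callback are assumptions for illustration.

```python
import itertools

# Groups of characters that OCR commonly confuses with one another.
CONFUSABLE = ["0Oo", "1lI", "S5$", "G6", "B8", ":;"]


def alternatives(ch: str) -> list:
    """Return ch followed by the other members of its confusable group, if any."""
    for group in CONFUSABLE:
        if ch in group:
            return [ch] + [c for c in group if c != ch]
    return [ch]


def candidates(text: str, uncertain: list) -> list:
    """Expand only the flagged character positions into every alternative reading."""
    options = [alternatives(ch) if i in uncertain else [ch] for i, ch in enumerate(text)]
    return ["".join(combo) for combo in itertools.product(*options)]


def first_passing(text: str, uncertain: list, check):
    """Return the first candidate accepted by check(candidate), or None."""
    for candidate in candidates(text, uncertain):
        if check(candidate):
            return candidate
    return None
```

Restricting the expansion to flagged positions keeps the number of candidates manageable; in practice `check` would run the implementation with each reading and compare the result against a hint or known output.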

Example Workflow

For a task like "Extract pseudocode from image and compute hash":

1. Check environment: `which tesseract`, `pip3 list | grep -i pil`
2. Install if needed: `pip3 install pillow pytesseract`
3. Analyze image: `file code.png`
4. Extract text with OCR
5. If garbled, preprocess image and retry OCR
6. Document interpretations: "OCR shows `GALT = 6"0000...` - interpreting as `SALT = b"0000..."` because G/S confusion is common and `6"` likely represents `b"` for bytes"
7. Implement the algorithm
8. Verify output against any provided hints
9. If verification fails, revisit steps 5-6 with alternative interpretations
