Loading...
Loading...
OCR skill using PaddleOCR model via SiliconFlow API. This skill should be used when the user asks to "recognize text from an image", "extract text from a photo", "OCR this image", "read text from screenshot", or mentions "PaddleOCR", "image text recognition", "text extraction from images".
npx skill4agent add aotenjou/silicon-paddleocr silicon-paddle-ocrSILICONFLOW_API_KEYexport SILICONFLOW_API_KEY="your_api_key"python3 scripts/ocr_skill.py [options] image_path| Argument | Description |
|---|---|
| Image file path(s) or glob pattern (required) |
| API key (default: from SILICONFLOW_API_KEY env) |
| OCR model name (default: PaddlePaddle/PaddleOCR-VL-1.5) |
| Recognition prompt for custom behavior |
| Output results in JSON format |
| Save results to specified file |
| Maximum tokens in response (default: 2000) |
python3 scripts/ocr_skill.py /path/to/image.jpgpython3 scripts/ocr_skill.py /path/to/images/*.pngpython3 scripts/ocr_skill.py --json /path/to/image.jpgpython3 scripts/ocr_skill.py -p "Please identify and format table content as Markdown" /path/to/table.jpgpython3 scripts/ocr_skill.py --json --output results.json /path/to/images/*.jpg--- image.jpg ---
识别到的文字内容
识别到 X 处文字区域{
"image.jpg": {
"image_path": "/path/to/image.jpg",
"image_size": [width, height],
"texts": [
{
"text": "识别的文字",
"box": [[x1, y1], [x2, y2], [x3, y3], [x4, y4]]
}
],
"full_text": "所有文本的组合"
},
"image2.png": { ... }
}boxreferences/api-configuration.mdexamples/sample-usage.shscripts/ocr_skill.py