Use when OCR-specialized extraction is needed with Alibaba Cloud Model Studio Qwen OCR models (`qwen-vl-ocr`, `qwen-vl-ocr-latest`, and snapshots), including document parsing, table parsing, multilingual OCR, formula recognition, and key information extraction.
npx skill4agent add cinience/alicloud-skills aliyun-qwen-ocr

mkdir -p output/aliyun-qwen-ocr
python -m py_compile skills/ai/multimodal/aliyun-qwen-ocr/scripts/prepare_ocr_request.py && echo "py_compile_ok" > output/aliyun-qwen-ocr/validate.txt

`output/aliyun-qwen-ocr/validate.txt`, `output/aliyun-qwen-ocr/`

`qwen-vl-ocr`, `qwen-vl-ocr-latest`, `qwen-vl-ocr-2025-11-20`, `qwen-vl-ocr-2025-08-28`, `qwen-vl-ocr-2025-04-13`, `qwen-vl-ocr-2024-10-28`

`qwen-vl-ocr`, `qwen-vl-ocr-latest`, `qwen-vl-ocr-2025-11-20`

python3 -m venv .venv
. .venv/bin/activate
python -m pip install requests

`DASHSCOPE_API_KEY`, `dashscope_api_key`, `~/.alibabacloud/credentials`

`image`, `data:`, `model`, `qwen-vl-ocr`, `prompt`, `task`, `task_config`, `enable_rotate`, `false`, `min_pixels`, `max_pixels`, `max_tokens`, `temperature`

`text`, `model`, `usage`

`task`: `text_recognition`, `key_information_extraction`, `document_parsing`, `table_parsing`, `formula_recognition`, `multi_lan`, `advanced_recognition`

python skills/ai/multimodal/aliyun-qwen-ocr/scripts/prepare_ocr_request.py \
--image "https://example.com/invoice.png" \
--prompt "Extract seller name, invoice date, amount, and tax number in JSON."

python skills/ai/multimodal/aliyun-qwen-ocr/scripts/prepare_ocr_request.py \
--image "https://example.com/table.png" \
--task table_parsing \
--model qwen-vl-ocr-2025-11-20

`qwen-vl-ocr`, `4096`, `qwen-vl-ocr-2025-11-20`, `max_pixels`

`output/aliyun-qwen-ocr/request.json`, `OUTPUT_DIR`

`references/api_reference.md`, `references/sources.md`
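
The field names, task names, and `request.json` path listed above suggest a small JSON request file. The following is a minimal sketch of how such a request might be assembled and saved. `build_request` and `save_request` are hypothetical helpers written for illustration, not the contents of `prepare_ocr_request.py`, and the flat dictionary layout is an assumption rather than the documented request format.

```python
import json
from pathlib import Path

# Task names exactly as they appear in this document.
TASKS = {
    "text_recognition",
    "key_information_extraction",
    "document_parsing",
    "table_parsing",
    "formula_recognition",
    "multi_lan",
    "advanced_recognition",
}

# Optional field names exactly as they appear in this document.
OPTIONAL_FIELDS = {
    "task_config",
    "enable_rotate",
    "min_pixels",
    "max_pixels",
    "max_tokens",
    "temperature",
}


def build_request(image, model="qwen-vl-ocr", prompt=None, task=None, **options):
    """Assemble a request dict (hypothetical helper; flat layout is assumed)."""
    if task is not None and task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")

    request = {"image": image, "model": model}
    if prompt is not None:
        request["prompt"] = prompt
    if task is not None:
        request["task"] = task
    request.update(options)
    return request


def save_request(request, output_dir="output/aliyun-qwen-ocr"):
    """Write the request to <output_dir>/request.json and return the path."""
    path = Path(output_dir) / "request.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(request, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


if __name__ == "__main__":
    req = build_request(
        "https://example.com/table.png",
        model="qwen-vl-ocr-2025-11-20",
        task="table_parsing",
    )
    print(save_request(req))
```

Rejecting unknown task and option names up front keeps typos out of the saved file. The actual script may validate differently or not at all.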