Use the official MinerU (mineru.net) parsing API to convert a URL (an HTML page such as a WeChat article, or a direct PDF/Office/image link) into clean Markdown plus structured outputs. Use it when web_fetch or the browser cannot access a page or extracts messy content, and you want higher-fidelity parsing (layout, tables, formulas, OCR).
npx skill4agent add blessonism/openclaw-search-skills mineru-extract

- scripts/mineru_parse_documents.py: takes --file-sources and returns JSON of the form { ok, items, errors }
- scripts/mineru_extract.py
- MINERU_TOKEN: set in .env
- Direct file links (.pdf/.doc/.ppt/.png/.jpg): pipeline
- HTML pages: MinerU-HTML

# In the mineru-extract skill directory: .env
MINERU_TOKEN=your_token_here
MINERU_API_BASE=https://mineru.net

python3 mineru-extract/scripts/mineru_parse_documents.py \
--file-sources "<URL1>\n<URL2>" \
--language ch \
--enable-ocr \
--model-version MinerU-HTML

python3 mineru-extract/scripts/mineru_parse_documents.py \
--file-sources "<URL>" \
--model-version MinerU-HTML \
--emit-markdown --max-chars 20000

python3 mineru-extract/scripts/mineru_extract.py "<URL>" --model MinerU-HTML --print > /tmp/out.md

Output: ~/.openclaw/workspace/mineru/<task_id>/result.zip
Result fields: task_id, full_zip_url, out_dir, markdown_path

Options:
--model pipeline | vlm | MinerU-HTML (MinerU-HTML for HTML pages)
--ocr/--no-ocr (pipeline, vlm)
--table/--no-table
--formula/--no-formula
--language ch|en|...
--page-ranges "2,4-6"
--timeout 600
--poll-interval 2

Errors are reported in err_msg.
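The batch script mineru_parse_documents.py can also be driven from Python rather than from the shell. The following is a minimal sketch, not the skill's own code. It assumes the script prints a single JSON object of the form { ok, items, errors } to stdout and that it accepts URLs joined by a literal backslash-n, as the quoted shell example suggests. The helper names build_command, parse_envelope, and parse_documents are hypothetical.

```python
import json
import subprocess
from typing import Any, Optional

# Script path as used in the shell commands, relative to the skills directory.
SCRIPT = "mineru-extract/scripts/mineru_parse_documents.py"


def build_command(
    urls: list[str],
    model_version: str = "MinerU-HTML",
    language: Optional[str] = None,
    enable_ocr: bool = False,
    emit_markdown: bool = False,
    max_chars: Optional[int] = None,
) -> list[str]:
    """Build the argv list for the batch script (hypothetical helper)."""
    # Assumption: the shell example passes "<URL1>\n<URL2>" in double quotes,
    # which is a literal backslash followed by "n"; this sketch mirrors that.
    cmd = ["python3", SCRIPT, "--file-sources", "\\n".join(urls)]
    if language:
        cmd += ["--language", language]
    if enable_ocr:
        cmd.append("--enable-ocr")
    cmd += ["--model-version", model_version]
    if emit_markdown:
        cmd.append("--emit-markdown")
    if max_chars is not None:
        cmd += ["--max-chars", str(max_chars)]
    return cmd


def parse_envelope(stdout: str) -> tuple[bool, list[Any], list[Any]]:
    """Split the assumed { ok, items, errors } object into its three parts."""
    data = json.loads(stdout)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object with ok/items/errors")
    ok = bool(data.get("ok"))
    items = list(data.get("items") or [])
    errors = list(data.get("errors") or [])
    return ok, items, errors


def parse_documents(urls: list[str], **options: Any) -> tuple[bool, list[Any], list[Any]]:
    """Run the script and return (ok, items, errors); needs MINERU_TOKEN in .env."""
    proc = subprocess.run(
        build_command(urls, **options),
        capture_output=True,
        text=True,
        check=False,  # failures are expected to surface through ok/errors
    )
    return parse_envelope(proc.stdout)
```

Calling parse_documents(["<URL>"], emit_markdown=True, max_chars=20000) corresponds to the second shell command. Passing the arguments as a list rather than a shell string avoids quoting problems with URLs that contain & or ?.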