Found 7 Skills
Minimal image-understanding smoke test for Model Studio Qwen VL.
Image understanding via MiniMax MCP, intended for the OpenClaw platform. If you are a Claude Code user, ignore this skill.
Process multimodal inputs (images, video, audio, PDFs) with Gemini 3 Pro. Covers image understanding, video analysis, audio processing, document extraction, media resolution control, OCR, and token optimization. Use when analyzing images, processing video, transcribing audio, extracting PDF content, or working with multimodal data.
[QwenCloud] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos, OCR text extraction, chart/table reading, visual reasoning, multi-image comparison, screenshot understanding, video comprehension, or explicitly invokes this skill by name (e.g. use qwencloud-vision). DO NOT TRIGGER when: user wants to generate/create images (use qwencloud-image-generation), generate videos (use qwencloud-video-generation), text-only tasks without visual input, or non-Qwen vision tasks.
Understand images with Alibaba Cloud Model Studio Qwen VL models (qwen3-vl-plus/qwen3-vl-flash and latest aliases). Use when building image Q&A, visual analysis, OCR-like extraction, chart/table reading, or screenshot understanding workflows.
Use MiniMax MCP for image understanding and analysis. Trigger conditions: (1) the user asks to analyze, understand, or describe an image; (2) objects, text, or scenes in an image need to be identified; (3) MiniMax's understand_image feature is required.
MiniMax Coding Plan MCP - Web search and image understanding tools for developers