Found 32 Skills
Use when visual reasoning is needed with Alibaba Cloud Model Studio QVQ models, including step-by-step image reasoning, chart analysis, and visually grounded problem solving.
Audit, plan, and safely optimize Shopify image alt text for product media, collection featured images, article featured images, and article inline images. Use when a merchant wants an AI agent to scan Shopify images, test whether the active AI model can inspect images, generate concise alt text with multimodal image understanding when available or context-only fallback when it is not, review the proposed changes in batches, and apply approved Shopify Admin updates.
Use when implementing ANY computer vision feature - image analysis, object detection, pose detection, person segmentation, subject lifting, hand/body pose tracking.
Vision and multimodal capabilities for Claude including image analysis, PDF processing, and document understanding. Activate for image input, base64 encoding, multiple images, and visual analysis.
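Multimodal product image similarity analysis and grouping. Trigger this skill when the user mentions product image similarity, visual grouping, finding similar-looking products, image-based deduplication, detecting competitors' identical products, clustering identical products, grouping by appearance, image similarity, product image comparison, visual clustering, same-style recognition, appearance deduplication, or image grouping. Even if the user does not explicitly say "image similarity", trigger this skill whenever their intent involves comparing main product images, visual clustering, identifying visually identical or similar products, or post-processing a product list by visual features such as appearance, color, or composition.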
Perform comprehensive forensic analysis of disk images using Autopsy to recover files, examine artifacts, and build investigation timelines.
Systematic visual geolocation reasoning from images. [WHAT] Analyzes photos, street views, or satellite imagery to determine location. Uses visual clue extraction, hypothesis formation, and web verification. [WHEN] Use when: geolocate, identify location, where is this, find this place, geographic analysis, location from image, OSINT geolocation [EXPERTISE] Visual analysis, geographic indicators, verification strategies
[QianWen] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos, OCR text extraction, chart/table reading, visual reasoning, multi-image comparison, screenshot understanding, video comprehension, or explicitly invokes this skill by name (e.g. use qianwen-vision). DO NOT TRIGGER when: user wants to generate/create images (use qianwen-image-generation), generate videos (use qianwen-video-generation), text-only tasks without visual input, or non-Qwen vision tasks.
Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.
Analyze text and images for harmful content using Azure AI Content Safety (@azure-rest/ai-content-safety). Use when moderating user-generated content, detecting hate speech, violence, sexual conten...
Analyze images using OpenAI's Vision API. Run the vision script with bash, e.g. 'bash <base_dir>/scripts/vision.sh <image> <question>'. Can understand image content, objects, text, and colors, and answer questions about images.
Image analysis and manipulation. Use when: user wants to analyze images, extract metadata, convert formats, resize, or get image information.