linkfox-multimodal-recognize-image
Image Recognition
This skill guides you on how to use the multimodal image recognition API to analyze images from URLs and extract meaningful information based on user intent.
Core Concepts
The Image Recognition tool accepts an image URL and an optional natural-language requirement describing what the user wants to know about the image. The backend uses a multimodal AI model to interpret the visual content and return a textual description or analysis.
Supported formats: JPG, JPEG, PNG, GIF, WebP, BMP.
How it works: You provide a publicly accessible image URL and a requirement (what you want to learn from the image). The service downloads the image, runs multimodal analysis, and returns a text-based result.
Parameter Guide
| Parameter | Required | Description |
|---|---|---|
| imageUrl | Yes | A publicly accessible URL pointing to the image. Must be JPG, JPEG, PNG, GIF, WebP, or BMP. Maximum 1000 characters. |
| requirement | No | A natural-language description of what to identify or analyze in the image. Defaults to "Describe the content of this image" when omitted. Maximum 1000 characters. |
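The constraints in the table can be checked before a request is sent. The following is a minimal sketch, not the service's own validation: `build_params` is a hypothetical helper, and judging the format from the URL path extension is an assumption (the service may instead inspect the downloaded file).

```python
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp")
MAX_LENGTH = 1000  # limit stated in the parameter table for both fields


def build_params(image_url, requirement=None):
    """Hypothetical helper: validate inputs and return the parameter dict."""
    if not image_url:
        raise ValueError("imageUrl is required")
    if len(image_url) > MAX_LENGTH:
        raise ValueError("imageUrl exceeds 1000 characters")

    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("imageUrl must be an http(s) URL")

    # Assumption: the image format is inferred from the path extension,
    # ignoring any query string.
    if not parsed.path.lower().endswith(SUPPORTED_EXTENSIONS):
        raise ValueError("unsupported image format")

    params = {"imageUrl": image_url}
    # requirement is optional; omit it so the service applies its default.
    if requirement is not None and requirement.strip():
        if len(requirement) > MAX_LENGTH:
            raise ValueError("requirement exceeds 1000 characters")
        params["requirement"] = requirement
    return params
```

Leaving `requirement` out of the dict, rather than filling in the default text on the client side, keeps the default under the service's control.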
Tips for Writing the requirement Parameter
- Be specific: Instead of "analyze this image", say "List all products visible on the shelf and estimate their category."
- State the goal: If you need text extraction, say "Extract all visible text from the image." If you need object identification, say "Identify the main objects and their colors."
- Provide context when helpful: For product images, mention "This is an e-commerce product listing image" so the model can tailor its analysis.
Local Image Upload
This tool requires a publicly accessible image URL. If the user provides a local image file path (e.g., `C:\Users\...\photo.png`, `/home/.../image.jpg`), you must upload it first to obtain a public URL.
Run the upload script:
```bash
python scripts/upload_image.py /path/to/local/image.png
```
The script will return a public URL (valid for 24 hours) that can be used as the `imageUrl` parameter.
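Deciding whether an input needs the upload step can be sketched as below. Both function names are hypothetical, and the sketch assumes that anything without an `http` or `https` scheme is a local path. How the script prints the resulting URL is not specified here, so parsing its output is left out.

```python
from urllib.parse import urlparse


def needs_upload(image_ref):
    """Return True when image_ref is not an http(s) URL.

    Assumption: any other input (Windows or POSIX path) is a local file.
    """
    scheme = urlparse(image_ref.strip()).scheme.lower()
    return scheme not in ("http", "https")


def upload_command(local_path):
    """Build the argument list for the upload script shown above."""
    return ["python", "scripts/upload_image.py", local_path]
```

The list returned by `upload_command` can be passed to `subprocess.run` without a shell, which avoids quoting problems with spaces or backslashes in the path.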
Usage Examples
1. General Image Description
- User says: "What is in this picture?"
- Set `imageUrl` to the provided URL, leave `requirement` as default.
2. Product Image Analysis
- User says: "Analyze this Amazon product image and list the key selling points shown."
- Set `requirement` to: "This is an Amazon product listing image. Identify the product, key features, and selling points visible in the image."
3. Text Extraction from an Image
- User says: "Read the text in this screenshot."
- Set `requirement` to: "Extract all visible text from this image, preserving layout where possible."
4. A+ Page Image Review
- User says: "Describe what this A+ content image communicates."
- Set `requirement` to: "This is an Amazon A+ product description image. Describe the visual content, key messaging, and branding elements."
5. Comparison / Detail Inspection
- User says: "What differences can you spot between the product and its packaging?"
- Set `requirement` to: "Identify and describe any differences between the product and its packaging shown in the image."
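The five examples above can be kept as reusable templates. This is an illustrative sketch only: the scenario keys (`general`, `product`, `text`, `aplus`, `comparison`) and the `params_for` helper are hypothetical names, while the requirement strings are copied from the examples.

```python
# Scenario keys are hypothetical labels; the strings come from the examples.
REQUIREMENT_TEMPLATES = {
    "general": None,  # leave requirement unset so the service default applies
    "product": (
        "This is an Amazon product listing image. Identify the product, "
        "key features, and selling points visible in the image."
    ),
    "text": (
        "Extract all visible text from this image, "
        "preserving layout where possible."
    ),
    "aplus": (
        "This is an Amazon A+ product description image. Describe the "
        "visual content, key messaging, and branding elements."
    ),
    "comparison": (
        "Identify and describe any differences between the product "
        "and its packaging shown in the image."
    ),
}


def params_for(scenario, image_url):
    """Return request parameters for one of the scenarios above."""
    requirement = REQUIREMENT_TEMPLATES[scenario]
    params = {"imageUrl": image_url}
    if requirement is not None:
        params["requirement"] = requirement
    return params
```

For example, `params_for("general", url)` returns only `imageUrl`, matching example 1, where `requirement` is left as default.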
API Usage
This tool calls the LinkFox tool gateway API. See `references/api.md` for calling conventions, request parameters, and response structure. You can also execute `scripts/multimodal_recognize_image.py` directly to run queries.
Display Rules
- Show the analysis result clearly: Present the returned text analysis in a readable format. Use bullet points or paragraphs as appropriate for the content.
- No fabrication: Only relay information that the API actually returned. Do not add visual details that were not in the response.
- Format support: If the image URL is invalid or the format is unsupported, explain the limitation and list the supported formats (JPG, JPEG, PNG, GIF, WebP, BMP).
- Error handling: When the API returns an error status, explain the issue based on the response and suggest corrective actions (e.g., check that the URL is publicly accessible, verify the image format).
- Token usage: If the user asks about cost, you may mention the `costToken` value from the response.
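The display rules above could be applied along the following lines. This is a sketch under stated assumptions: only `costToken` is named in this document, while the `success`, `text`, and `message` fields and the `render_result` helper are hypothetical. The actual response structure is defined in `references/api.md`.

```python
SUPPORTED_FORMATS = "JPG, JPEG, PNG, GIF, WebP, BMP"


def render_result(response, include_cost=False):
    """Turn a response dict into user-facing text.

    Hypothetical response shape (only costToken is confirmed):
    {"success": bool, "text": str, "message": str, "costToken": int}
    """
    if not response.get("success"):
        # Error handling: relay the API's own reason, then suggest fixes.
        reason = response.get("message") or "The API returned an error."
        return (
            f"{reason}\n"
            "Check that the URL is publicly accessible and that the image "
            f"is in a supported format ({SUPPORTED_FORMATS})."
        )

    # No fabrication: relay only the text the API returned.
    lines = [response.get("text", "")]
    # Token usage: mention cost only when the user asked about it.
    if include_cost and "costToken" in response:
        lines.append(f"costToken: {response['costToken']}")
    return "\n".join(lines)
```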
User Expression & Scenario Quick Reference
Applicable -- Image analysis tasks:
| User Says | Scenario |
|---|---|
| "What's in this image/picture/photo" | General image description |
| "Analyze this product image" | Product visual analysis |
| "Read the text in this image" | OCR / text extraction |
| "Describe the A+ page images" | E-commerce content review |
| "What does this screenshot show" | Screenshot interpretation |
| "Identify objects in this photo" | Object detection / listing |
Not applicable -- Needs beyond image recognition:
- Generating or editing images
- Video analysis
- Analyzing images directly from local file paths (only URLs are supported; upload the file first, as described under Local Image Upload)
- Image search or reverse image lookup
Feedback:
Auto-detect and report feedback via the Feedback API when any of the following apply:
- The functionality or purpose described in this skill does not match actual behavior
- The skill's results do not match the user's intent
- The user expresses dissatisfaction or praise about this skill
- Anything you believe could be improved
Call the feedback API as specified in `references/api.md`. Do not interrupt the user's flow.
For more high-quality, professional cross-border e-commerce skills, visit LinkFox Skills.