linkfox-multimodal-recognize-image

Image Recognition

This skill explains how to use the multimodal image recognition API to analyze an image at a given URL and extract the information the user is asking for.

Core Concepts

The Image Recognition tool accepts an image URL and an optional natural-language requirement describing what the user wants to know about the image. The backend uses a multimodal AI model to interpret the visual content and return a textual description or analysis.
Supported formats: JPG, JPEG, PNG, GIF, WebP, BMP.
How it works: You provide a publicly accessible image URL and a requirement (what you want to learn from the image). The service downloads the image, runs multimodal analysis, and returns a text-based result.
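As a rough illustration of the supported-format list, the following Python sketch checks a URL's file extension before the service is called. The helper name has_supported_extension is hypothetical, and looking at the extension is only a heuristic assumed here: this skill does not say how the service detects the format, and a valid image URL may have no extension at all.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats listed as supported in this skill.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}


def has_supported_extension(image_url: str) -> bool:
    """Return True if the URL path ends in one of the supported extensions.

    Heuristic only: the query string is ignored and the comparison is
    case-insensitive. A False result does not prove the image is unsupported.
    """
    path = urlparse(image_url).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this is best used to warn early rather than to block a request, since the service itself has the final say on whether an image is readable.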

Parameter Guide

| Parameter | Required | Description |
| --- | --- | --- |
| imageUrl | Yes | A publicly accessible URL pointing to the image. Must be JPG, JPEG, PNG, GIF, WebP, or BMP. Maximum 1000 characters. |
| requirement | No | A natural-language description of what to identify or analyze in the image. Defaults to "Describe the content of this image" when omitted. Maximum 1000 characters. |
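The parameter rules above can be sketched as a small Python helper. The function name build_params is hypothetical, and the sketch only assembles the two documented parameters. The actual request envelope (endpoint, headers, authentication) is defined in references/api.md and is not reproduced here.

```python
from typing import Optional

MAX_LENGTH = 1000  # limit stated for both imageUrl and requirement


def build_params(image_url: str, requirement: Optional[str] = None) -> dict:
    """Assemble the documented parameters and enforce the length limits."""
    if not image_url:
        raise ValueError("imageUrl is required")
    if len(image_url) > MAX_LENGTH:
        raise ValueError("imageUrl exceeds 1000 characters")

    params = {"imageUrl": image_url}
    # requirement is optional: when it is left out, the service falls back
    # to "Describe the content of this image".
    if requirement:
        if len(requirement) > MAX_LENGTH:
            raise ValueError("requirement exceeds 1000 characters")
        params["requirement"] = requirement
    return params
```

Leaving requirement out of the dictionary, rather than sending an empty string, keeps the documented default in the service's hands.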

Tips for Writing the requirement Parameter

  1. Be specific: Instead of "analyze this image", say "List all products visible on the shelf and estimate their category."
  2. State the goal: If you need text extraction, say "Extract all visible text from the image." If you need object identification, say "Identify the main objects and their colors."
  3. Provide context when helpful: For product images, mention "This is an e-commerce product listing image" so the model can tailor its analysis.

Local Image Upload

This tool requires a publicly accessible image URL. If the user provides a local image file path (e.g., C:\Users\...\photo.png or /home/.../image.jpg), you must upload it first to obtain a public URL.
Run the upload script from a shell:
    python scripts/upload_image.py /path/to/local/image.png
The script returns a public URL (valid for 24 hours) that can be used as the imageUrl parameter.
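The decision described above (use a URL as-is, upload a local path first) might look like the following Python sketch. The function names are hypothetical. The sketch also assumes that scripts/upload_image.py prints the public URL as the last line of its standard output, which this skill does not state, so adjust the parsing to the script's real output.

```python
import subprocess
import sys
from urllib.parse import urlparse


def is_remote_url(value: str) -> bool:
    # Only http(s) counts as a URL. A Windows path such as C:\Users\...
    # parses with scheme "c", so it is correctly treated as local.
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def upload_command(local_path: str) -> list:
    """Build the upload command, run with the current Python interpreter."""
    return [sys.executable, "scripts/upload_image.py", local_path]


def resolve_image_url(value: str) -> str:
    """Return a URL unchanged; upload a local file and return its public URL."""
    if is_remote_url(value):
        return value
    result = subprocess.run(
        upload_command(value), capture_output=True, text=True, check=True
    )
    # Assumption: the public URL is the last non-empty line of stdout.
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("upload script produced no output")
    return lines[-1].strip()
```

Because the returned URL is valid for 24 hours, it should not be cached beyond the current task.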

Usage Examples

1. General Image Description
  • User says: "What is in this picture?"
  • Set imageUrl to the provided URL, leave requirement as default.
2. Product Image Analysis
  • User says: "Analyze this Amazon product image and list the key selling points shown."
  • Set requirement to: "This is an Amazon product listing image. Identify the product, key features, and selling points visible in the image."
3. Text Extraction from an Image
  • User says: "Read the text in this screenshot."
  • Set requirement to: "Extract all visible text from this image, preserving layout where possible."
4. A+ Page Image Review
  • User says: "Describe what this A+ content image communicates."
  • Set requirement to: "This is an Amazon A+ product description image. Describe the visual content, key messaging, and branding elements."
5. Comparison / Detail Inspection
  • User says: "What differences can you spot between the product and its packaging?"
  • Set requirement to: "Identify and describe any differences between the product and its packaging shown in the image."
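The five examples above can be captured as a lookup table, sketched here in Python. The requirement strings are copied from the examples. The scenario keys and the function name params_for are hypothetical labels chosen for this illustration.

```python
# Requirement text per scenario, taken from the usage examples.
# None means: leave requirement unset so the service default applies.
REQUIREMENT_PRESETS = {
    "general": None,
    "product": (
        "This is an Amazon product listing image. Identify the product, "
        "key features, and selling points visible in the image."
    ),
    "text": (
        "Extract all visible text from this image, preserving layout "
        "where possible."
    ),
    "a_plus": (
        "This is an Amazon A+ product description image. Describe the "
        "visual content, key messaging, and branding elements."
    ),
    "comparison": (
        "Identify and describe any differences between the product and "
        "its packaging shown in the image."
    ),
}


def params_for(scenario: str, image_url: str) -> dict:
    """Build the parameters for one of the preset scenarios."""
    requirement = REQUIREMENT_PRESETS[scenario]  # KeyError for unknown keys
    params = {"imageUrl": image_url}
    if requirement is not None:
        params["requirement"] = requirement
    return params
```

Presets like these are a starting point. When the user's wording carries extra detail, fold it into the requirement rather than discarding it.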

API Usage

This tool calls the LinkFox tool gateway API. See references/api.md for calling conventions, request parameters, and response structure. You can also execute scripts/multimodal_recognize_image.py directly to run queries.

Display Rules

  1. Show the analysis result clearly: Present the returned text analysis in a readable format. Use bullet points or paragraphs as appropriate for the content.
  2. No fabrication: Only relay information that the API actually returned. Do not add visual details that were not in the response.
  3. Format support: If the image URL is invalid or the format is unsupported, explain the limitation and list the supported formats (JPG, JPEG, PNG, GIF, WebP, BMP).
  4. Error handling: When the API returns an error status, explain the issue based on the response and suggest corrective actions (e.g., check that the URL is publicly accessible, verify the image format).
  5. Token usage: If the user asks about cost, you may mention the costToken value from the response.
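The display rules above could be applied with a helper along these lines. This is a minimal Python sketch under a stated assumption: the response is treated as a dictionary with the keys success, text, and message, which are hypothetical placeholders. Only costToken is named in this skill, and the real response structure is defined in references/api.md.

```python
from typing import Any, Dict

SUPPORTED_FORMATS = "JPG, JPEG, PNG, GIF, WebP, BMP"


def render_reply(response: Dict[str, Any], user_asked_cost: bool = False) -> str:
    """Turn a response dict into user-facing text without adding details.

    Assumed (hypothetical) shape:
    {"success": bool, "text": str, "message": str, "costToken": int}
    """
    if not response.get("success", False):
        # Error handling: explain the issue and suggest corrective actions.
        reason = response.get("message", "unknown error")
        return (
            f"The image could not be analyzed: {reason}. "
            "Check that the URL is publicly accessible and that the image "
            f"format is one of: {SUPPORTED_FORMATS}."
        )
    # No fabrication: relay only the text the API returned.
    lines = [str(response.get("text", "")).strip()]
    # Token usage: mention costToken only when the user asked about cost.
    if user_asked_cost and "costToken" in response:
        lines.append(f"Tokens used: {response['costToken']}")
    return "\n".join(lines)
```

For example, a successful response whose text is "A red mug." and whose costToken is 12 renders as just the text by default, and as the text followed by "Tokens used: 12" when the user asked about cost.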

User Expression & Scenario Quick Reference

Applicable -- Image analysis tasks:
| User Says | Scenario |
| --- | --- |
| "What's in this image/picture/photo" | General image description |
| "Analyze this product image" | Product visual analysis |
| "Read the text in this image" | OCR / text extraction |
| "Describe the A+ page images" | E-commerce content review |
| "What does this screenshot show" | Screenshot interpretation |
| "Identify objects in this photo" | Object detection / listing |
Not applicable -- Needs beyond image recognition:
  • Generating or editing images
  • Video analysis
  • Analyzing images directly from local file paths (only URLs are supported; upload the file first as described under Local Image Upload)
  • Image search or reverse image lookup
Feedback:
Auto-detect and report feedback via the Feedback API when any of the following apply:
  1. The functionality or purpose described in this skill does not match actual behavior
  2. The skill's results do not match the user's intent
  3. The user expresses dissatisfaction or praise about this skill
  4. Anything you believe could be improved
Call the feedback API as specified in references/api.md. Do not interrupt the user's flow.

For more high-quality, professional cross-border e-commerce skills, visit LinkFox Skills.