vision-analysis

Vision Analysis

Analyze images using the MiniMax MCP tool MiniMax_understand_image, available in the MiniMax Token Plan.

Prerequisites

  • MiniMax Token Plan subscription with valid MINIMAX_API_KEY
  • MiniMax MCP configured (MiniMax_understand_image tool available)

If MCP tool is not configured

Step 1: The agent should fetch setup instructions from: https://platform.minimaxi.com/docs/token-plan/mcp-guide
Step 2: Detect the user's environment (OpenCode, Cursor, Claude Code, etc.) and output the exact commands needed. Common examples:
OpenCode: add to ~/.config/opencode/opencode.json or package.json:
json
{
  "mcp": {
    "MiniMax": {
      "type": "local",
      "command": ["uvx", "minimax-coding-plan-mcp", "-y"],
      "environment": {
        "MINIMAX_API_KEY": "YOUR_TOKEN_PLAN_KEY",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      },
      "enabled": true
    }
  }
}
Claude Code:
bash
claude mcp add -s user MiniMax --env MINIMAX_API_KEY=your-key --env MINIMAX_API_HOST=https://api.minimaxi.com -- uvx minimax-coding-plan-mcp -y
Cursor — add to MCP settings:
json
{
  "mcpServers": {
    "MiniMax": {
      "command": "uvx",
      "args": ["minimax-coding-plan-mcp"],
      "env": {
        "MINIMAX_API_KEY": "your-key",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      }
    }
  }
}
Step 3: After configuration, tell the user to restart their app and verify with /mcp.
Important: If the user does not have a MiniMax Token Plan subscription, inform them that the MiniMax_understand_image tool requires one; it cannot be used with free or other tier API keys.
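
As a rough illustration of Step 2, the sketch below renders the OpenCode and Cursor fragments shown above from a single API key. The helper name build_mcp_config and the client labels "opencode" and "cursor" are hypothetical, and the fragments simply mirror the examples in this section rather than any official generator.

```python
import json

API_HOST = "https://api.minimaxi.com"
PACKAGE = "minimax-coding-plan-mcp"


def build_mcp_config(client: str, api_key: str) -> dict:
    """Return the MCP config fragment for one client (hypothetical helper)."""
    env = {"MINIMAX_API_KEY": api_key, "MINIMAX_API_HOST": API_HOST}
    if client == "opencode":
        # Same shape as the OpenCode example in this section.
        return {"mcp": {"MiniMax": {
            "type": "local",
            "command": ["uvx", PACKAGE, "-y"],
            "environment": env,
            "enabled": True,
        }}}
    if client == "cursor":
        # Same shape as the Cursor example in this section.
        return {"mcpServers": {"MiniMax": {
            "command": "uvx",
            "args": [PACKAGE],
            "env": env,
        }}}
    raise ValueError(f"no config template for client: {client!r}")


if __name__ == "__main__":
    print(json.dumps(build_mcp_config("cursor", "your-key"), indent=2))
```

An agent could print the returned fragment for the user to paste into the relevant settings file; Claude Code is left out because it is configured with a single shell command instead of a JSON file.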

Analysis Modes

| Mode | When to use | Prompt strategy |
| --- | --- | --- |
| describe | General image understanding | Ask for detailed description |
| ocr | Text extraction from screenshots, documents | Ask to extract all text verbatim |
| ui-review | UI mockups, wireframes, design files | Ask for design critique with suggestions |
| chart-data | Charts, graphs, data visualizations | Ask to extract data points and trends |
| object-detect | Identify objects, people, activities | Ask to list and locate all elements |

Workflow

Step 1: Auto-detect image

The skill triggers automatically when a message contains an image file path or URL with one of these extensions: .jpg, .jpeg, .png, .gif, .webp, .bmp, .svg. Extract the image path from the message.
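
One possible way to do the extraction is sketched below. The function name extract_image_refs is hypothetical, and the approach (split on whitespace, trim surrounding punctuation, test the extension) is an assumption about how detection might work, not the skill's actual trigger logic. It does not handle paths that contain spaces.

```python
from urllib.parse import urlparse

# Extensions listed in Step 1.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp", ".svg")


def extract_image_refs(message: str) -> list[str]:
    """Return path or URL tokens in the message that end in an image extension."""
    refs = []
    for token in message.split():
        # Trim quotes, brackets, and sentence punctuation around the token.
        candidate = token.lstrip("\"'(<[").rstrip("\"')>],;.!?:")
        # For URLs, test only the path so a query string does not hide the extension.
        path = urlparse(candidate).path if "://" in candidate else candidate
        if path.lower().endswith(IMAGE_EXTENSIONS):
            refs.append(candidate)
    return refs


if __name__ == "__main__":
    print(extract_image_refs("Compare ./shots/home.PNG with https://example.com/a/chart.jpg?size=2."))
```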

Step 2: Select analysis mode and call MCP tool

Use the MiniMax_understand_image tool with a mode-specific prompt:
describe:
Provide a detailed description of this image. Include: main subject, setting/background,
colors/style, any text visible, notable objects, and overall composition.
ocr:
Extract all text visible in this image verbatim. Preserve structure and formatting
(headers, lists, columns). If no text is found, say so.
ui-review:
You are a UI/UX design reviewer. Analyze this interface mockup or design. Provide:
(1) Strengths — what works well, (2) Issues — usability or design problems,
(3) Specific, actionable suggestions for improvement. Be constructive and detailed.
chart-data:
Extract all data from this chart or graph. List: chart title, axis labels, all
data points/series with values if readable, and a brief summary of the trend.
object-detect:
List all distinct objects, people, and activities you can identify. For each,
describe what it is and its approximate location in the image.
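
The mode-to-prompt mapping could be kept in a lookup table, as in the sketch below. The prompts follow the wording above. The helper build_tool_call and the argument keys "prompt" and "image" are hypothetical placeholders; the real parameter names of MiniMax_understand_image are not given in this document and should be checked against the tool's schema.

```python
TOOL_NAME = "MiniMax_understand_image"

MODE_PROMPTS = {
    "describe": (
        "Provide a detailed description of this image. Include: main subject, "
        "setting/background, colors/style, any text visible, notable objects, "
        "and overall composition."
    ),
    "ocr": (
        "Extract all text visible in this image verbatim. Preserve structure and "
        "formatting (headers, lists, columns). If no text is found, say so."
    ),
    "ui-review": (
        "You are a UI/UX design reviewer. Analyze this interface mockup or design. "
        "Provide: (1) Strengths: what works well, (2) Issues: usability or design "
        "problems, (3) Specific, actionable suggestions for improvement. "
        "Be constructive and detailed."
    ),
    "chart-data": (
        "Extract all data from this chart or graph. List: chart title, axis labels, "
        "all data points/series with values if readable, and a brief summary of "
        "the trend."
    ),
    "object-detect": (
        "List all distinct objects, people, and activities you can identify. For "
        "each, describe what it is and its approximate location in the image."
    ),
}


def build_tool_call(mode: str, image: str) -> dict:
    """Pair an image reference with the prompt for the chosen mode."""
    if mode not in MODE_PROMPTS:
        raise ValueError(f"unknown analysis mode: {mode!r}")
    # "prompt" and "image" are assumed argument names, not confirmed by the source.
    return {"tool": TOOL_NAME, "arguments": {"prompt": MODE_PROMPTS[mode], "image": image}}


if __name__ == "__main__":
    print(build_tool_call("ocr", "./shots/home.png"))
```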

Step 3: Present results

Return the analysis clearly. For describe, use readable prose. For ocr, preserve structure. For ui-review, use a structured critique format.

Output Format Example

For describe mode:

    Image Description

    [Detailed description of the image contents...]

For ocr mode:

    Extracted Text

    [Preserved text structure from the image]

For ui-review mode:

    UI Design Review

    Strengths
      • ...

    Issues
      • ...

    Suggestions
      • ...
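
The layouts in this section could be produced by a small formatter along these lines. The function names are hypothetical, and the plain-text headings and bullet style are assumptions based on the examples here; adjust them to whatever markup the host application renders.

```python
HEADINGS = {
    "describe": "Image Description",
    "ocr": "Extracted Text",
    "ui-review": "UI Design Review",
}


def format_text_result(mode: str, body: str) -> str:
    """Put the mode's heading above a prose or OCR result."""
    return f"{HEADINGS[mode]}\n\n{body.strip()}"


def format_ui_review(strengths: list[str], issues: list[str], suggestions: list[str]) -> str:
    """Lay out a UI review as three bulleted sections under one heading."""
    parts = [HEADINGS["ui-review"]]
    sections = (("Strengths", strengths), ("Issues", issues), ("Suggestions", suggestions))
    for title, items in sections:
        bullets = "\n".join(f"  • {item}" for item in items)
        parts.append(f"{title}\n{bullets}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(format_ui_review(["Clear hierarchy"], ["Low contrast"], ["Darken body text"]))
```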

Notes

  • Images up to 20MB supported (JPEG, PNG, GIF, WebP)
  • Local file paths work if MiniMax MCP is configured with file access
  • The MiniMax_understand_image tool is provided by the minimax-coding-plan-mcp package
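
A pre-flight check against the limits in these notes might look like the sketch below. The function name check_image is hypothetical, and reading "20MB" as 20 × 1024 × 1024 bytes is an assumption; the service may count megabytes differently.

```python
import os

MAX_BYTES = 20 * 1024 * 1024  # assumes "20MB" means 20 MiB
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}  # JPEG, PNG, GIF, WebP


def check_image(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the image passes both checks."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(no extension)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems


if __name__ == "__main__":
    print(check_image("diagram.bmp", 1024))
```

For a local file, the size could come from os.path.getsize(path) before the tool is called.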