openai-image-vision

OpenAI Image Vision

Analyze images using OpenAI's GPT-4 Vision API. The model can understand visual elements including objects, shapes, colors, textures, and text within images.

Setup

This skill requires an OpenAI API key. If not configured:

Get your API key from https://platform.openai.com/api-keys

Set the key using:

env_config(action="set", key="OPENAI_API_KEY", value="your-key")

Optional: Set custom API base URL (default: https://api.openai.com/v1):

env_config(action="set", key="OPENAI_API_BASE", value="your-base-url")
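As a rough illustration of how a script might consume these settings, the sketch below assumes that values stored through env_config surface as the environment variables OPENAI_API_KEY and OPENAI_API_BASE. That mapping and both function names are assumptions made for illustration; the skill's own script may handle configuration differently.

```bash
# Hypothetical helpers, not part of the skill.

# Print the API base URL, falling back to the documented default
# when OPENAI_API_BASE is unset or empty.
resolve_api_base() {
  printf '%s\n' "${OPENAI_API_BASE:-https://api.openai.com/v1}"
}

# Return 0 if an API key is present; otherwise print a hint to stderr
# and return 1.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo 'OPENAI_API_KEY is not set; configure it with env_config first.' >&2
    return 1
  fi
}
```

Calling require_api_key before any request gives a clear failure message instead of an authentication error from the API.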

Usage

Important: Scripts are located relative to this skill's base directory.

When you see this skill in <available_skills>, note the <base_dir> path.

CRITICAL: Always use the bash command to execute the script.

General pattern (MUST start with bash):

bash "<base_dir>/scripts/vision.sh" "<image_path_or_url>" "<question>" [model]

DO NOT execute the script directly like this (WRONG):

"<base_dir>/scripts/vision.sh" ...
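The source does not say why the bash prefix is required. One plausible reason is that running a script through bash works even when the file lacks the executable bit, whereas direct execution would fail. The wrapper below is a minimal sketch of the general pattern; run_vision is a hypothetical name, and its first argument stands for whatever <base_dir> path the skill listing reports.

```bash
# Hypothetical wrapper: always runs vision.sh through bash, never directly.
run_vision() {
  local base_dir="${1:-}" image="${2:-}" question="${3:-}" model="${4:-}"
  if [ -z "$base_dir" ] || [ -z "$image" ] || [ -z "$question" ]; then
    echo 'usage: run_vision <base_dir> <image_path_or_url> <question> [model]' >&2
    return 2
  fi
  local args=("$image" "$question")
  # Pass the model only when given, so the script's own default applies otherwise.
  if [ -n "$model" ]; then
    args+=("$model")
  fi
  # Quoting keeps paths and questions containing spaces as single arguments.
  bash "$base_dir/scripts/vision.sh" "${args[@]}"
}
```

For example, run_vision "$BASE_DIR" "/path/to/image.jpg" "What's in this image?" expands to the general pattern shown above, with BASE_DIR being a variable you set yourself.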

Parameters:

- image_path_or_url: Local image file path or HTTP(S) URL (required)
- question: Question to ask about the image (required)
- model: OpenAI model to use (optional; default: gpt-4.1-mini)
  Options: gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4-turbo
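To catch a mistyped model name before spending an API call, a caller could check it against the listed options first. This is only a sketch: the function names are hypothetical, and the source does not say whether vision.sh itself rejects models outside this list.

```bash
# Return 0 if the name is one of the models listed above, 1 otherwise.
is_listed_model() {
  case "${1:-}" in
    gpt-4.1-mini|gpt-4.1|gpt-4o-mini|gpt-4-turbo) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the model to use: the argument if given, else the documented default.
pick_model() {
  printf '%s\n' "${1:-gpt-4.1-mini}"
}
```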

Examples

Analyze a local image

bash "<base_dir>/scripts/vision.sh" "/path/to/image.jpg" "What's in this image?"

Analyze an image from URL

bash "<base_dir>/scripts/vision.sh" "https://example.com/image.jpg" "Describe this image in detail"

Use specific model

bash "<base_dir>/scripts/vision.sh" "/path/to/photo.png" "What colors are prominent?" "gpt-4o-mini"

Extract text from image

bash "<base_dir>/scripts/vision.sh" "/path/to/document.jpg" "Extract all text from this image"

Analyze multiple aspects

bash "<base_dir>/scripts/vision.sh" "image.jpg" "List all objects you can see and describe the overall scene"

Supported Image Formats

JPEG (.jpg, .jpeg)
PNG (.png)
GIF (.gif)
WebP (.webp)

Performance Optimization: Files larger than 1MB are automatically compressed to 800px (longest side) to avoid command-line parameter limits. This happens transparently without affecting analysis quality.
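The format list and the 1MB threshold can be checked up front with two small helpers. This is a sketch under stated assumptions: it judges format by file extension only, and it takes 1MB to mean 1048576 bytes. The script may detect formats or measure size differently, and both function names are hypothetical.

```bash
# Return 0 if the path ends in one of the listed extensions (case-insensitive).
has_supported_extension() {
  local lower
  lower="$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')"
  case "$lower" in
    *.jpg|*.jpeg|*.png|*.gif|*.webp) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 if the file is larger than 1 MB (assumed here to be 1048576 bytes).
exceeds_size_limit() {
  local bytes
  # wc may pad its output with spaces on some systems, so strip whitespace.
  bytes="$(wc -c < "$1" | tr -d '[:space:]')"
  [ "$bytes" -gt 1048576 ]
}
```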

Response Format

The script returns a JSON response:

{
  "model": "gpt-4.1-mini",
  "content": "The image shows...",
  "usage": {
    "prompt_tokens": 1234,
    "completion_tokens": 567,
    "total_tokens": 1801
  }
}

Or in case of error:

{
  "error": "Error description",
  "details": "Additional error information"
}
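Because success and failure both arrive as JSON, a caller needs to tell the two shapes apart. The sketch below uses a pure-bash heuristic that assumes "error" is the first key of the top-level object, as in the example above. It is not a JSON parser, and is_error_response is a hypothetical name; a dedicated JSON tool would be more robust where one is available.

```bash
# Return 0 if the response looks like the error shape, 1 otherwise.
# Heuristic: the first key of the top-level object must be "error".
is_error_response() {
  local re='^[[:space:]]*[{][[:space:]]*"error"[[:space:]]*:'
  [[ "${1:-}" =~ $re ]]
}
```

A response whose content field merely mentions the word error is not matched, since the pattern is anchored to the start of the object.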

Notes

Image size: Files larger than 1MB are automatically downscaled to 800px on the longest side
Timeout: 60 seconds for API calls
Rate limits: Subject to your OpenAI API plan limits
Privacy: Images are sent to OpenAI's servers for processing
Local files: Automatically converted to base64 for API submission
URLs: Can be passed directly to the API without downloading
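The last two notes describe two input paths: URLs go to the API as they are, while local files are base64-encoded first. A minimal sketch of that split follows. The data URL form is the one OpenAI's API accepts for inline images, but whether vision.sh builds it exactly this way is an assumption, and both function names are hypothetical.

```bash
# Return 0 for HTTP(S) URLs, 1 for anything else (treated as a local path).
is_remote_image() {
  case "${1:-}" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print a base64 data URL for a local file. The caller supplies the MIME type.
# Stripping newlines keeps the output the same across base64 implementations
# that wrap their output at different widths.
to_data_url() {
  local path="$1" mime="$2"
  printf 'data:%s;base64,%s\n' "$mime" "$(base64 < "$path" | tr -d '\n')"
}
```

Since base64 output is about a third larger than the input, encoding large files inline is likely part of why the script compresses them first.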
