openai-image-vision
OpenAI Image Vision
Analyze images using OpenAI's GPT-4 Vision API. The model can understand visual elements including objects, shapes, colors, textures, and text within images.
Setup
This skill requires an OpenAI API key. If not configured:
- Get your API key from https://platform.openai.com/api-keys
- Set the key using:
env_config(action="set", key="OPENAI_API_KEY", value="your-key")
Optional: Set custom API base URL (default: https://api.openai.com/v1):
bash
env_config(action="set", key="OPENAI_API_BASE", value="your-base-url")
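The two settings above can be resolved inside a shell script roughly as follows. This is a minimal sketch, not the actual contents of vision.sh: it assumes the configured values reach the script as the environment variables OPENAI_API_KEY and OPENAI_API_BASE, and that requests go to the chat completions endpoint. The function name resolve_openai_config is hypothetical.

```bash
# Hypothetical helper: fail early when the key is missing and fall back
# to the documented default base URL when OPENAI_API_BASE is unset.
resolve_openai_config() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  # Default taken from the Setup section above.
  OPENAI_API_BASE="${OPENAI_API_BASE:-https://api.openai.com/v1}"
  # Drop one trailing slash so the joined URL has no double slash.
  OPENAI_API_BASE="${OPENAI_API_BASE%/}"
  # Assumption: the script talks to the chat completions endpoint.
  printf '%s/chat/completions\n' "$OPENAI_API_BASE"
}
```

Calling resolve_openai_config prints the endpoint URL on success and returns a non-zero status when the key is missing, so a caller can stop before making any request.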
Usage
Important: Scripts are located relative to this skill's base directory.
When you see this skill in <available_skills>, note the <base_dir> path.
CRITICAL: Always use the bash command to execute the script.
General pattern (MUST start with bash):
bash "<base_dir>/scripts/vision.sh" "<image_path_or_url>" "<question>" [model]
DO NOT execute the script directly like this (WRONG):
"<base_dir>/scripts/vision.sh" ...
Parameters:
- image_path_or_url: Local image file path or HTTP(S) URL (required)
- question: Question to ask about the image (required)
- model: OpenAI model to use (default: gpt-4.1-mini)
  Options: gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4-turbo
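The calling convention above can be wrapped in a small shell function so that the bash prefix and the argument count are never forgotten. This is only a sketch: run_vision is a hypothetical name, and the wrapper assumes nothing about vision.sh beyond the parameters listed above.

```bash
# Hypothetical wrapper around the documented general pattern.
# Usage: run_vision <base_dir> <image_path_or_url> <question> [model]
run_vision() {
  if [ "$#" -lt 3 ] || [ "$#" -gt 4 ]; then
    echo "usage: run_vision <base_dir> <image_path_or_url> <question> [model]" >&2
    return 2
  fi
  local base_dir="$1"
  shift
  # Always invoke through bash, as the skill requires. The remaining
  # arguments are passed through unchanged; when model is omitted the
  # script applies its own default.
  bash "$base_dir/scripts/vision.sh" "$@"
}
```

Because the script is passed to bash as an argument, this form also works when vision.sh lacks the execute permission bit, which is one practical reason to prefer it over direct execution.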
Examples
Analyze a local image
bash
bash "<base_dir>/scripts/vision.sh" "/path/to/image.jpg" "What's in this image?"
Analyze an image from URL
bash
bash "<base_dir>/scripts/vision.sh" "https://example.com/image.jpg" "Describe this image in detail"
Use specific model
bash
bash "<base_dir>/scripts/vision.sh" "/path/to/photo.png" "What colors are prominent?" "gpt-4o-mini"
Extract text from image
bash
bash "<base_dir>/scripts/vision.sh" "/path/to/document.jpg" "Extract all text from this image"
Analyze multiple aspects
bash
bash "<base_dir>/scripts/vision.sh" "image.jpg" "List all objects you can see and describe the overall scene"
Supported Image Formats
- JPEG (.jpg, .jpeg)
- PNG (.png)
- GIF (.gif)
- WebP (.webp)
Performance Optimization: Files larger than 1MB are automatically compressed to 800px (longest side) to avoid command-line parameter limits. This happens transparently without affecting analysis quality.
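The size rule above can be illustrated with a short sketch. It shows only the decision and the arithmetic, not the resize itself, since the tool vision.sh uses for compression is not stated here. The names needs_compression and scaled_dimensions are hypothetical, and treating "1MB" as 1024 * 1024 bytes is an assumption.

```bash
MAX_BYTES=$((1024 * 1024))  # assumption: "1MB" means 1 MiB
MAX_SIDE=800                # longest side after compression, per the note above

# Succeeds (status 0) when the file is larger than MAX_BYTES.
needs_compression() {
  [ -f "$1" ] || return 2
  local size
  # wc -c pads its output on some systems, so strip whitespace.
  size=$(wc -c < "$1" | tr -d '[:space:]')
  [ "$size" -gt "$MAX_BYTES" ]
}

# Prints "width height" scaled so the longest side is MAX_SIDE,
# keeping the aspect ratio (rounded to the nearest pixel).
scaled_dimensions() {
  local w="$1" h="$2"
  if [ "$w" -le "$MAX_SIDE" ] && [ "$h" -le "$MAX_SIDE" ]; then
    echo "$w $h"
  elif [ "$w" -ge "$h" ]; then
    echo "$MAX_SIDE $(( (h * MAX_SIDE + w / 2) / w ))"
  else
    echo "$(( (w * MAX_SIDE + h / 2) / h )) $MAX_SIDE"
  fi
}
```

For example, scaled_dimensions 4000 3000 prints "800 600", and a 640 by 480 image is left at its original size.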
Response Format
The script returns a JSON response:
json
{
  "model": "gpt-4.1-mini",
  "content": "The image shows...",
  "usage": {
    "prompt_tokens": 1234,
    "completion_tokens": 567,
    "total_tokens": 1801
  }
}
Or in case of error:
json
{
  "error": "Error description",
  "details": "Additional error information"
}
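A caller usually needs to tell the two response shapes apart before using the result. The sketch below is a heuristic for plain shell only: it assumes the error object always has "error" as its first key, as in the example above, and the function name is_error_response is hypothetical. For anything beyond this check, a real JSON parser is the safer choice.

```bash
# Heuristic: succeeds (status 0) when the response is a JSON object
# whose first key is "error". Relies on the documented error shape.
is_error_response() {
  printf '%s' "$1" \
    | tr '\n' ' ' \
    | grep -Eq '^[[:space:]]*\{[[:space:]]*"error"[[:space:]]*:'
}
```

A typical use is to capture the script output in a variable, call is_error_response on it, and stop with a non-zero exit status if it succeeds. Because the pattern is anchored to the start of the object, the word "error" appearing inside the content field does not trigger a false positive.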
Notes
- Image size: Images are automatically resized if too large
- Timeout: 60 seconds for API calls
- Rate limits: Subject to your OpenAI API plan limits
- Privacy: Images are sent to OpenAI's servers for processing
- Local files: Automatically converted to base64 for API submission
- URLs: Can be passed directly to the API without downloading
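The last two notes describe two different ways of handing an image to the API. A minimal sketch of that branching is shown below, assuming local files are sent as base64 data URLs with a MIME type derived from the file extension; the function names mime_for_image and image_reference are hypothetical and the real script may differ.

```bash
# Map a file extension to a MIME type for the supported formats.
mime_for_image() {
  local ext
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    jpg|jpeg) echo "image/jpeg" ;;
    png)      echo "image/png" ;;
    gif)      echo "image/gif" ;;
    webp)     echo "image/webp" ;;
    *)        return 1 ;;
  esac
}

# Print the value to send as the image reference:
# URLs pass through unchanged, local files become data URLs.
image_reference() {
  local src="$1" mime
  case "$src" in
    http://*|https://*) printf '%s\n' "$src"; return 0 ;;
  esac
  [ -f "$src" ] || { echo "not a file: $src" >&2; return 1; }
  mime=$(mime_for_image "$src") || { echo "unsupported format: $src" >&2; return 1; }
  # Strip newlines because some base64 implementations wrap their output.
  printf 'data:%s;base64,%s\n' "$mime" "$(base64 < "$src" | tr -d '\n')"
}
```

Base64 encoding grows the payload by roughly one third, which is consistent with the automatic compression of large local files mentioned in the notes.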