baoyu-gemini-web
Gemini Web Client
Supports:
- Text generation
- Image generation (download + save)
- Reference image upload (attach images for vision tasks)
- Multi-turn conversations within the same executor instance (`keepSession`)
- Experimental video generation (`generateVideo`): Gemini may return an async placeholder, and downloading the result might require the Gemini web UI
Quick start
bash
npx -y bun scripts/main.ts "Hello, Gemini"
npx -y bun scripts/main.ts --prompt "Explain quantum computing"
npx -y bun scripts/main.ts --prompt "A cute cat" --image cat.png
npx -y bun scripts/main.ts --promptfiles system.md content.md --image out.png
# Multi-turn conversation (agent generates unique sessionId)
npx -y bun scripts/main.ts "Remember this: 42" --sessionId my-unique-id-123
npx -y bun scripts/main.ts "What number?" --sessionId my-unique-id-123
Executor options (programmatic)
This skill is typically consumed via `createGeminiWebExecutor(geminiOptions)` (see `scripts/executor.ts`).
Key options in `GeminiWebOptions`:
- `referenceImages?: string | string[]`: Upload local images as references (vision input).
- `keepSession?: boolean`: Reuse Gemini `chatMetadata` to continue the same conversation across calls (required if you want reference images to persist across multiple messages).
- `generateVideo?: string`: Generate a video and (best-effort) download it to the given path. Gemini may return `video_gen_chip` (async); in that case you must open the Gemini web UI to download the result.
Notes:
- `generateVideo` cannot be combined with `generateImage`/`editImage`.
- When `keepSession=true` and `referenceImages` is set, reference images are uploaded once per executor instance.
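The constraint in the notes can be checked before a request is sent. The following is a minimal sketch, not the code in `scripts/executor.ts`: the `GeminiWebOptionsSketch` type and both helper functions are hypothetical, and the types of `generateImage` and `editImage` are assumed, since only their names appear in this document.

```typescript
// Hypothetical subset of GeminiWebOptions, based on the option list above.
// The types of generateImage and editImage are assumptions.
interface GeminiWebOptionsSketch {
  referenceImages?: string | string[];
  keepSession?: boolean;
  generateVideo?: string;
  generateImage?: string;
  editImage?: string;
}

// Returns a list of problems; an empty list means the combination is allowed.
function validateOptions(opts: GeminiWebOptionsSketch): string[] {
  const problems: string[] = [];
  const wantsImage =
    opts.generateImage !== undefined || opts.editImage !== undefined;
  if (opts.generateVideo !== undefined && wantsImage) {
    problems.push("generateVideo cannot be combined with generateImage/editImage");
  }
  return problems;
}

// Normalizes referenceImages (string | string[]) into an array of paths.
function normalizeReferenceImages(refs?: string | string[]): string[] {
  if (refs === undefined) return [];
  return Array.isArray(refs) ? refs : [refs];
}
```

Returning a list of problems rather than throwing lets a caller report every conflict at once.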
Commands
Text generation
bash
# Simple prompt (positional)
npx -y bun scripts/main.ts "Your prompt here"
# Explicit prompt flag
npx -y bun scripts/main.ts --prompt "Your prompt here"
npx -y bun scripts/main.ts -p "Your prompt here"
# With model selection
npx -y bun scripts/main.ts -p "Hello" -m gemini-2.5-pro
# Pipe from stdin
echo "Summarize this" | npx -y bun scripts/main.ts
Image generation
bash
# Generate image with default path (./generated.png)
npx -y bun scripts/main.ts --prompt "A sunset over mountains" --image
# Generate image with custom path
npx -y bun scripts/main.ts --prompt "A cute robot" --image robot.png
# Shorthand
npx -y bun scripts/main.ts "A dragon" --image=dragon.png
Output formats
bash
# Plain text (default)
npx -y bun scripts/main.ts "Hello"
# JSON output
npx -y bun scripts/main.ts "Hello" --json
Options
| Option | Description |
|---|---|
| `--prompt`, `-p` | Prompt text |
| `--promptfiles` | Read prompt from files (concatenated in order) |
| `-m` | Model: gemini-3-pro (default), gemini-2.5-pro, gemini-2.5-flash |
| `--image` | Generate image, save to path (default: generated.png) |
| `--sessionId` | Session ID for multi-turn conversation (agent generates unique ID) |
| `--list-sessions` | List saved sessions (max 100, sorted by update time) |
| `--json` | Output as JSON |
| `--login` | Refresh cookies only, then exit |
| | Custom cookie file path |
| | Chrome profile directory |
| | Show help |
CLI note: `scripts/main.ts` supports text generation, image generation, and multi-turn conversations via `--sessionId`. Reference images and video generation are exposed via the executor API.
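When the CLI is driven from another program, the arguments can be assembled from a typed request. The sketch below is a hypothetical helper, not part of the skill: `CliRequest` and `buildArgs` are invented names, and it uses only flags that appear in the examples in this document. `--image` is emitted last so that the bare form matches the documented usage.

```typescript
// Hypothetical helper: turns a typed request into arguments for scripts/main.ts.
// Only flags shown in this document's examples are used.
interface CliRequest {
  prompt?: string;
  promptFiles?: string[];
  model?: string;
  sessionId?: string;
  json?: boolean;
  image?: string | true; // true = bare --image, which saves to the default path
}

function buildArgs(req: CliRequest): string[] {
  const args: string[] = [];
  if (req.prompt !== undefined) args.push("--prompt", req.prompt);
  if (req.promptFiles && req.promptFiles.length > 0) {
    args.push("--promptfiles", ...req.promptFiles);
  }
  if (req.model !== undefined) args.push("-m", req.model);
  if (req.sessionId !== undefined) args.push("--sessionId", req.sessionId);
  if (req.json) args.push("--json");
  // --image goes last so the bare form is never followed by another token.
  if (req.image === true) args.push("--image");
  else if (req.image !== undefined) args.push("--image", req.image);
  return args;
}
```

Passing the result as an argument array to a process spawner, rather than joining it into one string, avoids shell-quoting problems with prompts that contain spaces or quotes.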
Models
- `gemini-3-pro` - Default, latest model
- `gemini-2.5-pro` - Previous generation pro
- `gemini-2.5-flash` - Fast, lightweight
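A caller that accepts a model name from user input may want to apply the default and reject typos up front. This is a sketch under assumptions: `resolveModel` is a hypothetical helper, and whether the CLI itself rejects unknown names is not stated in this document.

```typescript
// Model names as listed above; gemini-3-pro is the documented default.
const MODELS = ["gemini-3-pro", "gemini-2.5-pro", "gemini-2.5-flash"] as const;
type ModelName = (typeof MODELS)[number];
const DEFAULT_MODEL: ModelName = "gemini-3-pro";

// Hypothetical helper: falls back to the default when no model is given
// and throws for names that are not in the list.
function resolveModel(requested?: string): ModelName {
  if (requested === undefined || requested === "") return DEFAULT_MODEL;
  const match = MODELS.find((m) => m === requested);
  if (match === undefined) {
    throw new Error(
      `Unknown model "${requested}". Expected one of: ${MODELS.join(", ")}`
    );
  }
  return match;
}
```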
Authentication
First run opens Chrome to authenticate with Google. Cookies are cached for subsequent runs.
bash
# Force cookie refresh
npx -y bun scripts/main.ts --login
Environment variables
| Variable | Description |
|---|---|
| | Data directory |
| | Cookie file path |
| | Chrome profile directory |
| | Chrome executable path |
Examples
Generate text response
bash
npx -y bun scripts/main.ts "What is the capital of France?"
Generate image
bash
npx -y bun scripts/main.ts "A photorealistic image of a golden retriever puppy" --image puppy.png
Get JSON output for parsing
bash
npx -y bun scripts/main.ts "Hello" --json | jq '.text'
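The same extraction can be done in code when `jq` is not available. The sketch below assumes the `--json` output is a single JSON object with a string `text` field, which is all the `jq '.text'` example implies; any other fields are undocumented here, and `extractText` is a hypothetical name.

```typescript
// Assumption: --json prints one JSON object with a string `text` field,
// as implied by the jq '.text' example. Other fields are not documented here.
function extractText(rawJson: string): string {
  const parsed: unknown = JSON.parse(rawJson);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Expected a JSON object");
  }
  const text = (parsed as Record<string, unknown>).text;
  if (typeof text !== "string") {
    throw new Error("Expected a string `text` field");
  }
  return text;
}
```

Checking the type of `text` before returning it keeps a changed or error-shaped payload from reaching later code as `undefined`.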
Generate image from prompt files
bash
# Concatenate system.md + content.md as prompt
npx -y bun scripts/main.ts --promptfiles system.md content.md --image output.png
Multi-turn conversation
bash
# Start a session with unique ID (agent generates this)
npx -y bun scripts/main.ts "You are a helpful math tutor." --sessionId task-abc123
# Continue the conversation (remembers context)
npx -y bun scripts/main.ts "What is 2+2?" --sessionId task-abc123
npx -y bun scripts/main.ts "Now multiply that by 10" --sessionId task-abc123
# List recent sessions (max 100, sorted by update time)
npx -y bun scripts/main.ts --list-sessions
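Because the agent is expected to supply a unique session ID, it needs some way to generate one. The following is one possible sketch, not the skill's own method: `makeSessionId` is a hypothetical helper, the `task` prefix only mirrors the example ID, and the result is unique in practice rather than guaranteed.

```typescript
// Hypothetical helper: combines a timestamp with random characters.
// The "task" prefix mirrors the example ID and is not required by the CLI.
function makeSessionId(prefix = "task"): string {
  const time = Date.now().toString(36);
  // Pad so the random part always has 8 characters, even for short values.
  const rand = Math.random().toString(36).slice(2, 10).padEnd(8, "0");
  return `${prefix}-${time}-${rand}`;
}
```

The generated value is passed unchanged to `--sessionId` on every call that should share the same conversation.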
Session files are stored in `~/Library/Application Support/baoyu-skills/gemini-web/sessions/<id>.json` and contain:
- `id`: Session ID
- `metadata`: Gemini chat metadata for continuation
- `messages`: Array of `{role, content, timestamp, error?}`
- `createdAt`, `updatedAt`: Timestamps
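The fields above map naturally onto a typed structure. This sketch is illustrative only: the type of `metadata`, the timestamp format (ISO 8601 strings here), and the newest-first sort direction are assumptions, and `listRecent` is a hypothetical function that mimics the documented `--list-sessions` limit of 100.

```typescript
// Shape of a session file as described above. `metadata` is left as unknown and
// timestamps are assumed to be ISO 8601 strings; neither is specified in this document.
interface SessionMessage {
  role: string;
  content: string;
  timestamp: string;
  error?: string;
}

interface SessionFile {
  id: string;
  metadata: unknown;
  messages: SessionMessage[];
  createdAt: string;
  updatedAt: string;
}

// Sorts by update time (newest first, an assumption) and caps the result at `max`.
function listRecent(sessions: SessionFile[], max = 100): SessionFile[] {
  return [...sessions]
    .sort((a, b) => Date.parse(b.updatedAt) - Date.parse(a.updatedAt))
    .slice(0, max);
}
```

Copying the array before sorting leaves the caller's list in its original order.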