baoyu-gemini-web

Gemini Web Client

Supports:
  • Text generation
  • Image generation (download + save)
  • Reference image upload (attach images for vision tasks)
  • Multi-turn conversations within the same executor instance (`keepSession`)
  • Experimental video generation (`generateVideo`). Gemini may return an async placeholder; downloading the result might require the Gemini web UI

Quick start

bash
npx -y bun scripts/main.ts "Hello, Gemini"
npx -y bun scripts/main.ts --prompt "Explain quantum computing"
npx -y bun scripts/main.ts --prompt "A cute cat" --image cat.png
npx -y bun scripts/main.ts --promptfiles system.md content.md --image out.png

# Multi-turn conversation (agent generates unique sessionId)
npx -y bun scripts/main.ts "Remember this: 42" --sessionId my-unique-id-123
npx -y bun scripts/main.ts "What number?" --sessionId my-unique-id-123
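
The agent is expected to supply its own unique `--sessionId` and reuse it for every turn of a conversation. One possible way to do that is sketched below in TypeScript, assuming the built-in `node:crypto` module (available in Node and Bun). The helper names `newSessionId` and `buildTurnArgs` and the `task` prefix are illustrative only and are not part of this skill.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical helper: builds an ID that is unique per conversation.
// The prefix is arbitrary; any unique string should work as a session ID.
function newSessionId(prefix: string = "task"): string {
  return `${prefix}-${randomUUID()}`;
}

// Hypothetical helper: argument list for one turn, to be passed to `npx`.
// The same sessionId must be reused for every turn of the conversation.
function buildTurnArgs(prompt: string, sessionId: string): string[] {
  return ["-y", "bun", "scripts/main.ts", prompt, "--sessionId", sessionId];
}

const sessionId = newSessionId();
const firstTurn = buildTurnArgs("Remember this: 42", sessionId);
const secondTurn = buildTurnArgs("What number?", sessionId);
```

Both argument lists end with the same ID, which is what lets the second call continue the first conversation.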

Executor options (programmatic)

This skill is typically consumed via `createGeminiWebExecutor(geminiOptions)` (see `scripts/executor.ts`).
Key options in `GeminiWebOptions`:
  • `referenceImages?: string | string[]`
    Upload local images as references (vision input).
  • `keepSession?: boolean`
    Reuse Gemini `chatMetadata` to continue the same conversation across calls (required if you want reference images to persist across multiple messages).
  • `generateVideo?: string`
    Generate a video and (best-effort) download to the given path. Gemini may return `video_gen_chip` (async); in that case you must open Gemini web UI to download the result.
Notes:
  • `generateVideo` cannot be combined with `generateImage` / `editImage`.
  • When `keepSession=true` and `referenceImages` is set, reference images are uploaded once per executor instance.
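
The constraints in the notes above can be checked before an executor is created. The following is a minimal sketch, not the skill's own code: `GeminiWebOptionsSketch` is a local stand-in for `GeminiWebOptions`, the types of `generateImage` and `editImage` are assumptions (this document names them but does not give their types), and `checkOptions` and `toImageList` are hypothetical helpers.

```typescript
// Local stand-in for the documented option shape.
interface GeminiWebOptionsSketch {
  referenceImages?: string | string[];
  keepSession?: boolean;
  generateVideo?: string;
  generateImage?: string; // assumed type, not confirmed by this document
  editImage?: string;     // assumed type, not confirmed by this document
}

// Normalizes the `string | string[]` union to an array.
function toImageList(refs?: string | string[]): string[] {
  if (refs === undefined) return [];
  return typeof refs === "string" ? [refs] : refs;
}

// Hypothetical pre-flight check: returns human-readable notes, empty if none.
function checkOptions(opts: GeminiWebOptionsSketch): string[] {
  const notes: string[] = [];
  const wantsImage = opts.generateImage !== undefined || opts.editImage !== undefined;
  if (opts.generateVideo !== undefined && wantsImage) {
    notes.push("generateVideo cannot be combined with generateImage / editImage");
  }
  if (toImageList(opts.referenceImages).length > 0 && !opts.keepSession) {
    notes.push("referenceImages will not persist across messages without keepSession");
  }
  return notes;
}
```

For example, `checkOptions({ generateVideo: "out.mp4", generateImage: "a.png" })` returns one note, while `checkOptions({ referenceImages: "a.png", keepSession: true })` returns an empty array.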

Commands

Text generation

bash
# Simple prompt (positional)
npx -y bun scripts/main.ts "Your prompt here"

# Explicit prompt flag
npx -y bun scripts/main.ts --prompt "Your prompt here"
npx -y bun scripts/main.ts -p "Your prompt here"

# With model selection
npx -y bun scripts/main.ts -p "Hello" -m gemini-2.5-pro

# Pipe from stdin
echo "Summarize this" | npx -y bun scripts/main.ts

Image generation

bash
# Generate image with default path (./generated.png)
npx -y bun scripts/main.ts --prompt "A sunset over mountains" --image

# Generate image with custom path
npx -y bun scripts/main.ts --prompt "A cute robot" --image robot.png

# Shorthand
npx -y bun scripts/main.ts "A dragon" --image=dragon.png

Output formats

bash
# Plain text (default)
npx -y bun scripts/main.ts "Hello"

# JSON output
npx -y bun scripts/main.ts "Hello" --json

Options

| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated in order) |
| `--model <id>`, `-m` | Model: gemini-3-pro (default), gemini-2.5-pro, gemini-2.5-flash |
| `--image [path]` | Generate image, save to path (default: generated.png) |
| `--sessionId <id>` | Session ID for multi-turn conversation (agent generates unique ID) |
| `--list-sessions` | List saved sessions (max 100, sorted by update time) |
| `--json` | Output as JSON |
| `--login` | Refresh cookies only, then exit |
| `--cookie-path <path>` | Custom cookie file path |
| `--profile-dir <path>` | Chrome profile directory |
| `--help`, `-h` | Show help |

CLI note: `scripts/main.ts` supports text generation, image generation, and multi-turn conversations via `--sessionId`. Reference images and video generation are exposed via the executor API.

Models

  • `gemini-3-pro` - Default, latest model
  • `gemini-2.5-pro` - Previous generation pro
  • `gemini-2.5-flash` - Fast, lightweight
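
When a model ID comes from user input, it can help to check it against this list before calling the CLI. The sketch below is hypothetical: `resolveModel` is not part of this skill, and whether the real CLI rejects unknown IDs or passes them through is not stated here. Only the three IDs and the default are taken from this document.

```typescript
const MODELS = ["gemini-3-pro", "gemini-2.5-pro", "gemini-2.5-flash"] as const;
type ModelId = (typeof MODELS)[number];
const DEFAULT_MODEL: ModelId = "gemini-3-pro";

// Hypothetical helper: falls back to the default when no model is given
// and rejects IDs that are not in the list above (an assumed policy).
function resolveModel(id?: string): ModelId {
  if (id === undefined) return DEFAULT_MODEL;
  const match = MODELS.find((m) => m === id);
  if (match === undefined) {
    throw new Error(`Unknown model "${id}"; expected one of: ${MODELS.join(", ")}`);
  }
  return match;
}
```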

Authentication

First run opens Chrome to authenticate with Google. Cookies are cached for subsequent runs.
bash
# Force cookie refresh
npx -y bun scripts/main.ts --login

Environment variables

| Variable | Description |
|----------|-------------|
| `GEMINI_WEB_DATA_DIR` | Data directory |
| `GEMINI_WEB_COOKIE_PATH` | Cookie file path |
| `GEMINI_WEB_CHROME_PROFILE_DIR` | Chrome profile directory |
| `GEMINI_WEB_CHROME_PATH` | Chrome executable path |

Examples

Generate text response

bash
npx -y bun scripts/main.ts "What is the capital of France?"

Generate image

bash
npx -y bun scripts/main.ts "A photorealistic image of a golden retriever puppy" --image puppy.png

Get JSON output for parsing

bash
npx -y bun scripts/main.ts "Hello" --json | jq '.text'
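
The same extraction can be done in TypeScript when the caller is a script rather than a shell pipeline. This is a minimal sketch: `extractText` is a hypothetical helper, and it relies only on the `text` field, since that is the one field of the JSON output this document shows.

```typescript
// Extracts the `text` field from the CLI's --json output.
// Other fields may exist, but none are assumed here.
function extractText(stdout: string): string {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const text = (parsed as Record<string, unknown>).text;
  if (typeof text !== "string") {
    throw new Error("JSON output has no string `text` field");
  }
  return text;
}
```

The `stdout` argument would be whatever the process that ran the command captured, for example via `execFileSync` from `node:child_process`.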

Generate image from prompt files

bash
# Concatenate system.md + content.md as prompt
npx -y bun scripts/main.ts --promptfiles system.md content.md --image output.png
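
Conceptually, `--promptfiles` reads each file and joins the contents in the order given. The sketch below illustrates that idea only. `concatPromptFiles` is a hypothetical helper, and the blank-line separator is an assumption; this document does not say how the CLI joins the files.

```typescript
import { readFileSync } from "node:fs";

type ReadFile = (path: string) => string;

// Joins the contents of several prompt files in the order given.
// The reader is injectable so the function can be exercised without disk access.
function concatPromptFiles(
  paths: string[],
  read: ReadFile = (p) => readFileSync(p, "utf8"),
  separator: string = "\n\n", // assumed separator
): string {
  return paths.map((p) => read(p)).join(separator);
}
```

Calling `concatPromptFiles(["system.md", "content.md"])` returns the system prompt followed by the content, so swapping the two paths changes which text comes first.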

Multi-turn conversation

bash
# Start a session with unique ID (agent generates this)
npx -y bun scripts/main.ts "You are a helpful math tutor." --sessionId task-abc123

# Continue the conversation (remembers context)
npx -y bun scripts/main.ts "What is 2+2?" --sessionId task-abc123
npx -y bun scripts/main.ts "Now multiply that by 10" --sessionId task-abc123

# List recent sessions (max 100, sorted by update time)
npx -y bun scripts/main.ts --list-sessions

Session files are stored in `~/Library/Application Support/baoyu-skills/gemini-web/sessions/<id>.json` and contain:
- `id`: Session ID
- `metadata`: Gemini chat metadata for continuation
- `messages`: Array of `{role, content, timestamp, error?}`
- `createdAt`, `updatedAt`: Timestamps
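
The field list above maps naturally onto a TypeScript type. The following is a sketch under stated assumptions: the timestamp representation is not given in this document (ISO 8601 strings are assumed), `metadata` is treated as opaque, and `recentSessions` is a hypothetical helper that mimics the "max 100, sorted by update time" behaviour with newest-first order assumed.

```typescript
interface SessionMessage {
  role: string;
  content: string;
  timestamp: string; // assumed ISO 8601 string
  error?: string;
}

interface SessionFile {
  id: string;
  metadata: unknown; // Gemini chat metadata, treated as opaque here
  messages: SessionMessage[];
  createdAt: string; // assumed ISO 8601 string
  updatedAt: string; // assumed ISO 8601 string
}

// Returns at most `max` sessions, most recently updated first (assumed order).
// The input array is copied so the caller's data is not reordered.
function recentSessions(sessions: SessionFile[], max: number = 100): SessionFile[] {
  return [...sessions]
    .sort((a, b) => Date.parse(b.updatedAt) - Date.parse(a.updatedAt))
    .slice(0, max);
}
```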