gemini-visual

Gemini Visual - Front-End & Visual Development Assistant

Overview

A toolkit that uses Google Gemini 3's visual reasoning capabilities for front-end development and design tasks. Gemini 3 provides multimodal understanding with spatial reasoning, document understanding, and high-resolution image processing.

When to Use

  • UI/UX Analysis: Analyze screenshots for layout issues, visual hierarchy, and design patterns
  • Accessibility Audits: Check contrast ratios, text readability, and WCAG compliance
  • Design Comparison: Compare mockups, before/after screenshots, or different design variations
  • Color Palette Extraction: Extract colors from images with HEX, RGB, and HSL values
  • Screenshot to Code: Generate HTML/CSS from design screenshots
  • UI Asset Generation: Create icons, backgrounds, gradients, and UI graphics
  • Responsive Design Review: Analyze multi-device screenshots for consistency
  • Visual Debugging: Identify rendering issues, broken layouts, or visual bugs
  • Design from Brief: Generate designs, code, and components from text descriptions
  • Interactive Design Sessions: Multi-turn conversations for iterative design refinement

Prerequisites

  • Python 3.9+
  • google-genai package
  • GEMINI_API_KEY environment variable

Installation

bash
pip install google-genai

Getting an API Key

  1. Go to Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the key and set it as an environment variable:
bash
export GEMINI_API_KEY="your-api-key"
Add to your shell profile (~/.zshrc or ~/.bashrc) for persistence:
bash
echo 'export GEMINI_API_KEY="your-api-key"' >> ~/.zshrc
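To confirm the key is visible to Python before running any script, a small check like the following can help. This is a minimal sketch, not code taken from the toolkit: the function name `require_api_key` is hypothetical, and the error text simply mirrors the message listed under Troubleshooting.

```python
import os


def require_api_key(env=None):
    """Return the Gemini API key, or raise if it is missing or blank."""
    # `env` defaults to the real process environment; passing a dict
    # makes the function easy to exercise without touching os.environ.
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY environment variable not set")
    return key
```

Calling `require_api_key()` at the top of a wrapper script fails early with a clear message instead of failing later inside an API call.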

Scripts Overview

| Script | Purpose |
|--------|---------|
| `analyze_ui.py` | Analyze UI screenshots for issues, patterns, and suggestions |
| `generate_ui_assets.py` | Generate icons, backgrounds, and UI graphics |
| `compare_designs.py` | Compare two designs and highlight differences |
| `extract_colors.py` | Extract color palettes from images |
| `screenshot_to_code.py` | Convert screenshots to HTML/CSS code |
| `design_from_brief.py` | Generate designs and code from text briefs (no image required) |

Available Models

| Model | Best For | Context |
|-------|----------|---------|
| `gemini-3-pro-preview` | Visual analysis, code generation, complex reasoning | 1M input, 64k output |
| `gemini-3-pro-image-preview` | Asset generation, image editing | 65k input, 32k output |

Media Resolution Options

Control token usage and detail level with --resolution:

| Resolution | Tokens/Image | Best For |
|------------|--------------|----------|
| `low` | 70 | Quick scans, thumbnails |
| `medium` | 560 | Standard screenshots, OCR |
| `high` | 1120 | Detailed UI analysis |
| `ultra_high` | 2240+ | Fine text, complex layouts |
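The per-image figures in the resolution table make it easy to budget tokens before a batch run. The sketch below is only arithmetic over those figures. The names `TOKENS_PER_IMAGE` and `estimate_image_tokens` are hypothetical, and `ultra_high` is treated as a lower bound because the table lists it as "2240+".

```python
# Approximate per-image token costs, copied from the resolution table.
# "ultra_high" is listed as "2240+", so 2240 is used as a lower bound.
TOKENS_PER_IMAGE = {
    "low": 70,
    "medium": 560,
    "high": 1120,
    "ultra_high": 2240,
}


def estimate_image_tokens(resolution, image_count=1):
    """Estimate input tokens spent on images at a given --resolution value."""
    if resolution not in TOKENS_PER_IMAGE:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return TOKENS_PER_IMAGE[resolution] * image_count


print(estimate_image_tokens("high", 2))         # 2240: two screenshots at high
print(estimate_image_tokens("ultra_high", 10))  # 22400: lower bound for ten images
```

The estimate covers image tokens only. Prompt text and model output are billed on top of it.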

Script Reference

1. analyze_ui.py - UI Analysis

Analyze UI screenshots for design issues, accessibility problems, and improvement suggestions.
python scripts/analyze_ui.py [options] IMAGE

Required:
  IMAGE                    Path to UI screenshot to analyze

Options:
  -m, --mode MODE          Analysis mode (default: comprehensive)
                           Modes: comprehensive, accessibility, layout, ux
  -r, --resolution RES     Media resolution (default: high)
  -o, --output FILE        Save analysis to file (JSON or text)
  -f, --format FORMAT      Output format: text, json, markdown (default: text)
  --thinking LEVEL         Thinking level: low, high (default: high)
  -v, --verbose            Show detailed progress
Examples:
bash
# Comprehensive UI analysis
python scripts/analyze_ui.py screenshot.png

# Accessibility-focused analysis
python scripts/analyze_ui.py -m accessibility app_screen.png

# Layout analysis with JSON output
python scripts/analyze_ui.py -m layout -f json -o report.json mockup.png

# Quick UX review
python scripts/analyze_ui.py -m ux --thinking low mobile_app.png

Analysis Modes:
  • comprehensive: Full analysis including layout, colors, typography, accessibility, and UX
  • accessibility: WCAG compliance, contrast ratios, screen reader compatibility, touch targets
  • layout: Visual hierarchy, spacing, alignment, responsive issues
  • ux: User flow, affordances, cognitive load, interaction patterns
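Contrast findings from the accessibility mode can be cross-checked locally, since the WCAG 2.x contrast ratio is a fixed formula. The following is a minimal sketch of that formula. It is not how `analyze_ui.py` derives its results (that is not documented here), and the function names are hypothetical.

```python
def _linearize(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """WCAG AA requires 4.5:1 for normal text and 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold
```

For example, `passes_aa("#767676", "#FFFFFF")` is true while `passes_aa("#777777", "#FFFFFF")` is false, which is why `#767676` is often quoted as the lightest gray that meets AA on white.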

2. generate_ui_assets.py - Asset Generation

Generate UI assets like icons, backgrounds, patterns, and graphics.
python scripts/generate_ui_assets.py [options]

Required:
  -p, --prompt TEXT        Description of asset to generate

Options:
  -t, --type TYPE          Asset type (default: icon)
                           Types: icon, background, pattern, illustration, badge
  -s, --style STYLE        Design style (default: modern)
                           Styles: modern, minimal, flat, gradient, glassmorphism,
                                   neumorphism, material, ios, outlined
  -c, --colors COLORS      Color palette (comma-separated HEX or names)
  -a, --aspect-ratio RATIO Aspect ratio (default: 1:1)
  --size SIZE              Resolution: 1K, 2K, 4K (default: 1K)
  -o, --output FILE        Output file path
  -r, --reference IMAGE    Reference image for style guidance
  -v, --verbose            Show detailed progress
Examples:
bash
# Generate app icon
python scripts/generate_ui_assets.py -p "Weather app icon with sun and clouds" -t icon

# Create gradient background
python scripts/generate_ui_assets.py -p "Soft gradient for login screen" -t background \
    -c "#667eea,#764ba2" -a 9:16 -o login_bg.png

# Generate pattern for UI
python scripts/generate_ui_assets.py -p "Subtle geometric pattern for card backgrounds" \
    -t pattern -s minimal -o pattern.png

# Create illustration from reference
python scripts/generate_ui_assets.py -p "Onboarding illustration, person using phone" \
    -t illustration -r brand_style.png -o onboarding.png

# Generate badge/label
python scripts/generate_ui_assets.py -p "Premium badge with star" -t badge \
    -c "gold,white" -s gradient

3. compare_designs.py - Design Comparison

Compare two design screenshots and analyze differences.
python scripts/compare_designs.py [options] IMAGE1 IMAGE2

Required:
  IMAGE1                   First design image (before/version A)
  IMAGE2                   Second design image (after/version B)

Options:
  -m, --mode MODE          Comparison mode (default: full)
                           Modes: full, visual, content, accessibility
  -f, --format FORMAT      Output format: text, json, markdown (default: text)
  -o, --output FILE        Save comparison to file
  -r, --resolution RES     Media resolution (default: high)
  -v, --verbose            Show detailed progress
Examples:
bash
# Full design comparison
python scripts/compare_designs.py before.png after.png

# Visual-only comparison
python scripts/compare_designs.py -m visual old_design.png new_design.png

# Compare for accessibility changes
python scripts/compare_designs.py -m accessibility v1.png v2.png -o report.md

# A/B test comparison as JSON
python scripts/compare_designs.py -m full variant_a.png variant_b.png -f json

Comparison Modes:
  • full: Comprehensive comparison of all visual and functional aspects
  • visual: Focus on colors, typography, spacing, visual hierarchy
  • content: Text changes, layout shifts, information architecture
  • accessibility: Contrast, readability, touch targets, WCAG compliance changes

4. extract_colors.py - Color Palette Extraction

Extract color palettes from images with multiple output formats.
python scripts/extract_colors.py [options] IMAGE

Required:
  IMAGE                    Image to extract colors from

Options:
  -n, --count COUNT        Number of colors to extract (default: 6)
  -f, --format FORMAT      Output format: text, json, css, tailwind, scss (default: text)
  -o, --output FILE        Save palette to file
  --named                  Include closest CSS color names
  --contrast               Calculate contrast ratios between colors
  -v, --verbose            Show detailed progress
Examples:
bash
# Extract 6 main colors
python scripts/extract_colors.py screenshot.png

# Extract palette as CSS variables
python scripts/extract_colors.py -f css -o colors.css brand_image.png

# Get Tailwind config
python scripts/extract_colors.py -f tailwind -o tailwind.config.js design.png

# Detailed palette with contrast info
python scripts/extract_colors.py -n 8 --named --contrast hero_image.jpg

# SCSS variables output
python scripts/extract_colors.py -f scss -o _colors.scss mockup.png
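If you need to post-process an extracted palette yourself, the HEX, RGB, and HSL forms are related by standard conversions available in Python's standard library. The sketch below shows those conversions plus one possible CSS-variable layout. The `--color-<name>` naming is an assumption for illustration, since the exact output of `-f css` is not specified here, and all function names are hypothetical.

```python
import colorsys


def hex_to_rgb(hex_color):
    """Parse '#RGB' or '#RRGGBB' into an (r, g, b) tuple of 0-255 integers."""
    h = hex_color.lstrip("#")
    if len(h) == 3:
        h = "".join(ch * 2 for ch in h)  # expand shorthand, e.g. 'fa0' -> 'ffaa00'
    if len(h) != 6:
        raise ValueError(f"not a HEX color: {hex_color!r}")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hsl(r, g, b):
    """Return (hue in degrees, saturation %, lightness %) as rounded integers."""
    # colorsys works on 0-1 floats and returns values in H, L, S order.
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(l * 100)


def to_css_variables(palette):
    """Render a {name: hex} mapping as a :root block of CSS custom properties."""
    lines = [":root {"]
    for name, hex_color in palette.items():
        lines.append(f"  --color-{name}: {hex_color.lower()};")
    lines.append("}")
    return "\n".join(lines)


print(hex_to_rgb("#667eea"))                      # (102, 126, 234)
print(rgb_to_hsl(255, 0, 0))                      # (0, 100, 50)
print(to_css_variables({"primary": "#667EEA"}))
```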

5. screenshot_to_code.py - Screenshot to Code

Convert UI screenshots to HTML/CSS code.
python scripts/screenshot_to_code.py [options] IMAGE

Required:
  IMAGE                    UI screenshot to convert

Options:
  -f, --framework FRAME    CSS framework (default: tailwind)
                           Frameworks: tailwind, css, bootstrap, vanilla
  -c, --components         Extract as reusable components
  --responsive             Generate responsive code
  -o, --output DIR         Output directory for files
  -r, --resolution RES     Media resolution (default: ultra_high)
  --thinking LEVEL         Thinking level: low, high (default: high)
  -v, --verbose            Show detailed progress
Examples:
bash
# Convert to Tailwind HTML
python scripts/screenshot_to_code.py landing_page.png

# Generate vanilla CSS
python scripts/screenshot_to_code.py -f vanilla -o ./output mockup.png

# Create responsive Bootstrap components
python scripts/screenshot_to_code.py -f bootstrap -c --responsive card.png

# Full page conversion with components
python scripts/screenshot_to_code.py -f tailwind -c --responsive -o ./components page.png

6. design_from_brief.py - Text-Based Design Assistant

Generate frontend designs, code, and components from text descriptions without needing visual input.
python scripts/design_from_brief.py [options]

Input (one required):
  -p, --prompt TEXT      Design brief or prompt text
  -b, --brief-file FILE  Read brief from a file
  --interactive          Start interactive design session

Options:
  -m, --mode MODE        Generation mode (default: code)
                         Modes: design, code, component, review, brainstorm
  -fw, --framework FW    Framework for code generation (default: tailwind)
                         Frameworks: tailwind, css, bootstrap, react, vue, svelte, vanilla
  -c, --context TEXT     Additional context (existing code, constraints)
  -f, --format FORMAT    Output format: text, json, markdown (default: text)
  -o, --output FILE      Save output to file
  -v, --verbose          Show detailed progress
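The "one required" input rule can be expressed with an `argparse` mutually exclusive group. The sketch below mirrors the documented flags and defaults, but it is an illustration rather than the script's actual source. In particular, treating the three inputs as mutually exclusive (and not merely "at least one") is an assumption, and `build_parser` is a hypothetical name.

```python
import argparse

MODES = ["design", "code", "component", "review", "brainstorm"]
FRAMEWORKS = ["tailwind", "css", "bootstrap", "react", "vue", "svelte", "vanilla"]
FORMATS = ["text", "json", "markdown"]


def build_parser():
    """Build a parser that mirrors the documented design_from_brief.py options."""
    parser = argparse.ArgumentParser(prog="design_from_brief.py")

    # Exactly one input source must be given (assumed to be exclusive).
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-p", "--prompt", metavar="TEXT")
    source.add_argument("-b", "--brief-file", metavar="FILE")
    source.add_argument("--interactive", action="store_true")

    parser.add_argument("-m", "--mode", choices=MODES, default="code")
    parser.add_argument("-fw", "--framework", choices=FRAMEWORKS, default="tailwind")
    parser.add_argument("-c", "--context", metavar="TEXT")
    parser.add_argument("-f", "--format", choices=FORMATS, default="text")
    parser.add_argument("-o", "--output", metavar="FILE")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


args = build_parser().parse_args(["-p", "Pricing table", "-m", "component", "-fw", "react"])
print(args.mode, args.framework, args.format)
```

Running the parser with no input flag exits with a usage error, which matches the "one required" wording.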
Examples:
bash
# Generate code from a brief
python scripts/design_from_brief.py -p "Create a pricing table with 3 tiers" -m code -fw tailwind

# Get design advice and guidance
python scripts/design_from_brief.py -p "Design a modern SaaS landing page" -m design

# Generate a React component
python scripts/design_from_brief.py -p "A toggle switch with smooth animation" -m component -fw react

# Review a design idea
python scripts/design_from_brief.py -p "Is a hamburger menu good for desktop navigation?" -m review

# Brainstorm creative ideas
python scripts/design_from_brief.py -p "Ideas for a fitness app dashboard" -m brainstorm

# Read brief from file
python scripts/design_from_brief.py -b project_brief.txt -m code -fw vue

# Interactive multi-turn session
python scripts/design_from_brief.py --interactive -m code -fw tailwind

Generation Modes:
  • design: Get design guidance including colors, typography, layout, and UX recommendations
  • code: Generate complete, production-ready frontend code
  • component: Design and code reusable UI components with props and variants
  • review: Get feedback and recommendations on design ideas or code
  • brainstorm: Generate creative ideas and multiple design directions

Supported Frameworks:

| Framework | Description |
|-----------|-------------|
| `tailwind` | Tailwind CSS utility classes |
| `css` | Custom CSS with variables and BEM |
| `bootstrap` | Bootstrap 5 components |
| `react` | React functional components with TypeScript |
| `vue` | Vue 3 Composition API with TypeScript |
| `svelte` | Svelte components |
| `vanilla` | Plain HTML/CSS/JavaScript |

Interactive Session Commands:

When using --interactive, you can use these commands:

| Command | Description |
|---------|-------------|
| `/mode <mode>` | Change generation mode |
| `/framework <fw>` | Change framework |
| `/save <file>` | Save last response to file |
| `/clear` | Clear conversation history |
| `/quit` | Exit session |
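The slash commands of the interactive session amount to a small dispatcher over session state. The sketch below shows one way such a dispatcher could look. It is not the script's implementation: the `Session` fields, the status messages, and `handle_command` are all hypothetical.

```python
from dataclasses import dataclass, field

MODES = {"design", "code", "component", "review", "brainstorm"}
FRAMEWORKS = {"tailwind", "css", "bootstrap", "react", "vue", "svelte", "vanilla"}


@dataclass
class Session:
    mode: str = "code"
    framework: str = "tailwind"
    history: list = field(default_factory=list)
    last_response: str = ""
    running: bool = True


def handle_command(session, line):
    """Apply one slash command and return a status message.

    Returns None when the line is not a slash command, so the caller
    can treat it as a normal prompt for the model.
    """
    if not line.startswith("/"):
        return None
    name, _, arg = line[1:].partition(" ")
    arg = arg.strip()
    if name == "mode":
        if arg not in MODES:
            return f"Unknown mode: {arg}"
        session.mode = arg
        return f"Mode set to {arg}"
    if name == "framework":
        if arg not in FRAMEWORKS:
            return f"Unknown framework: {arg}"
        session.framework = arg
        return f"Framework set to {arg}"
    if name == "save":
        if not arg:
            return "Usage: /save <file>"
        with open(arg, "w", encoding="utf-8") as fh:
            fh.write(session.last_response)
        return f"Saved to {arg}"
    if name == "clear":
        session.history.clear()
        return "History cleared"
    if name == "quit":
        session.running = False
        return "Goodbye"
    return f"Unknown command: /{name}"
```

Returning `None` for ordinary text keeps the command handling separate from the request loop, so prompts such as "Add a sticky header" pass straight through.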

Use Cases

Front-End Development Workflow

bash
# 1. Analyze a design mockup
python scripts/analyze_ui.py mockup.png -f markdown -o analysis.md

# 2. Extract brand colors
python scripts/extract_colors.py mockup.png -f tailwind -o colors.js

# 3. Generate code from screenshot
python scripts/screenshot_to_code.py mockup.png -f tailwind -c -o ./src

# 4. Generate placeholder icons
python scripts/generate_ui_assets.py -p "Settings gear icon" -t icon -s outlined

Design Review Process

bash
# Compare design iterations
python scripts/compare_designs.py v1.png v2.png -f markdown -o review.md

# Check accessibility
python scripts/analyze_ui.py final_design.png -m accessibility -o a11y_report.json

Asset Production Pipeline

bash
# Generate icon set
for icon in "home" "search" "profile" "settings"; do
  python scripts/generate_ui_assets.py -p "${icon} icon" -t icon -s outlined -o "icons/${icon}.png"
done

# Generate backgrounds for different screens
python scripts/generate_ui_assets.py -p "Auth screen gradient" -t background -a 9:16 -o bg_auth.png
python scripts/generate_ui_assets.py -p "Dashboard header" -t background -a 21:9 -o bg_header.png
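For larger icon sets, the same batch can be driven from Python with a pause between requests, which helps with rate limits. The following is a sketch under assumptions: the five-second default delay is arbitrary, `build_icon_commands` and `run_batch` are hypothetical helpers, and only the `-p`, `-t`, `-s`, and `-o` flags documented for `generate_ui_assets.py` are used.

```python
import subprocess
import sys
import time


def build_icon_commands(names, style="outlined", out_dir="icons"):
    """Build one generate_ui_assets.py command line per icon name."""
    return [
        [sys.executable, "scripts/generate_ui_assets.py",
         "-p", f"{name} icon", "-t", "icon", "-s", style,
         "-o", f"{out_dir}/{name}.png"]
        for name in names
    ]


def run_batch(commands, delay_seconds=5.0, run=subprocess.run, sleep=time.sleep):
    """Run commands one at a time, pausing between them.

    Returns the commands that exited with a non-zero status so they
    can be retried later. `run` and `sleep` are injectable for testing.
    """
    failed = []
    for index, command in enumerate(commands):
        if index > 0:
            sleep(delay_seconds)  # no pause before the first request
        result = run(command)
        if result.returncode != 0:
            failed.append(command)
    return failed
```

A typical call would be `run_batch(build_icon_commands(["home", "search", "profile", "settings"]))`, run from the directory that contains `scripts/`.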

Design from Text Brief

bash
# Start with design exploration
python scripts/design_from_brief.py -p "E-commerce product page for sneakers" -m design -o design_spec.md

# Generate the code
python scripts/design_from_brief.py -p "E-commerce product page with image gallery, size selector, add to cart button, and reviews section" -m code -fw react -o product_page.tsx

# Create reusable components
python scripts/design_from_brief.py -p "Star rating component, 1-5 stars, supports half stars, shows count" -m component -fw react

# Interactive refinement session
python scripts/design_from_brief.py --interactive -m code -fw tailwind
# Then iterate: "Add a sticky header", "Make the CTA more prominent", etc.

Full Project Workflow (No Images Needed)

bash
# 1. Brainstorm ideas
python scripts/design_from_brief.py -p "Modern dashboard for analytics SaaS" -m brainstorm

# 2. Get detailed design spec
python scripts/design_from_brief.py -p "Analytics dashboard with sidebar nav, KPI cards, charts, and data tables" -m design -f markdown -o design.md

# 3. Generate components
python scripts/design_from_brief.py -p "KPI card showing metric, trend, and sparkline" \
    -m component -fw react -o components/KPICard.tsx
python scripts/design_from_brief.py -p "Data table with sorting, filtering, pagination" \
    -m component -fw react -o components/DataTable.tsx

# 4. Generate full page layout
python scripts/design_from_brief.py -p "Dashboard layout combining sidebar, header, and main content area with the KPI cards and data table" -m code -fw react

API Capabilities Summary

Gemini 3 Visual Features

  • Spatial Reasoning: Understands element positioning, alignment, and relationships
  • Document Understanding: OCR, text extraction, layout parsing
  • High Resolution: Up to ultra-high resolution for fine details and small text
  • Multi-Image Input: Compare up to 3,600 images per request
  • Thought Signatures: Maintains reasoning context for multi-turn editing

Image Support

  • Input Formats: PNG, JPEG, WebP, HEIC, HEIF
  • Max File Size: 20MB inline, larger via File API
  • Output Formats: PNG (generated images)
  • Resolutions: 1K, 2K, 4K (generation), configurable input resolution
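The format list and the 20MB inline limit can be checked before sending an image. The sketch below decides between inline upload and the File API based on those two facts only. It assumes "20MB" means 20 MiB, and `choose_upload_method` with its return values is a hypothetical helper, not part of the toolkit or the SDK.

```python
from pathlib import Path

# Input formats and inline size limit as listed under "Image Support".
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".heic", ".heif"}
INLINE_LIMIT_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB


def choose_upload_method(filename, size_bytes):
    """Return "inline" or "file_api" for an image, or raise if unsupported."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported image format: {suffix or '(none)'}")
    return "inline" if size_bytes <= INLINE_LIMIT_BYTES else "file_api"


print(choose_upload_method("mockup.png", 350_000))           # inline
print(choose_upload_method("hero.jpg", 25 * 1024 * 1024))    # file_api
```

In practice `size_bytes` would come from `Path(filename).stat().st_size`.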

Troubleshooting

"GEMINI_API_KEY environment variable not set"

Set your API key:
bash
export GEMINI_API_KEY="your-api-key"

"Rate limit exceeded"

Wait a few minutes and retry. For batch operations, add delays between requests.
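Retrying with exponential backoff is a common way to automate the "wait and retry" advice. The sketch below is generic: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on a rate-limit response, and the delays (2s, 4s, 8s, each plus up to one second of jitter) are illustrative defaults rather than values recommended by the API.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error raised on a rate-limit response."""


def call_with_backoff(func, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

Wrap the single API call, not a whole batch, so that a retry repeats only the request that was rejected.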

Low quality analysis

Use higher resolution:
bash
python scripts/analyze_ui.py -r ultra_high screenshot.png

Missing fine details

For detailed UI analysis, use ultra_high resolution:
bash
python scripts/analyze_ui.py -r ultra_high -m comprehensive app.png

Code generation issues

Use high thinking level and ultra_high resolution:
bash
python scripts/screenshot_to_code.py --thinking high -r ultra_high mockup.png

Asset generation blocked by content moderation

If you see "No image content returned. The request may have been blocked by content moderation", try:
  • Simplifying or rephrasing your prompt
  • Removing specific brand names or copyrighted references
  • Using more generic descriptions
This is a safety filter from the Gemini API and not all prompts will be accepted.

Resolution affects token usage

The --resolution parameter controls how many tokens are used for image processing:
  • low: ~70 tokens/image - Quick scans, thumbnails
  • medium: ~560 tokens/image - Standard screenshots, OCR
  • high: ~1120 tokens/image - Detailed UI analysis (default for analyze_ui.py and compare_designs.py)
  • ultra_high: ~2240+ tokens/image - Fine text, complex layouts (default for screenshot_to_code.py)
Higher resolution provides better accuracy but uses more tokens.

Best Practices

  1. Use appropriate resolution: Higher resolution = more tokens but better accuracy
  2. Match model to task: Use gemini-3-pro-preview for analysis, gemini-3-pro-image-preview for generation
  3. Provide context: Add relevant details to prompts for better results
  4. Iterate on assets: Use multi-turn conversations for refinement
  5. Combine scripts: Use extraction + generation for consistent design systems
  6. Save outputs: Use the -o flag to save reports for documentation
  7. Batch similar tasks: Process multiple similar items together for efficiency
