seedream-image

Original：🇨🇳 Chinese

Generates image prompts for Seedream 5.0/4.0 (Jimeng AI), and can call the API to generate images and automatically download them to the output/ directory. Workflow: describe your idea → the agent outputs a prompt for review → user confirms → the agent runs generate.py. It covers text-to-image, image editing, multi-image fusion, character consistency, knowledge cards, posters, PPT backgrounds, e-commerce images, avatars, and group/storyboard generation. Activate this tool when the user mentions terms like seedream, jimeng, AI image generation, text-to-image, image-to-image, seedream prompt, prompt keyword, one-click image generation, knowledge card, poster design, e-commerce image, character consistency, or image generation.

23 installs

Source: ppdbxdawj/seedream-image-skill

Added on: 2026-03-15

NPX Install

npx skill4agent add ppdbxdawj/seedream-image-skill seedream-image

SKILL.md Content (Chinese)

Seedream Image Assistant | Seedream Jimeng Image Assistant

Seedream 5.0 is ByteDance's next-generation AI image generation model, available on Jimeng AI, Jianying, CapCut, and Volcengine Ark.

Core Capabilities

Capability	Description
Real-time Web Search	Automatically fetches trending information when the prompt contains time-sensitive keywords
Multi-step Reasoning	Interprets abstract concepts (e.g., "serene tech feel" → desaturated colors + clean lines + cold lighting)
Multi-round Editing	Iterative refinement: local edits, style transfer, element addition/removal, text rendering
High Resolution	Native 2K resolution, AI-enhanced 4K, generation time of 2-5 seconds
Character Consistency	Maintains facial features, clothing, and pose across multiple images (ready for storyboard use)
Text Rendering	99%+ accuracy for Chinese/English text; use quotation marks for optimal results

Prompt Structure

Basic Structure (Text-to-Image)

[Subject Description] + [Action/Behavior] + [Environment/Background] + [Material/Texture] + [Lighting Effect] + [Composition Requirements] + [Style Keywords]

Describe the subject, action, and environment in natural language
Use short phrases for style, color, lighting, and composition
Enclose text content in quotation marks, e.g.:
```
"Hello World"
```
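The basic structure above can be sketched as a small helper that joins the components in the stated order. This is only an illustration: `build_prompt` and its parameter names are hypothetical and are not part of `generate.py`.

```python
def build_prompt(subject, action="", environment="", material="",
                 lighting="", composition="", style=(), text=None):
    """Assemble a text-to-image prompt in the order of the basic structure."""
    # Subject, action, and environment read as one natural-language clause.
    sentence = " ".join(part for part in (subject, action, environment) if part)
    # Material, lighting, composition, and style are short comma-separated phrases.
    phrases = [p for p in (material, lighting, composition) if p]
    phrases.extend(style)
    parts = [sentence, *phrases]
    if text:
        # Text that should appear in the image is enclosed in quotation marks.
        parts.append(f'with the text "{text}"')
    return ", ".join(parts)


print(build_prompt("A cat", "playing", "in the garden",
                   style=["watercolor hand-drawn"]))
```

Empty components are skipped, so the same helper works for short and long prompts.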

Four-stage Structure (Advanced)

Subject → Environment → Material/Texture → Lighting

Image Editing Prompt Formula

Change Action + Target Object + Change Features
Example: "Change the knight's helmet to gold"

Style Vocabulary Library

Realistic Photography

realistic movie still

commercial photography

documentary photography

hyper-realistic

RAW film texture

Lenses:

85mm prime lens

35mm wide-angle lens

telephoto compression

fisheye lens

Lighting:

Rembrandt lighting

ring light

split lighting

golden hour warm light

blue hour cold light

neon lighting

Anime/Illustration

Japanese Anime:

Studio Ghibli style

Makoto Shinkai style

Japanese shoujo manga

cel-shaded texture

Western Style:

American comic style

DC comic style

Western realistic characters

Pop Art

Chinese Style:

Chinese trendy illustration

ink wash painting style

Chinese meticulous painting

cyber Chinese style

Others:

pixel art

low-poly

flat illustration

thick oil painting

watercolor hand-drawn

Design/Commercial

minimalism

Bauhaus style

frosted glass texture

high-quality metal

cyberpunk

movie poster level

brand VI visual

infographic

knowledge card

Lighting Modifiers

dramatic side lighting

soft diffused light

high contrast

low saturation

Morandi color palette

cyber neon

warm orange tone

cool blue tone

film grain

Common Prompt Templates

Realistic Characters

[Gender, Age, Appearance], [Clothing Description], [Facial Expression], [Environment Background], 85mm prime lens, natural light, realistic movie still style, ultra-high definition, rich details

Landscape/Scene

[Scene Description], [Time/Weather], [Lighting Description], [Composition], [Style Keywords], cinematic composition, 8K ultra-clear

Knowledge Card (Complete Template)

Generate an image in the [format/carrier] style to explain/display "[core concept]" to [target audience].
The image should have [style feature A], [style feature B], and [layout requirement C], with an overall feel similar to [familiar reference].
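
One way to reuse the knowledge card template is to store it as a format string and fill in the bracketed slots. The sketch below is an assumption about how that could look; the field names (`carrier`, `concept`, and so on) are made up for illustration, and the "explain/display" choice is fixed to "explain".

```python
KNOWLEDGE_CARD = (
    'Generate an image in the {carrier} style to explain "{concept}" to {audience}. '
    "The image should have {feature_a}, {feature_b}, and {layout}, "
    "with an overall feel similar to {reference}."
)
FIELDS = ("carrier", "concept", "audience", "feature_a",
          "feature_b", "layout", "reference")


def knowledge_card_prompt(**fields):
    """Fill the knowledge card template, refusing to leave a slot empty."""
    missing = [name for name in FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return KNOWLEDGE_CARD.format(**fields)


print(knowledge_card_prompt(
    carrier="infographic",
    concept="photosynthesis",
    audience="middle school students",
    feature_a="flat illustration",
    feature_b="low saturation",
    layout="clear layout",
    reference="a science magazine spread",
))
```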

Brand/Poster (Negative Space Template)

[Visual Subject Description], [Material Description], [Lighting Effect],
All visual subjects are concentrated on the [left/right] side of the frame, leaving a large clean background area on the [right/left] side for later text layout.
Background: [Background Description]

Continuous Storyboards (Character Consistency)

Refer to the facial features and hairstyle in [Image 1], change the outfit to [scene style],
Generate N consecutive storyboard images of [scene description], [style], set in the same scene with continuous actions.

E-commerce Products

Create a [platform] style display image for this [product], similar to the style of [brand reference],
Clean background, highlight product texture, professional commercial photography

Quick Scene Reference

Scene	Prompt Keywords	Notes
Avatar	`avatar icon` `square composition` `solid color background`	Specifying a style reference image yields better results
Knowledge Card	`infographic` `knowledge graph` `clear layout`	Explain the target audience and core concept
PPT Background	`negative space composition` `biased to [left/right] side` `matte background`	Emphasize negative space on one side for layout
Character Cosplay	`keep facial features unchanged` `realistic texture clothing` `same pose`	Upload original image + target character image
Journal/Planner	`handwritten font` `paper texture` `collage style` `beige background`	Include date and weather to enhance atmosphere
Glass Icon	`frosted glass texture` `gradient color` `C4D` `OC rendering`	Pure white background + simple composition
Poster Design	`movie poster level` `dramatic lighting` `large negative space`	Clarify text content and position
Amulet/Chinese Trend	`Classic of Mountains and Seas` `Chinese trendy ticket` `ink wash` `seal carving`	Add "wish" text to enhance emotional appeal
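
The table above can be kept as a lookup so that scene keywords are appended to a draft prompt consistently. A minimal sketch, covering only three rows; `SCENE_KEYWORDS` and `with_scene_keywords` are hypothetical names, not something the skill ships.

```python
# Keywords copied from the quick scene reference table (three rows only).
SCENE_KEYWORDS = {
    "avatar": ["avatar icon", "square composition", "solid color background"],
    "knowledge card": ["infographic", "knowledge graph", "clear layout"],
    "poster design": ["movie poster level", "dramatic lighting",
                      "large negative space"],
}


def with_scene_keywords(prompt, scene):
    """Append the keywords for a known scene; unknown scenes leave the prompt as is."""
    keywords = SCENE_KEYWORDS.get(scene.lower(), [])
    return ", ".join([prompt, *keywords])


print(with_scene_keywords("A smiling fox", "Avatar"))
```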

Advanced Techniques

1. Web Search Trigger

The system automatically searches the web when the prompt contains time-sensitive terms:

2026 popular colors

latest XX model

this year's XX trend

Milan Winter Olympics

2. Image Editing

Designated Area: "Replace the [area] in the image with..."
Style Transfer: "Keep the content unchanged, change to [style]"
Element Control: "Add/remove [element] from the frame"
Lighting Adjustment: "Change the frame's lighting to [lighting type]"
Filter Addition: "Add [filter name] filter to the frame"
Makeup Modification: "Add [makeup description] to the character"
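
The editing phrasings listed above lend themselves to a small template table. The sketch below is illustrative only: the keys and `edit_prompt` are hypothetical, and the combined "Add/remove" phrasing is split into two entries for readability.

```python
EDIT_TEMPLATES = {
    "area": "Replace the {area} in the image with {target}",
    "style": "Keep the content unchanged, change to {style}",
    "add": "Add {element} to the frame",
    "remove": "Remove {element} from the frame",
    "lighting": "Change the frame's lighting to {lighting}",
    "filter": "Add {filter_name} filter to the frame",
    "makeup": "Add {makeup} to the character",
}


def edit_prompt(kind, **fields):
    """Return the editing instruction for one kind of change."""
    return EDIT_TEMPLATES[kind].format(**fields)


print(edit_prompt("lighting", lighting="golden hour warm light"))
print(edit_prompt("area", area="sky", target="a starry night"))
```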

3. Text Rendering

Enclose the text to be generated in quotation marks:

"Boundless Creativity" written in the center of the image

4. Composition Control

Golden Ratio:

rule of thirds

golden spiral

Perspective:

bird's-eye view

low-angle shot

frontal eye-level shot

45-degree oblique angle

Negative Space:

large negative space

clean background

subject biased to [direction]

5. Multi-Image Fusion

Supports up to 14 reference images. When fusing, specify which element to reference from each image:

Reference the style of Image 1, the color tone of Image 2, and the character's pose of Image 3
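
A fusion instruction of this shape can be generated from an ordered list of what to borrow from each image. This is a sketch under the 14-image limit stated above; `fusion_prompt` is a hypothetical helper.

```python
MAX_REFERENCE_IMAGES = 14  # limit stated in the text above


def fusion_prompt(roles):
    """roles[i] names the element to borrow from Image i+1."""
    if not roles:
        raise ValueError("at least one reference image is needed")
    if len(roles) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference images")
    clauses = [f"the {role} of Image {i}" for i, role in enumerate(roles, start=1)]
    if len(clauses) <= 2:
        joined = " and ".join(clauses)
    else:
        joined = ", ".join(clauses[:-1]) + ", and " + clauses[-1]
    return "Reference " + joined


print(fusion_prompt(["style", "color tone", "character's pose"]))
```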

6. Batch Image Generation

Trigger words:

a series of

batch images

generate N consecutive

storyboard images
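
When calling the script, it can help to check on the client side whether a prompt already contains one of these trigger phrases. The following is a rough heuristic written for illustration; it is not how the model itself detects batch requests, and `looks_like_batch` is a hypothetical name.

```python
import re

# Patterns derived from the trigger words above; "N" is read as any number.
BATCH_PATTERNS = [
    r"\ba series of\b",
    r"\bbatch images\b",
    r"\bgenerate \d+ consecutive\b",
    r"\bstoryboard images\b",
]


def looks_like_batch(prompt):
    """Return True if the prompt contains a batch trigger phrase."""
    text = prompt.lower()
    return any(re.search(pattern, text) for pattern in BATCH_PATTERNS)


print(looks_like_batch("Generate 4 consecutive blind box images about spring"))
print(looks_like_batch("A cat playing in the garden, watercolor style"))
```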

Negative Prompt Writing

Clearly state unwanted elements at the end of the prompt:

Clean background, no cluttered elements

Keep facial features unchanged, do not alter facial characteristics

No text watermarks

No overexposure
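
Since the exclusions go at the end of the prompt, appending them can be automated. A minimal sketch; `with_exclusions` is a hypothetical helper and the sentence-style punctuation is a formatting assumption.

```python
def with_exclusions(prompt, exclusions):
    """Append 'unwanted element' clauses to the end of a prompt."""
    clauses = [e.strip().rstrip(".") for e in exclusions if e.strip()]
    if not clauses:
        return prompt
    # Close the main prompt, then list each exclusion as its own sentence.
    return prompt.rstrip(" .,") + ". " + ". ".join(clauses) + "."


print(with_exclusions("A product close-up, commercial photography",
                      ["No text watermarks", "No overexposure"]))
```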

Platform Entries

Platform	URL	Description
Jimeng AI	https://jimeng.jianying.com/	Main site, approximately 20 free 2K generations per day
Volcengine Ark	https://console.volcengine.com/ark	Enterprise API, supports 4K generation
Jianying	App Store	AI Painting → Seedream 5.0
CapCut (Overseas)	App Store	AI Image

Image Generation Script

`generate.py` calls the Jimeng 4.0 API, and images are automatically downloaded to `--output-dir` (default: `output/`).

Environment Preparation

Create a `.env` file in the same directory as `generate.py` and set `VOLC_ACCESSKEY` and `VOLC_SECRETKEY` in it, or export both variables in the terminal. The script automatically reads the `.env` file in the same directory. Then install the dependencies with `pip install -r requirements.txt`.
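
Before running the script, a pre-flight check can confirm that both credentials are available. The sketch below assumes the `.env` file uses plain `KEY=VALUE` lines; how `generate.py` itself parses the file is not shown here, and `parse_env_text` and `missing_credentials` are hypothetical helpers.

```python
import os
from pathlib import Path

REQUIRED_KEYS = ("VOLC_ACCESSKEY", "VOLC_SECRETKEY")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_credentials(env_path=".env", environ=None):
    """Return the required keys found neither in the environment nor in .env."""
    environ = os.environ if environ is None else environ
    path = Path(env_path)
    file_values = (parse_env_text(path.read_text(encoding="utf-8"))
                   if path.is_file() else {})
    return [key for key in REQUIRED_KEYS
            if not (environ.get(key) or file_values.get(key))]


print(missing_credentials())
```

An empty list means both keys were found.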

Usage

```bash
# Text-to-image
python generate.py --prompt "A cat playing in the garden, watercolor style"

# Image editing (input reference image)
python generate.py --prompt "Change the background to a beach" --image-urls "https://example.com/photo.jpg"

# Specify resolution + force single image
python generate.py --prompt "E-commerce main image, product close-up" --width 2560 --height 1440 --force-single

# Batch image generation
python generate.py --prompt "Generate 4 consecutive blind box images about spring, summer, autumn, and winter"
```
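
The same invocations can be built programmatically, which avoids shell-quoting problems with long prompts. This sketch uses only flags documented for `generate.py`; `build_command` is a hypothetical wrapper, and it passes a single reference URL because the separator for multiple URLs is not specified here.

```python
import sys


def build_command(prompt, image_url=None, width=None, height=None,
                  batch=False, output_dir=None):
    """Return the argument list for one generate.py run."""
    cmd = [sys.executable, "generate.py", "--prompt", prompt]
    if image_url:
        cmd += ["--image-urls", image_url]
    if (width is None) != (height is None):
        raise ValueError("--width and --height must be passed together")
    if width is not None:
        cmd += ["--width", str(width), "--height", str(height)]
    if batch:
        cmd.append("--no-force-single")
    if output_dir:
        cmd += ["--output-dir", output_dir]
    return cmd


cmd = build_command("E-commerce main image, product close-up",
                    width=2560, height=1440)
print(cmd)
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` without `shell=True` keeps the prompt as one argument regardless of the quotes or commas it contains.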

Usage in Skill Workflow

Generate a prompt following this Skill's rules, and wait for user confirmation.
Before running, remind the user that the script outputs 1 image by default. For multiple images (batch), add `--no-force-single` or keep terms such as "batch images" or "a series of" in the prompt.

Execute `python generate.py --prompt "<confirmed_prompt>"` (add `--no-force-single` for batch generation).

After the script completes polling, images are stored in `output/`. Display the storage path and URL.
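
To report the storage paths after a run, the output directory can simply be listed. A minimal sketch, assuming the downloaded files carry common image extensions (the script's actual file naming is not documented here); `saved_images` is a hypothetical helper.

```python
from pathlib import Path

# Assumed extensions; adjust to whatever generate.py actually writes.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def saved_images(output_dir="output"):
    """Return image files in the output directory, newest first."""
    folder = Path(output_dir)
    if not folder.is_dir():
        return []
    files = [p for p in folder.iterdir()
             if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


for path in saved_images():
    print(path.resolve())
```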

Parameter Description

Parameter	Description
`--prompt`	Required, the generation prompt
`--image-urls`	Input reference image URLs (up to 10 images)
`--width` / `--height`	Output width and height (must be passed together); if omitted, the output size is chosen automatically
`--size`	Output area in pixels; defaults to 2K (2048×2048)
`--scale`	Degree of text influence (0 to 1, default 0.5); higher values mean stronger text influence
`--force-single`	Output only 1 image (default)
`--no-force-single`	Allow multiple images (batch), the number is determined by the model based on the prompt
`--watermark`	Add AI watermark
`--output-dir`	Directory for saving generated images (default: output/); URLs and base64 data will be written here
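
The constraints in the table can be checked before a run. This sketch only encodes the limits stated above (at most 10 reference URLs, width and height together, scale between 0 and 1); `check_options` is a hypothetical helper, and the script may perform its own validation.

```python
def check_options(image_urls=(), width=None, height=None, scale=0.5):
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    if len(image_urls) > 10:
        problems.append("--image-urls accepts at most 10 images")
    if (width is None) != (height is None):
        problems.append("--width and --height must be passed together")
    if not 0 <= scale <= 1:
        problems.append("--scale must be between 0 and 1")
    return problems


print(check_options(width=2560))
print(check_options(width=2560, height=1440, scale=0.7))
```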

References

Detailed examples & use cases → examples.md
Official docs, API params, size chart, full style dictionary → reference.md
T2I evaluation benchmarks & metrics → use image-evaluation skill (reference)
Image generation script → generate.py
Dependencies → requirements.txt