EmojiGen Nano Banana

Use this skill to reproduce the EmojiGen Pro workflow as a reusable agent workflow instead of a browser app.

Read this skill end to end before you start work. Do not jump straight to writing a config, building a prompt, or calling a model until you have read the SOP and decided how you will satisfy every step.

What to collect before doing work

Do not start generation until you have either explicit answers or safe defaults for:

Reference image path.
Output mode:
```
animated
```
or
```
static
```
.
Emotion list, or a category prompt that can be expanded into emotions.
Style target, such as
```
皮克斯 3D
```
,
```
吉卜力
```
,
```
Q版 LINE
```
.
Optional custom text and color.
Output directory.
Backend choice:
- Gemini Developer API via
```
GEMINI_API_KEY
```
  ,
```
GOOGLE_API_KEY
```
  , or
```
API_KEY
```
- Vertex AI via
```
GOOGLE_GENAI_USE_VERTEXAI=true
```
  plus
```
GOOGLE_CLOUD_PROJECT
```
  and
```
GOOGLE_CLOUD_LOCATION
```
- Another image tool chosen by the agent when Gemini access is unavailable

Before generation, inspect the current reference image and rewrite

characterNotes

for this exact subject. Never reuse stale

characterNotes

propNotes

, or style notes from a previous run on a different person.

If the user is adapting the original EmojiGen Pro repository, first reconstruct the workflow from the codebase before you rewrite anything. Preserve the original sequence:

Collect or generate emotion labels.
Assemble one long prompt for a strict 4x6 sticker sheet.
Generate the sheet image from the reference image.
Slice the sheet into frames or stickers.
Encode GIFs for animated mode.

Default decisions

Only use these image models:

Nano Banana Pro

gemini-3-pro-image-preview

Nano Banana 2

gemini-3.1-flash-image-preview

Default to
```
Nano Banana Pro
```
unless the user explicitly asks for
```
Nano Banana 2
```
.
Default style:
```
皮克斯 3D
```
Default
```
removeBackground
```
:
```
false
```
Random emotions should be generated by the agent locally by default. Do not depend on a Gemini text model unless the user explicitly wants model-generated wording.
Keep count constraints hard:
- static mode always resolves to exactly
```
24
```
  stickers
- animated mode only allows
```
1
```
  ,
```
2
```
  , or
```
4
```
  GIFs
Force image generation settings to:
- aspect ratio
```
3:2
```
- image size
```
2K
```
Keep the output contract stable even if image generation uses a fallback tool:
- ```
prompt.txt
```
- ```
resolved-config.json
```
- ```
grid.*
```
- extracted
```
stickers/
```
- ```
manifest.json
```

Working sequence

0. Stage the source image when the path is unstable

If the image came from the clipboard, a pasted chat image, or any source whose original path is unreliable, save it into

/tmp

first:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs stage-image \
  --from-clipboard

Or copy a known file into

/tmp

so later steps use a stable path:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs stage-image \
  --input /abs/path/to/source.png

Use the staged path for all later steps.

1. Prepare config

Start from

assets/example-config.json

. Fill only the fields needed for the current task.

If the user did not give an emotion list, leave

emotions

empty and provide

categoryPrompt

Then:

infer a category prompt from the request and let the agent produce the random emotions directly, or
only if the user explicitly wants model-generated wording, run:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs suggest-emotions \
  --category "职场打工人, 加班, 摸鱼, 收到, 崩溃, 阴阳怪气" \
  --count 4

2. Run preflight before generation

Preflight checks the backend, confirms the staged reference path, and resolves missing random emotions without starting image generation:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs preflight \
  --config path/to/config.json \
  --reference /tmp/emojigen-input-123.png

3. Build the prompt

Always build the prompt through the script so the wording stays consistent:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs build-prompt \
  --config path/to/config.json \
  --out path/to/output/prompt.txt

Do not stop here.

build-prompt

is not the delivery workflow.

4. Generate the 4x6 grid

If Gemini or Vertex AI is available, prefer the built-in generator:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs generate-grid \
  --config path/to/config.json \
  --reference path/to/reference.png \
  --out path/to/output/grid.png

The script rejects image models outside

Nano Banana Pro

and

Nano Banana 2

, and always sends

3:2

2K

Do not take

prompt.txt

and call a raw image model yourself when the built-in workflow is available. That bypasses the skill's staging, preflight, slicing, background-removal, and quality gates.

If another image tool is a better fit, still use this skill. Build the prompt with this skill, generate the grid elsewhere, then continue with

make-assets

5. Produce GIFs or static stickers

If you already have a grid image, run:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs make-assets \
  --config path/to/config.json \
  --grid path/to/output/grid.png \
  --out-dir path/to/output

This creates square crops, optional background removal, and GIF outputs for animated mode.

Keep

removeBackground: false

by default. Only enable background removal when the user explicitly wants transparent stickers and the generated sheet clearly uses a flat, separable background.

Read

manifest.json

after

make-assets

run

. If

manifest.quality.status

warn

, do not deliver the result yet. Rerun with stricter

characterNotes

, stronger square-safe composition constraints, or

removeBackground: false

Background removal uses a corner-connected flood-fill strategy. This is safer than making every near-background color transparent, and avoids punching holes in faces or clothing when skin tones are similar to the background.

Treat square-safe composition as a hard requirement, not a style preference. The final assets are cropped to square cells, so the subject must stay centered and stable across frames or the GIF will jitter after slicing.

6. Full end-to-end run

When no step needs manual intervention, use the orchestration command:

bash

node skills/emojigen-nano-banana/scripts/emojigen.mjs run \
  --config path/to/config.json \
  --reference path/to/reference.png \
  --out-dir /tmp/emojigen-run \
  --deliver-dir path/to/workspace-output \
  --cleanup-temp

Use

--deliver-dir

to copy the finished assets into the working directory or a client delivery folder.

Use

--cleanup-temp

after delivery when the outputs were generated under

/tmp/emojigen-*

. macOS may eventually clear

/tmp

, but not immediately enough for agent workflows.

Treat this as the preferred path. The default expectation is:

```
stage-image
```
```
preflight
```
```
run
```
inspect
```
manifest.quality
```
deliver only if quality is acceptable

Do not skip any of these steps unless the user explicitly narrows the task and you can still preserve output quality.

Fallback rules

If no Gemini credentials are present, say that explicitly and either ask for credentials or use another image-capable tool.
If another tool generated the grid, say that the final GIF packaging still came from this skill.
If background removal damages line art or text, rerun with
```
removeBackground: false
```
and keep the pure solid background from prompt-time constraints.
Do not proactively enable background removal just because the script supports it.
If the input image arrived as a pasted or clipboard image, stage it to
```
/tmp
```
before any prompt or generation step.
If the user only asked for random emotions, do not call a text model by default. Generate them directly unless the user explicitly wants a model to brainstorm them.

References

Read
```
references/workflow.md
```
for CLI usage, environment variable precedence, and output layout.
Read
```
references/model-backends.md
```
when choosing between Gemini API and Vertex AI.

emojigen-nano-banana

NPX Install

Tags

SKILL.md Content

EmojiGen Nano Banana

What to collect before doing work

Default decisions

Working sequence

0. Stage the source image when the path is unstable

1. Prepare config

2. Run preflight before generation

3. Build the prompt

4. Generate the 4x6 grid

5. Produce GIFs or static stickers

6. Full end-to-end run

Fallback rules

References