jimeng_mcp_skill

Use jimeng-mcp-server for AI image and video generation. Use this skill when users ask to generate images from text, synthesize multiple images, create videos from text descriptions, or animate static images. It supports four core capabilities: text-to-image, image synthesis, text-to-video, and image-to-video. It requires jimeng-mcp-server to be running locally or reachable over SSE/HTTP.

3 installs

Source: wwwzhouhui/skills_collection

Added on: 2026-02-15

NPX Install

npx skill4agent add wwwzhouhui/skills_collection jimeng_mcp_skill

SKILL.md Content (Chinese)

Jimeng AI Generation Skill

Overview

The Jimeng Skill enables AI-driven image and video generation through jimeng-mcp-server, an MCP (Model Context Protocol) server integrated with Jimeng AI's multimodal generation capabilities. This skill allows you to create visual content directly through natural language instructions.

Core Capabilities:

🎨 Text-to-Image: Generate high-quality images from text descriptions
🎭 Image Synthesis: Intelligently merge and blend multiple images
🎬 Text-to-Video: Create short videos from text prompts
🎞️ Image-to-Video: Add animation effects to static images

When to Use This Skill:

Users ask to generate, create, or produce images or videos
Users mention "jimeng" or "Jimeng", or ask for AI visual content generation
Users provide text descriptions and expect visual outputs
Users want to combine, merge or synthesize multiple images
Users want to add animation or motion effects to static images

Prerequisites

Before using this skill, ensure jimeng-mcp-server is properly configured:

Server must be running in one of the following modes:
- stdio mode: configured in an MCP client (Claude Desktop, Cherry Studio)
- SSE mode: runs as an HTTP server with SSE transport
- HTTP mode: runs as a REST API server
Environment variables must be configured:
- `JIMENG_API_KEY`: your Jimeng API key (obtained from the Jimeng website cookies)
- `JIMENG_API_URL`: API endpoint (default: http://127.0.0.1:8001)
- `JIMENG_MODEL`: model name (default: jimeng-4.5)
Backend API must be running: the jimeng-free-api-all Docker container must be active
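
As a minimal sketch of how a client-side script might read these settings: the helper name `load_jimeng_config` is hypothetical and not part of jimeng-mcp-server, and the defaults are simply the ones listed above.

```python
import os

# Defaults quoted from the prerequisites list above.
DEFAULT_API_URL = "http://127.0.0.1:8001"
DEFAULT_MODEL = "jimeng-4.5"


def load_jimeng_config(env=None):
    """Read the JIMENG_* variables and apply the documented defaults.

    Hypothetical helper for illustration; pass a dict to test without
    touching the real environment.
    """
    env = os.environ if env is None else env
    api_key = env.get("JIMENG_API_KEY", "")
    if not api_key:
        raise ValueError("JIMENG_API_KEY is not set")
    return {
        "api_key": api_key,
        "api_url": env.get("JIMENG_API_URL", DEFAULT_API_URL),
        "model": env.get("JIMENG_MODEL", DEFAULT_MODEL),
    }
```

Failing early on a missing key gives a clearer message than a later authentication error from the backend.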

For detailed setup instructions, refer to references/setup_guide.md.

Quick Start

Basic Usage Workflow

When users request image or video generation, follow this workflow:

Identify the task type from the user's input
Extract the required parameters from the request
Call the corresponding jimeng-mcp-server tool
Return the generated content URLs to the user

Example Requests

Text-to-Image:

User: "Generate an image with Jimeng: Shiba Inu under cherry blossom trees"
→ Use the text_to_image tool with parameter prompt="Shiba Inu under cherry blossom trees"

Image Synthesis:

User: "Help me synthesize these two images, with the style leaning towards the first one"
→ Use the image_composition tool and provide image URLs

Text-to-Video:

User: "Create a 5-second video: Scene of a pony crossing a river"
→ Use the text_to_video tool, set the prompt and duration

Image-to-Video:

User: "Add animation effects to this image"
→ Use the image_to_video tool and provide the image URL

Core Capabilities

1. Text-to-Image

Generate images from text descriptions using the Jimeng 4.5 engine.

Tool: `text_to_image`

Parameters:

- `prompt` (required): Text description of the desired image
- `model` (optional): Model version (default: jimeng-4.5)
- `ratio` (optional): Image aspect ratio ("1:1", "4:3", "3:4", "16:9", "9:16")
- `resolution` (optional): Resolution preset ("1k", "2k", "4k", default: 2k)
- `negativePrompt` (optional): Elements to avoid in the generated image

Common Aspect Ratios:

16:9 → Landscape/widescreen (video covers, banners)
1:1 → Square (avatars, social media)
9:16 → Portrait/mobile screen (short video covers)
4:3 → Standard landscape (blog illustrations)
3:4 → Standard portrait (portrait photos)

Usage Example:

```python
# User request: "Generate an image: Beach at sunset with coconut trees"
{
  "model": "jimeng-4.5",
  "prompt": "Beach at sunset with coconut trees",
  "ratio": "16:9",
  "resolution": "2k"
}
```

Return Result: Returns an array containing multiple image URLs, which can be displayed or downloaded.

Tips:

Higher resolution (4k) is suitable for print and high-quality displays
Lower resolution (1k) is suitable for quick previews
Use descriptive prompts for better results
Specify art style, lighting, and atmosphere to enhance control
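
The parameter rules above can be sketched as a small validating builder. This is an illustration only: `build_text_to_image_args` is a hypothetical helper, and the source does not state a default ratio, so the sketch omits `ratio` unless one is given.

```python
ALLOWED_RATIOS = {"1:1", "4:3", "3:4", "16:9", "9:16"}
ALLOWED_RESOLUTIONS = {"1k", "2k", "4k"}


def build_text_to_image_args(prompt, ratio=None, resolution="2k",
                             negative_prompt=None, model="jimeng-4.5"):
    """Return an arguments dict for text_to_image, rejecting unlisted values."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if ratio is not None and ratio not in ALLOWED_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    args = {"model": model, "prompt": prompt.strip(), "resolution": resolution}
    # Optional fields are only included when the caller supplied them.
    if ratio is not None:
        args["ratio"] = ratio
    if negative_prompt:
        args["negativePrompt"] = negative_prompt
    return args
```

Calling `build_text_to_image_args("Beach at sunset with coconut trees", ratio="16:9")` produces the same fields as the usage example above.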

2. Image Synthesis

Merge and blend multiple images through intelligent fusion.

Tool: `image_composition`

Parameters:

- `prompt` (required): Description of how to synthesize the images
- `images` (required): Array of 2-5 image URLs to synthesize
- `model` (optional): Model version (default: jimeng-4.5)
- `ratio` (optional): Output image aspect ratio ("1:1", "4:3", "3:4", "16:9", "9:16")
- `resolution` (optional): Resolution preset ("1k", "2k", "4k", default: 2k)

Usage Example:

```python
# User request: "Synthesize these two images, retaining the style of the first one"
{
  "model": "jimeng-4.5",
  "prompt": "Seamlessly blend the two images while maintaining the artistic style of the first image",
  "images": [
    "https://example.com/image1.jpg",
    "https://example.com/image2.jpg"
  ],
  "ratio": "4:3",
  "resolution": "2k"
}
```

Usage Scenarios:

Blend portraits with backgrounds
Style transfer between images
Create artistic composite works
Merge elements from multiple photos

Tips:

Provide clear synthesis instructions in the prompt
Images should have compatible resolutions
Describe the desired blending style (seamless, artistic, realistic)
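
A sketch of checking the 2-5 image constraint before calling the tool. `build_image_composition_args` is a hypothetical helper, and the http(s)-only check is an assumption based on the parameters being described as image URLs.

```python
from urllib.parse import urlparse

ALLOWED_RATIOS = {"1:1", "4:3", "3:4", "16:9", "9:16"}
ALLOWED_RESOLUTIONS = {"1k", "2k", "4k"}


def build_image_composition_args(prompt, images, ratio=None,
                                 resolution="2k", model="jimeng-4.5"):
    """Return an arguments dict for image_composition after basic validation."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    images = list(images)
    if not 2 <= len(images) <= 5:
        raise ValueError("images must contain 2 to 5 URLs")
    for url in images:
        # Assumption: only http(s) URLs can be fetched by the server.
        if urlparse(url).scheme not in ("http", "https"):
            raise ValueError(f"not an http(s) URL: {url}")
    if ratio is not None and ratio not in ALLOWED_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    args = {"model": model, "prompt": prompt.strip(),
            "images": images, "resolution": resolution}
    if ratio is not None:
        args["ratio"] = ratio
    return args
```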

3. Text-to-Video

Create short videos from text descriptions.

Tool: `text_to_video`

Parameters:

- `prompt` (required): Text description of the video scene
- `model` (optional): Model version (default: jimeng-video-3.0)
- `ratio` (optional): Video aspect ratio ("16:9", "9:16", "4:3", "3:4", "1:1")
- `resolution` (optional): Resolution preset ("480p", "720p", "1080p")

Resolution Presets:

"480p" → Quick preview
"720p" → Balanced quality/speed (recommended)
"1080p" → High quality

Usage Example:

```python
# User request: "Generate a 5-second video: Kitten fishing"
{
  "model": "jimeng-video-3.0",
  "prompt": "An orange kitten sitting by the river, holding a fishing rod and focusing on fishing, sunny weather",
  "ratio": "16:9",
  "resolution": "720p"
}
```

Video Features:

Duration: Typically 3-5 seconds
Format: MP4
Generation Time: 30-60 seconds
Frame Rate: 24-30 fps

Tips:

Include scene details, actions, and atmosphere
Keep prompts focused on a single clear action
Specify time of day, weather, or mood for better results
Start with 720p to balance quality and speed
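
One way to encode the resolution presets above is a lookup from intent to preset. The intent names (`preview`, `balanced`, `high`) and the helper `build_text_to_video_args` are hypothetical labels chosen for this sketch, not part of the tool's API.

```python
# Intent labels are illustrative; the preset values come from the list above.
VIDEO_PRESETS = {"preview": "480p", "balanced": "720p", "high": "1080p"}
VIDEO_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1"}


def build_text_to_video_args(prompt, quality="balanced", ratio=None,
                             model="jimeng-video-3.0"):
    """Return an arguments dict for text_to_video using a named quality level."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if quality not in VIDEO_PRESETS:
        raise ValueError(f"unknown quality level: {quality}")
    if ratio is not None and ratio not in VIDEO_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio}")
    args = {"model": model, "prompt": prompt.strip(),
            "resolution": VIDEO_PRESETS[quality]}
    if ratio is not None:
        args["ratio"] = ratio
    return args
```

Defaulting to `balanced` follows the recommendation to start at 720p.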

4. Image-to-Video Animation

Add motion and animation effects to static images.

Tool: `image_to_video`

Parameters:

- `prompt` (required): Description of the desired animation effect
- `file_paths` (required): Array of image URLs to animate
- `model` (optional): Model version (default: jimeng-video-3.0)
- `ratio` (optional): Video aspect ratio ("16:9", "9:16", "4:3", "3:4", "1:1")
- `resolution` (optional): Resolution preset ("480p", "720p", "1080p")

Usage Example:

```python
# User request: "Animate this photo with gentle camera zoom"
{
  "model": "jimeng-video-3.0",
  "prompt": "Add gentle motion effects and natural camera zoom to create a cinematic feel",
  "file_paths": ["https://example.com/photo.jpg"],
  "ratio": "16:9",
  "resolution": "720p"
}
```

Animation Types:

Character motion
Camera movements
Scene transitions
Environmental effects (wind, rain, etc.)

Tips:

Describe the desired type of motion
Consider image content when selecting effects
Portrait photos suit subtle movements
Landscape photos suit pan/zoom effects

Workflow Guide

Decision Tree

Receive User Request
    │
    ├─ Contains "generate image" or "create image"?
    │   └─ Yes → Use text_to_image
    │
    ├─ Contains "synthesize" or "merge/blend images"?
    │   └─ Yes → Use image_composition
    │
    ├─ Contains "generate video" or "create video"?
    │   └─ Yes → Use text_to_video
    │
    └─ Contains "animate" or "animate image"?
        └─ Yes → Use image_to_video
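
The decision tree can be approximated with keyword matching. The sketch below is an assumption-laden illustration: `pick_tool` is a hypothetical helper, it matches individual words rather than exact phrases, and it checks the more specific intents (synthesis, animation) first so that a request such as "add animation effects to this image" is not routed to `text_to_image`.

```python
import re


def pick_tool(request):
    """Map a user request to a tool name, or None if nothing matches."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    creating = bool(words & {"generate", "create"})
    if words & {"synthesize", "merge", "blend", "combine"}:
        return "image_composition"
    if words & {"animate", "animation"}:
        return "image_to_video"
    if creating and words & {"video", "videos"}:
        return "text_to_video"
    if creating and words & {"image", "images"}:
        return "text_to_image"
    return None
```

Real requests are more varied than any keyword list, so a `None` result should fall back to asking the user what they want.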

Parameter Extraction

When processing user requests:

Extract Prompt: User's description of the desired content
Identify Aspect Ratio: Extract size preferences (landscape/portrait/square) corresponding to the ratio parameter
Parse Resolution Requirements: Look for quality requirements corresponding to the resolution parameter
Collect Image URLs: For synthesis and animation tasks

Error Handling

If tool execution fails:

Check Server Status: Verify if jimeng-mcp-server is running
Validate API Key: Ensure JIMENG_API_KEY is configured
Check Parameters: Confirm all required fields are provided
Check Image URLs: Verify URLs for synthesis/animation are accessible
Report Errors Clearly: Explain the problem and suggest solutions

Common Errors:

- `API key not configured`: Set JIMENG_API_KEY in the environment
- `Server not responding`: Start the jimeng-free-api-all Docker container
- `Invalid image URL`: Ensure the URL is publicly accessible
- `Generation timeout`: Large videos may take 60+ seconds
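
The error table above can be turned into a simple lookup when reporting failures to the user. This sketch assumes error text contains the phrases listed above; the exact messages emitted by the server are not confirmed here, and `suggest_fix` is a hypothetical helper.

```python
# Phrases and suggestions copied from the common-errors list above.
ERROR_HINTS = {
    "API key not configured": "Set JIMENG_API_KEY in the environment",
    "Server not responding": "Start the jimeng-free-api-all Docker container",
    "Invalid image URL": "Ensure the URL is publicly accessible",
    "Generation timeout": "Large videos may take 60+ seconds",
}

FALLBACK_HINT = "Check server status, API key, required parameters, and image URLs"


def suggest_fix(error_message):
    """Return the documented suggestion for a known error, else a generic one."""
    lowered = error_message.lower()
    for phrase, hint in ERROR_HINTS.items():
        if phrase.lower() in lowered:
            return hint
    return FALLBACK_HINT
```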

Advanced Usage

Combine Multiple Tools

For complex creative tasks, tools can be used in a chain:

Example: Create Animated Artwork

Use `text_to_image` to generate a base image
Use `image_to_video` to add animation to the result
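
A sketch of this two-step chain follows. `call_tool(name, arguments)` is a hypothetical callable standing in for whatever the MCP client provides, and the sketch assumes both tools return a list of URLs (the source only states this for `text_to_image`).

```python
def create_animated_artwork(call_tool, image_prompt, motion_prompt):
    """Generate a base image, then animate the first result."""
    image_urls = call_tool("text_to_image", {"prompt": image_prompt})
    if not image_urls:
        raise RuntimeError("text_to_image returned no images")
    base_image = image_urls[0]
    # image_to_video takes its input images through the file_paths parameter.
    video_urls = call_tool(
        "image_to_video",
        {"prompt": motion_prompt, "file_paths": [base_image]},
    )
    return {"image": base_image, "videos": video_urls}
```

Passing `call_tool` in as an argument keeps the chain testable without a running server.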

Example: Synthesize and Optimize

Use `image_composition` to synthesize images
Generate variants with adjusted prompts

Optimization Tips

Speed Up Generation:

Use lower resolution (720p instead of 1080p, or 1k instead of 2k)
Keep prompts concise yet descriptive

Improve Quality:

Use detailed, specific prompts
Select appropriate ratio based on the scene
Use higher resolution (2k or 4k)
Specify art style and techniques
Include lighting and atmosphere descriptions

Batch Processing

When users request multiple generations:

Process requests sequentially (one at a time)
Provide progress updates for each item
Collect all results before final response
Consider resource limits (API quotas)
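
The batch guidance above might look like the following sketch. `call_tool` is again a hypothetical client callable, and `max_jobs` is an illustrative stand-in for whatever quota applies.

```python
def run_batch(call_tool, jobs, on_progress=print, max_jobs=None):
    """Run (tool, arguments) jobs one at a time and collect every outcome."""
    jobs = list(jobs)
    if max_jobs is not None and len(jobs) > max_jobs:
        raise ValueError(f"{len(jobs)} jobs exceed the limit of {max_jobs}")
    results = []
    for index, (tool, arguments) in enumerate(jobs, start=1):
        try:
            outcome = {"tool": tool, "ok": True,
                       "result": call_tool(tool, arguments)}
        except Exception as exc:  # one failure should not abort the batch
            outcome = {"tool": tool, "ok": False, "error": str(exc)}
        results.append(outcome)
        status = "done" if outcome["ok"] else "failed"
        on_progress(f"{index}/{len(jobs)} {tool}: {status}")
    return results
```

Results are returned only after the loop finishes, which matches collecting all results before the final response.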

Troubleshooting

Server Connection Issues

Symptom: Tool returns connection errors

Solutions:

Check whether the jimeng-free-api-all Docker container is running:
```bash
docker ps | grep jimeng
```
Verify that the server is reachable:
```bash
curl http://127.0.0.1:8001/health
```
Restart the Docker container if needed
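
The same reachability check can be scripted. The sketch below reuses the `/health` URL from the curl command above; treating HTTP 200 as "healthy" is an assumption, and `server_is_healthy` is a hypothetical helper.

```python
import urllib.error
import urllib.request


def server_is_healthy(base_url="http://127.0.0.1:8001", timeout=5,
                      opener=urllib.request.urlopen):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with opener(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

The `opener` argument exists only so the function can be exercised without a live server.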

API Key Issues

Symptom: "Invalid API key" or authentication errors

Solutions:

Verify JIMENG_API_KEY in the .env file
Obtain a new API key from Jimeng website cookies (sessionid value)
Ensure the key format is correct (no extra spaces or quotes)
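
A small sketch of the "no extra spaces or quotes" check, useful when the key has been pasted into a .env file. `normalize_api_key` is a hypothetical helper, and rejecting internal whitespace is an assumption about the key format.

```python
def normalize_api_key(raw):
    """Strip surrounding whitespace and quotes from a pasted API key."""
    key = raw.strip().strip("'\"").strip()
    if not key:
        raise ValueError("JIMENG_API_KEY is empty")
    # Assumption: a valid key never contains whitespace.
    if any(ch.isspace() for ch in key):
        raise ValueError("JIMENG_API_KEY contains whitespace")
    return key
```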

Generation Quality Issues

Symptom: Poor quality or unexpected results

Solutions:

Optimize prompts with more specific details
Adjust the `ratio` parameter to select an appropriate aspect ratio
Try different `resolution` settings
Add `negativePrompt` to exclude unwanted elements

Timeout Errors

Symptom: Generation takes too long or times out

Solutions:

Video generation typically takes 30-60 seconds - please be patient
If timeouts persist, try lower resolution
Check server resource usage
Verify network connection to Jimeng API
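
The "try lower resolution" advice can be sketched as a retry loop. This assumes the hypothetical `call_tool` callable raises Python's built-in `TimeoutError` on a timeout, which may not match how a given MCP client reports it.

```python
# Next-lower video preset for each resolution; 480p has no fallback.
LOWER_RESOLUTION = {"1080p": "720p", "720p": "480p"}


def generate_with_fallback(call_tool, tool, arguments):
    """Retry a timed-out generation at progressively lower resolutions.

    Returns (result, resolution_used). The caller's dict is not modified.
    """
    args = dict(arguments)
    while True:
        try:
            return call_tool(tool, args), args.get("resolution")
        except TimeoutError:
            lower = LOWER_RESOLUTION.get(args.get("resolution"))
            if lower is None:
                raise  # nothing lower to try, surface the timeout
            args["resolution"] = lower
```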

Resources

references/

- `setup_guide.md`: Detailed installation and configuration instructions
- `api_reference.md`: Complete API documentation for all tools

Project Links

GitHub Repository: https://github.com/wwwzhouhui/jimeng-mcp-server
Backend API: https://github.com/wwwzhouhui/jimeng-free-api-all
Jimeng Official Website: https://jimeng.jianying.com/

Best Practices

Always verify server status before attempting generation
Choose the resolution to match the use case and speed requirements (ratio controls the aspect ratio, resolution controls sharpness)
Provide detailed prompts for better generation quality
Handle errors gracefully and give the user clear feedback
Consider rate limits when processing multiple requests
Test with simple prompts before attempting complex synthesis
Cache frequently used parameters, such as the preferred ratio and resolution

Limitations

Free Tier Limits: Official Jimeng API allows 66 credits per day
Video Duration: Typically limited to 3-10 seconds
Generation Time: Videos may take 30-60 seconds to generate
Image Synthesis: Best results with 2-3 images, maximum 5 images supported
Server Dependency: Requires jimeng-free-api-all backend to run
Network Requirements: Internet access required to call Jimeng API
