jimeng_mcp_skill


Use jimeng-mcp-server for AI image and video generation. Use this skill when users ask to generate images from text, combine multiple images, create videos from text descriptions, or animate static images. It supports four core capabilities: text-to-image, image synthesis, text-to-video, and image-to-video. It requires jimeng-mcp-server to be running locally or reachable via SSE/HTTP.


NPX Install

npx skill4agent add wwwzhouhui/skills_collection jimeng_mcp_skill

SKILL.md Content

Jimeng AI Generation Skill

Overview

The Jimeng Skill enables AI-driven image and video generation through jimeng-mcp-server, an MCP (Model Context Protocol) server integrated with Jimeng AI's multimodal generation capabilities. This skill allows you to create visual content directly through natural language instructions.
Core Capabilities:
  • 🎨 Text-to-Image: Generate high-quality images from text descriptions
  • 🎭 Image Synthesis: Intelligently merge and blend multiple images
  • 🎬 Text-to-Video: Create short videos from text prompts
  • 🎞️ Image-to-Video: Add animation effects to static images
When to Use This Skill:
  • Users ask to generate, create, or produce images or videos
  • Users mention "Jimeng" (in any capitalization) or ask for AI visual content generation
  • Users provide text descriptions and expect visual outputs
  • Users want to combine, merge or synthesize multiple images
  • Users want to add animation or motion effects to static images

Prerequisites

Before using this skill, ensure jimeng-mcp-server is properly configured:
  1. Server Must Be Running, in one of the following modes:
    • stdio Mode: Configured in MCP clients (Claude Desktop, Cherry Studio)
    • SSE Mode: Run as an HTTP server with SSE transport
    • HTTP Mode: Run as a REST API server
  2. Environment Variables Configured:
    • JIMENG_API_KEY: Your Jimeng API key (obtained from Jimeng website cookies)
    • JIMENG_API_URL: API endpoint (default: http://127.0.0.1:8001)
    • JIMENG_MODEL: Model name (default: jimeng-4.5)
  3. Backend API Running: The jimeng-free-api-all Docker container must be active
For detailed setup instructions, refer to references/setup_guide.md.
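
The environment variables listed in the prerequisites could be read as in the minimal sketch below. The `JimengConfig` class and `load_config` function are hypothetical names introduced for illustration; only the variable names and defaults come from this document.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class JimengConfig:
    """Hypothetical container for the three documented settings."""
    api_key: str
    api_url: str
    model: str


def load_config(env=None):
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("JIMENG_API_KEY", "").strip()
    if not api_key:
        # No default exists for the key, so fail early with a clear message.
        raise ValueError("JIMENG_API_KEY is not set")
    return JimengConfig(
        api_key=api_key,
        api_url=env.get("JIMENG_API_URL", "http://127.0.0.1:8001"),
        model=env.get("JIMENG_MODEL", "jimeng-4.5"),
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to check without touching the real environment.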

Quick Start

Basic Usage Workflow

When users request image or video generation, follow this workflow:
  1. Identify Task Type based on user input
  2. Extract Required Parameters from the request
  3. Call the Corresponding jimeng-mcp-server Tool
  4. Return Generated Content URLs to the user

Example Requests

Text-to-Image:
User: "Generate an image with Jimeng: Shiba Inu under cherry blossom trees"
→ Use the text_to_image tool with parameter prompt="Shiba Inu under cherry blossom trees"
Image Synthesis:
User: "Help me synthesize these two images, with the style leaning towards the first one"
→ Use the image_composition tool and provide image URLs
Text-to-Video:
User: "Create a 5-second video: Scene of a pony crossing a river"
→ Use the text_to_video tool and set the prompt; the tool lists no duration parameter, so state the desired length in the prompt
Image-to-Video:
User: "Add animation effects to this image"
→ Use the image_to_video tool and provide the image URL

Core Capabilities

1. Text-to-Image

Generate images from text descriptions using the Jimeng 4.5 engine.
Tool: text_to_image
Parameters:
  • prompt (required): Text description of the desired image
  • model (optional): Model version (default: jimeng-4.5)
  • ratio (optional): Image aspect ratio ("1:1", "4:3", "3:4", "16:9", "9:16")
  • resolution (optional): Resolution preset ("1k", "2k", "4k"; default: "2k")
  • negativePrompt (optional): Elements to avoid in the generated image
Common Aspect Ratios:
  • 16:9 → Landscape/widescreen (video covers, banners)
  • 1:1 → Square (avatars, social media)
  • 9:16 → Portrait/mobile screen (short video covers)
  • 4:3 → Standard landscape (blog illustrations)
  • 3:4 → Standard portrait (portrait photos)
Usage Example:
python
# User request: "Generate an image: Beach at sunset with coconut trees"
{
  "model": "jimeng-4.5",
  "prompt": "Beach at sunset with coconut trees",
  "ratio": "16:9",
  "resolution": "2k"
}
Result: An array of image URLs that can be displayed or downloaded.
Tips:
  • Higher resolution (4k) is suitable for print and high-quality displays
  • Lower resolution (1k) is suitable for quick previews
  • Use descriptive prompts for better results
  • Specify art style, lighting, and atmosphere to enhance control
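
As a rough sketch, the text_to_image arguments can be checked against the documented values before the tool is called. The helper name `build_text_to_image_args` is hypothetical; the allowed ratios, resolutions, and defaults are taken from the parameter list in this section.

```python
IMAGE_RATIOS = {"1:1", "4:3", "3:4", "16:9", "9:16"}
IMAGE_RESOLUTIONS = {"1k", "2k", "4k"}


def build_text_to_image_args(prompt, ratio=None, resolution="2k",
                             model="jimeng-4.5", negative_prompt=None):
    """Hypothetical helper: validate inputs and return the tool arguments."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if ratio is not None and ratio not in IMAGE_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio!r}")
    if resolution not in IMAGE_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    args = {"model": model, "prompt": prompt.strip(), "resolution": resolution}
    # Optional fields are only included when the caller supplied them.
    if ratio is not None:
        args["ratio"] = ratio
    if negative_prompt:
        args["negativePrompt"] = negative_prompt
    return args
```

Rejecting an unsupported value locally avoids spending a generation request on a call that is likely to fail.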

2. Image Synthesis

Merge and blend multiple images through intelligent fusion.
Tool: image_composition
Parameters:
  • prompt (required): Description of how to synthesize the images
  • images (required): Array of 2-5 image URLs to synthesize
  • model (optional): Model version (default: jimeng-4.5)
  • ratio (optional): Output image aspect ratio ("1:1", "4:3", "3:4", "16:9", "9:16")
  • resolution (optional): Resolution preset ("1k", "2k", "4k"; default: "2k")
Usage Example:
python
# User request: "Synthesize these two images, retaining the style of the first one"
{
  "model": "jimeng-4.5",
  "prompt": "Seamlessly blend the two images while maintaining the artistic style of the first image",
  "images": [
    "https://example.com/image1.jpg",
    "https://example.com/image2.jpg"
  ],
  "ratio": "4:3",
  "resolution": "2k"
}
Usage Scenarios:
  • Blend portraits with backgrounds
  • Style transfer between images
  • Create artistic composite works
  • Merge elements from multiple photos
Tips:
  • Provide clear synthesis instructions in the prompt
  • Images should have compatible resolutions
  • Describe the desired blending style (seamless, artistic, realistic)
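
The 2-5 image requirement can be enforced before calling image_composition. The sketch below uses a hypothetical `validate_composition_images` helper and only checks the count and that each entry looks like an http(s) URL; it does not verify that the URLs are actually reachable.

```python
from urllib.parse import urlparse


def validate_composition_images(images):
    """Hypothetical check: 2-5 entries, each a syntactically valid http(s) URL."""
    if not 2 <= len(images) <= 5:
        raise ValueError(f"expected 2-5 image URLs, got {len(images)}")
    for url in images:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) URL: {url!r}")
    return list(images)
```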

3. Text-to-Video

Create short videos from text descriptions.
Tool: text_to_video
Parameters:
  • prompt (required): Text description of the video scene
  • model (optional): Model version (default: jimeng-video-3.0)
  • ratio (optional): Video aspect ratio ("16:9", "9:16", "4:3", "3:4", "1:1")
  • resolution (optional): Resolution preset ("480p", "720p", "1080p")
Resolution Presets:
  • "480p" → Quick preview
  • "720p" → Balanced quality/speed (recommended)
  • "1080p" → High quality
Usage Example:
python
# User request: "Generate a 5-second video: Kitten fishing"
{
  "model": "jimeng-video-3.0",
  "prompt": "An orange kitten sitting by the river, holding a fishing rod and focusing on fishing, sunny weather",
  "ratio": "16:9",
  "resolution": "720p"
}
Video Features:
  • Duration: Typically 3-5 seconds
  • Format: MP4
  • Generation Time: 30-60 seconds
  • Frame Rate: 24-30 fps
Tips:
  • Include scene details, actions, and atmosphere
  • Keep prompts focused on a single clear action
  • Specify time of day, weather, or mood for better results
  • Start with 720p to balance quality and speed

4. Image-to-Video Animation

Add motion and animation effects to static images.
Tool: image_to_video
Parameters:
  • prompt (required): Description of the desired animation effect
  • file_paths (required): Array of image URLs to animate
  • model (optional): Model version (default: jimeng-video-3.0)
  • ratio (optional): Video aspect ratio ("16:9", "9:16", "4:3", "3:4", "1:1")
  • resolution (optional): Resolution preset ("480p", "720p", "1080p")
Usage Example:
python
# User request: "Animate this photo with gentle camera zoom"
{
  "model": "jimeng-video-3.0",
  "prompt": "Add gentle motion effects and natural camera zoom to create a cinematic feel",
  "file_paths": ["https://example.com/photo.jpg"],
  "ratio": "16:9",
  "resolution": "720p"
}
Animation Types:
  • Character motion
  • Camera movements
  • Scene transitions
  • Environmental effects (wind, rain, etc.)
Tips:
  • Describe the desired type of motion
  • Consider image content when selecting effects
  • Portrait photos suit subtle movements
  • Landscape photos suit pan/zoom effects

Workflow Guide

Decision Tree

Receive User Request
    ├─ Contains "generate image" or "create image"?
    │   └─ Yes → Use text_to_image
    ├─ Contains "synthesize" or "merge/blend images"?
    │   └─ Yes → Use image_composition
    ├─ Contains "generate video" or "create video"?
    │   └─ Yes → Use text_to_video
    └─ Contains "animate" or "animate image"?
        └─ Yes → Use image_to_video
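
The decision tree above can be sketched as a small keyword router. This is an illustration only: `pick_tool` and `ROUTES` are hypothetical names, and the "merge/blend images" branch is approximated with the single words "merge" and "blend".

```python
from typing import Optional

# Branches are checked in the same order as the decision tree.
ROUTES = [
    (("generate image", "create image"), "text_to_image"),
    (("synthesize", "merge", "blend"), "image_composition"),
    (("generate video", "create video"), "text_to_video"),
    (("animate",), "image_to_video"),
]


def pick_tool(request: str) -> Optional[str]:
    """Return the first tool whose keywords appear in the request, else None."""
    text = request.lower()
    for keywords, tool in ROUTES:
        if any(keyword in text for keyword in keywords):
            return tool
    return None
```

Literal substring matching is brittle: a request such as "Generate an image with Jimeng" does not contain "generate image" and falls through to `None`. In practice the phrases in the tree are better read as intents to recognize than as exact strings.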

Parameter Extraction

When processing user requests:
  1. Extract Prompt: User's description of the desired content
  2. Identify Aspect Ratio: Map orientation preferences (landscape/portrait/square) to the ratio parameter
  3. Parse Resolution Requirements: Map quality requirements to the resolution parameter
  4. Collect Image URLs: For synthesis and animation tasks
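
A minimal sketch of the ratio and resolution steps, assuming simple pattern matching on the request text. The `extract_params` function and the orientation-to-ratio mapping are assumptions made for illustration (for example, "landscape" is mapped to 16:9 here although 4:3 is also a landscape ratio).

```python
import re

# Explicit values such as "16:9" or "720p" take priority over orientation words.
RATIO_RE = re.compile(r"(?<!\d)(16:9|9:16|4:3|3:4|1:1)(?!\d)")
RESOLUTION_RE = re.compile(r"\b(1k|2k|4k|480p|720p|1080p)\b", re.IGNORECASE)

# Assumed mapping; adjust to taste.
ORIENTATION_TO_RATIO = {
    "landscape": "16:9",
    "widescreen": "16:9",
    "portrait": "9:16",
    "square": "1:1",
}


def extract_params(request):
    """Return whichever of 'ratio' and 'resolution' can be read from the text."""
    params = {}
    match = RATIO_RE.search(request)
    if match:
        params["ratio"] = match.group(1)
    else:
        lowered = request.lower()
        for word, ratio in ORIENTATION_TO_RATIO.items():
            if word in lowered:
                params["ratio"] = ratio
                break
    match = RESOLUTION_RE.search(request)
    if match:
        params["resolution"] = match.group(1).lower()
    return params
```

Note that "portrait" may describe the subject rather than the orientation, so a match on that word is only a hint.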

Error Handling

If tool execution fails:
  1. Check Server Status: Verify if jimeng-mcp-server is running
  2. Validate API Key: Ensure JIMENG_API_KEY is configured
  3. Check Parameters: Confirm all required fields are provided
  4. Check Image URLs: Verify URLs for synthesis/animation are accessible
  5. Report Errors Clearly: Explain the problem and suggest solutions
Common Errors:
  • API key not configured: Set JIMENG_API_KEY in the environment
  • Server not responding: Start the jimeng-free-api-all Docker container
  • Invalid image URL: Ensure the URL is publicly accessible
  • Generation timeout: Large videos may take 60+ seconds
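
The common errors can be turned into user-facing suggestions with a simple lookup. In this sketch the matched substrings are guesses; the exact error strings returned by jimeng-mcp-server are not documented here, and `suggest_fix` is a hypothetical name.

```python
# Substrings are assumptions about what an error message might contain.
ERROR_HINTS = [
    ("api key", "Set JIMENG_API_KEY in the environment."),
    ("not responding", "Start the jimeng-free-api-all Docker container."),
    ("invalid image url", "Ensure the image URL is publicly accessible."),
    ("timeout", "Large videos may take 60+ seconds; wait longer before retrying."),
]

DEFAULT_HINT = (
    "Check that jimeng-mcp-server is running and all required fields are provided."
)


def suggest_fix(error_message):
    """Return the first matching hint, or a generic one if nothing matches."""
    text = error_message.lower()
    for needle, hint in ERROR_HINTS:
        if needle in text:
            return hint
    return DEFAULT_HINT
```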

Advanced Usage

Combine Multiple Tools

For complex creative tasks, tools can be used in a chain:
Example: Create Animated Artwork
  1. Use text_to_image to generate a base image
  2. Use image_to_video to add animation to the result
Example: Synthesize and Optimize
  1. Use image_composition to synthesize images
  2. Generate variants with adjusted prompts
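
The "Create Animated Artwork" chain might look like the sketch below. `call_tool` is a hypothetical callable standing in for whatever mechanism the MCP client uses to invoke a tool; it is assumed to return a list of URLs for text_to_image, as described in the Text-to-Image section. The return shape of image_to_video is not specified in this document, so it is passed through unchanged.

```python
def create_animated_artwork(call_tool, image_prompt, motion_prompt):
    """Generate a base image, then animate the first result."""
    image_urls = call_tool("text_to_image", {"prompt": image_prompt})
    if not image_urls:
        raise RuntimeError("text_to_image returned no image URLs")
    # text_to_image may return several candidates; this sketch uses the first.
    return call_tool(
        "image_to_video",
        {"prompt": motion_prompt, "file_paths": [image_urls[0]]},
    )
```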

Optimization Tips

Speed Up Generation:
  • Use lower resolution (720p instead of 1080p, or 1k instead of 2k)
  • Keep prompts concise yet descriptive
Improve Quality:
  • Use detailed, specific prompts
  • Select appropriate ratio based on the scene
  • Use higher resolution (2k or 4k)
  • Specify art style and techniques
  • Include lighting and atmosphere descriptions

Batch Processing

When users request multiple generations:
  1. Process requests sequentially (one at a time)
  2. Provide progress updates for each item
  3. Collect all results before final response
  4. Consider resource limits (API quotas)
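
A sequential batch loop along these lines is sketched below. `run_batch`, `call_tool`, and `max_jobs` are hypothetical; `max_jobs` is a stand-in for whatever quota applies, not a documented limit.

```python
def run_batch(call_tool, jobs, on_progress=print, max_jobs=None):
    """Run (tool, args) jobs one at a time and collect every outcome."""
    if max_jobs is not None and len(jobs) > max_jobs:
        raise ValueError(f"{len(jobs)} jobs exceed the limit of {max_jobs}")
    results = []
    for index, (tool, args) in enumerate(jobs, start=1):
        on_progress(f"[{index}/{len(jobs)}] {tool}")
        try:
            results.append({"tool": tool, "ok": True, "output": call_tool(tool, args)})
        except Exception as exc:
            # One failed item should not abort the remaining jobs.
            results.append({"tool": tool, "ok": False, "error": str(exc)})
    return results
```

Collecting failures alongside successes lets the final response report every item rather than stopping at the first error.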

Troubleshooting

Server Connection Issues

Symptom: Tool returns connection errors
Solutions:
  1. Check if the jimeng-free-api-all Docker container is running:
    bash
    docker ps | grep jimeng
  2. Verify server accessibility:
    bash
    curl http://127.0.0.1:8001/health
  3. Restart the Docker container if needed
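
The same reachability check can be done from Python with the standard library. This is a sketch: `backend_is_healthy` is a hypothetical helper, and treating any 2xx response from the /health path as healthy is an assumption, since the response format of that endpoint is not described here.

```python
import http.client
import urllib.request


def backend_is_healthy(base_url="http://127.0.0.1:8001", timeout=3.0):
    """Return True if GET <base_url>/health answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib raises HTTPError, an OSError subclass, for 4xx/5xx).
        return False
```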

API Key Issues

Symptom: "Invalid API key" or authentication errors
Solutions:
  1. Verify JIMENG_API_KEY in the .env file
  2. Obtain a new API key from Jimeng website cookies (sessionid value)
  3. Ensure the key format is correct (no extra spaces or quotes)
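
Stray whitespace or quotes around the key can be cleaned up before use. `normalize_api_key` below is a hypothetical helper; it only strips surrounding whitespace and quote characters and rejects keys with embedded whitespace, and it makes no claim about what a valid sessionid looks like.

```python
def normalize_api_key(raw):
    """Strip surrounding whitespace and quotes; reject empty or spaced keys."""
    key = raw.strip().strip("'\"").strip()
    if not key:
        raise ValueError("API key is empty")
    if any(ch.isspace() for ch in key):
        raise ValueError("API key contains whitespace")
    return key
```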

Generation Quality Issues

Symptom: Poor quality or unexpected results
Solutions:
  1. Optimize prompts with more specific details
  2. Adjust the ratio parameter to select an appropriate aspect ratio
  3. Try different resolution settings
  4. Add negativePrompt to exclude unwanted elements

Timeout Errors

Symptom: Generation takes too long or times out
Solutions:
  1. Video generation typically takes 30-60 seconds - please be patient
  2. If timeouts persist, try lower resolution
  3. Check server resource usage
  4. Verify network connection to Jimeng API
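
The "try lower resolution" step can be automated as a fallback ladder. In this sketch, `call_tool` is a hypothetical callable that is assumed to raise `TimeoutError` when generation times out; how the real client signals a timeout is not documented here.

```python
VIDEO_RESOLUTIONS = ["1080p", "720p", "480p"]  # highest to lowest


def generate_with_fallback(call_tool, tool, args, start="720p"):
    """Retry a video tool at progressively lower resolutions after timeouts."""
    ladder = VIDEO_RESOLUTIONS[VIDEO_RESOLUTIONS.index(start):]
    last_error = None
    for resolution in ladder:
        try:
            return call_tool(tool, {**args, "resolution": resolution})
        except TimeoutError as exc:
            last_error = exc
    raise TimeoutError(f"{tool} timed out at every resolution tried") from last_error
```

Each retry spends another generation request, so the ladder is best kept short when credits are limited.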

Resources

references/

  • setup_guide.md: Detailed installation and configuration instructions
  • api_reference.md: Complete API documentation for all tools

Best Practices

  1. Always Verify Server Status Before Attempting Generation
  2. Use Appropriate Resolution Based on Use Case and Speed Requirements (ratio controls aspect ratio, resolution controls output size and level of detail)
  3. Provide Detailed Prompts for Better Generation Quality
  4. Handle Errors Gracefully and Provide Clear User Feedback
  5. Consider Rate Limits When Processing Multiple Requests
  6. Test with Simple Prompts Before Complex Synthesis
  7. Cache Frequently Used Parameters such as preferred ratio and resolution

Limitations

  • Free Tier Limits: Official Jimeng API allows 66 credits per day
  • Video Duration: Typically limited to 3-10 seconds
  • Generation Time: Videos may take 30-60 seconds to generate
  • Image Synthesis: Best results with 2-3 images, maximum 5 images supported
  • Server Dependency: Requires jimeng-free-api-all backend to run
  • Network Requirements: Internet access required to call Jimeng API