xyq-nest-skill

5 scripts · Checked / no sensitive code detected

Comprehensive creation via Xiaoyunque's AI capabilities, supporting generation and editing of images/videos. Covered scenarios include: Generation (text-to-image, text-to-video, image-to-video, animation creation, draw xxx, create xxx clip), Editing & Revision (replace xxx with yyy, remove xxx, add xxx, change to xxx, adjust xxx, local modification, lens adjustment), Style Transfer (style migration, repainting, style change), video continuation, video/TVC/promotional video replication, short drama/short comic drama generation, music MV creation, product advertisement/demo video production, storyboard design, educational video/short video production. This skill should also be triggered when users mention Xiaoyunque, xyq, uploading reference images/videos, or checking generation progress. Key Judgment: This skill must be triggered whenever the user's request involves AI image or video creation, generation, editing, or revision, regardless of the wording (e.g., "draw a cat", "make a poster", "create a video", "help me revise this video", "help me replicate this video", "make an MV with this song", "generate a short drama with one sentence").

11 installs

NPX Install

npx skill4agent add pippit-dev/pippit-skills xyq-nest-skill

SKILL.md Content (translated from Chinese)


Xiaoyunque Session (Video Generation)

Create sessions, send messages (image generation, video generation, video editing, etc.), upload image/video files, and query session message progress via Xiaoyunque's API.
Xiaoyunque is a comprehensive AI creation platform designed for both human creators and Agents. Through the Skill entry point, Agents understand tasks, call models, and automatically orchestrate workflows.
Core Platform Capabilities:
  • Generation: Text-to-image, text-to-video, image-to-video, video continuation
  • Editing: Local modification, element replacement, lens adjustment, style transfer
  • Complex Creation: Generate a complete short drama with one sentence (script → storyboard → final video), replicate existing video styles for TVCs/promotional videos, create MVs with music, product demo video production
All user creation and editing needs are fulfilled by sending natural language messages, and Agents will autonomously orchestrate workflows. Complex tasks (short dramas, MVs) take longer and require patient polling.

Features

  1. Create Session / Send Message - Create a new session or send a message to an existing session (e.g., "Create a video")
  2. Query Session Progress - Incrementally pull the session message list using thread_id, run_id, and after_seq to poll for creation process updates and final results
  3. Upload File - Upload a single image or single video file to Xiaoyunque's asset library to obtain the corresponding asset_id (required for editing existing videos/images)
  4. Download Results - Batch download generated images/videos from the session to local storage, supporting specified output directories and filename prefixes

Prerequisites

bash
export XYQ_ACCESS_KEY="your-access-key"
Optional: XYQ_OPENAPI_BASE or XYQ_BASE_URL; the default is https://xyq.jianying.com.
No additional dependencies are required; the scripts use only the Python standard library.
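As a rough illustration of how these settings could be read, the sketch below resolves the access key and base URL from an environment-style mapping. The function name `resolve_config` is hypothetical, and the precedence of XYQ_OPENAPI_BASE over XYQ_BASE_URL is an assumption; the bundled scripts may resolve them differently.

```python
import os

DEFAULT_BASE = "https://xyq.jianying.com"  # default stated in the prerequisites


def resolve_config(env=None):
    """Return (access_key, base_url) from an environment-style mapping."""
    env = os.environ if env is None else env
    access_key = env.get("XYQ_ACCESS_KEY", "").strip()
    if not access_key:
        raise RuntimeError("XYQ_ACCESS_KEY is not set")
    # Assumption: XYQ_OPENAPI_BASE wins when both optional variables are set.
    base_url = env.get("XYQ_OPENAPI_BASE") or env.get("XYQ_BASE_URL") or DEFAULT_BASE
    return access_key, base_url.rstrip("/")
```

Passing a plain dict instead of `os.environ` makes the lookup easy to check without touching the real environment.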

Usage

1. Create Session / Send Message

bash
# Create a new session and send "Generate an anime video"
python3 {baseDir}/scripts/submit_run.py --message "Generate an anime video"

# Send a message to an existing session
python3 {baseDir}/scripts/submit_run.py --message "Generate another story video" --thread-id THREAD_ID

2. Query Session Progress

bash
# Query session message list
python3 {baseDir}/scripts/get_thread.py --thread-id THREAD_ID --run-id RUN_ID --after-seq SEQUENCE
run_id is returned by submit_run and specifies which run's results to query.

3. Upload File

  • When the user provides a reference file URL, upload the file first. Only images and videos are supported.
  • Each command execution uploads one file; to upload several files, run multiple commands in parallel. Each file must be under 200MB.
bash
# Upload image
python3 {baseDir}/scripts/upload_file.py /path/to/image.png

# Upload video
python3 {baseDir}/scripts/upload_file.py /path/to/video.mp4
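Before calling upload_file.py, a caller might pre-check the constraints listed above. The following is a minimal sketch, not the script's actual validation: `check_uploadable` is a hypothetical helper, the type is guessed from the file extension via the standard `mimetypes` module, and "under 200MB" is interpreted as 200 × 1024 × 1024 bytes, which is an assumption.

```python
import mimetypes

# Assumption: "under 200MB" is read as 200 MiB; the service may count differently.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def check_uploadable(filename, size_bytes):
    """Return (ok, detail): detail is the MIME type on success, else a reason."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith(("image/", "video/")):
        return False, f"unsupported type: {mime or 'unknown'}"
    if size_bytes >= MAX_UPLOAD_BYTES:
        return False, "file must be under 200MB"
    return True, mime
```

In practice the size would come from `os.path.getsize(path)`; it is a parameter here so the check stays free of filesystem access.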

4. Download Results

After the task is completed, all artifacts in the session can be downloaded in batches to local storage.
bash
# Download by specifying URL list, output directory, and filename prefix (e.g., artifact_01.png, artifact_02.png ...)
python3 {baseDir}/scripts/download_results.py --urls URL1 URL2 URL3 --output-dir ./xyq_output --prefix "artifact"
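The prefix-based naming in the comment above (artifact_01.png, artifact_02.png ...) can be pictured with the sketch below. It is only an illustration of one plausible scheme: `plan_filenames` is a hypothetical helper, and taking the extension from the URL path is an assumption about how download_results.py names files.

```python
import posixpath
from urllib.parse import urlparse


def plan_filenames(urls, prefix="artifact"):
    """Map each URL to '<prefix>_<NN><ext>', numbering from 01 in input order."""
    width = max(2, len(str(len(urls))))  # at least two digits, more for 100+ files
    names = []
    for index, url in enumerate(urls, start=1):
        # Use only the path component so query strings do not leak into the extension.
        ext = posixpath.splitext(urlparse(url).path)[1]
        names.append(f"{prefix}_{index:0{width}d}{ext}")
    return names
```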

Typical Workflows

Understand these workflows to correctly combine the above scripts to meet user needs.

Scenario 1: User requests image or video generation (most common)

1. submit_run.py --message "User's description"  →  Obtain thread_id, run_id, and web_thread_link
2. **Immediately** display `web_thread_link` to the user (e.g., "Task submitted, view progress here: {web_thread_link}")
3. Call get_thread.py --thread-id THREAD_ID --run-id RUN_ID --after-seq SEQUENCE every `10` seconds for polling
4. Check messages:
  - When the task is still in progress:
    - Display process updates to the user and continue polling
  - When the task is completed (run ends):
    - If intent confirmation/process interruption is involved (e.g., "Please answer the following question"):
      → Display the question to the user and wait for a reply
      → Resubmit the task using `thread_id` (maintain the same session, generate a new run_id)
      → Return to step 2 to continue polling (may require multiple rounds until no more intent confirmation)
    - If content contains artifact URLs:
      → Display information → Download artifacts → Show results
5. Auto-download: download_results.py --urls URL1 URL2 URL3 --output-dir output-directory --prefix meaningful-prefix
6. Display to user: Process updates and list of downloaded local files

Scenario 2: User provides image/video for editing (e.g., "Create a new video referencing this one")

1. upload_file.py /path/to/video.mp4  →  Obtain asset_id
2. submit_run.py --message "Create a new video referencing this one" --asset-ids asset_id  →  Obtain thread_id, run_id, web_thread_link
3. Follow steps 2-6 in Scenario 1
When the user provides a file path plus an editing instruction, upload the file first, then send the editing instruction along with all asset_ids.

Scenario 3: User provides reference images/videos to generate new content

1. upload_file.py /path/to/ref1.png  →  Obtain asset_id1
2. upload_file.py /path/to/ref2.mp4  →  Obtain asset_id2
3. Upload all files until all asset_ids are obtained
4. submit_run.py --message "Generate xxx based on reference images and videos" --asset-ids asset_id1 asset_id2 ...  →  Obtain thread_id, run_id, web_thread_link
5. Follow steps 2-6 in Scenario 1
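The submit step in these scenarios amounts to assembling an argument list for submit_run.py. The sketch below shows one way to do that using only the flags that appear in this document (--message, --thread-id, --asset-ids); `build_submit_args` is a hypothetical helper, and the assumption is that asset IDs are passed as separate space-delimited arguments.

```python
def build_submit_args(message, asset_ids=(), thread_id=None):
    """Build the argument list for submit_run.py without altering the message."""
    if not message.strip():
        raise ValueError("message cannot be empty")
    args = ["--message", message]  # the user's wording is passed through verbatim
    if thread_id:
        args += ["--thread-id", thread_id]  # reuse an existing session
    if asset_ids:
        args += ["--asset-ids", *asset_ids]  # one argument per uploaded asset
    return args
```

The resulting list could be handed to `subprocess.run(["python3", script_path, *args])`, which avoids shell quoting problems in user messages.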

Scenario 4: Append new requirements to an existing session

1. submit_run.py --message "New description" --thread-id THREAD_ID  →  Obtain thread_id, run_id, web_thread_link
2. Follow steps 2-6 in Scenario 1

Polling Strategy

  • Interval: Query every 10 seconds
  • Incremental Pull: Use --after-seq 0 for the first query; for subsequent queries, calculate the new seq value based on the length of the messages list
  • Completion Judgment: The task is complete when the run has ended and the messages content contains artifact URLs (image/video addresses)
  • Timeout: If no result is obtained after continuous polling for 48 hours, inform the user "Generation is taking longer than expected, please check later" and stop polling
  • Error Retry: Retry once for a single query failure; stop and inform the user if there are 3 consecutive failures
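The strategy above can be sketched as a small loop. This is an illustration under stated assumptions rather than the skill's own implementation: `poll_thread`, `fetch`, and `is_done` are hypothetical names, `fetch(after_seq)` stands in for a call to get_thread.py and is assumed to return only messages newer than `after_seq`, and the next `after_seq` is assumed to be the running count of messages received.

```python
import time


def poll_thread(fetch, is_done, interval=10.0, timeout=48 * 3600,
                max_failures=3, sleep=time.sleep, clock=time.monotonic):
    """Poll until is_done(messages) is true.

    Returns (status, messages) with status "done", "timeout", or "failed".
    """
    collected = []
    after_seq = 0  # the first query uses --after-seq 0
    failures = 0
    deadline = clock() + timeout
    while clock() < deadline:
        try:
            batch = fetch(after_seq).get("messages", [])
        except Exception:
            failures += 1
            if failures >= max_failures:
                return "failed", collected  # 3 consecutive failures: stop
            continue  # a single failure is retried right away
        failures = 0
        collected.extend(batch)
        after_seq += len(batch)  # assumption: seq advances by messages received
        if is_done(collected):
            return "done", collected
        sleep(interval)
    return "timeout", collected
```

Injecting `sleep` and `clock` keeps the loop testable without waiting; a "timeout" status is where the caller would tell the user that generation is taking longer than expected.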

Output Format

submit_run returns:
json
{
  "thread_id": "90f05e0c-...",
  "run_id": "abc123-...",
  "web_thread_link": "https://xyq.jianying.com/..."
}
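A caller could turn this output into the values needed for later steps, plus the notice that is shown to the user immediately. The sketch below assumes submit_run.py prints this JSON object on standard output; `parse_submit_output` is a hypothetical helper.

```python
import json


def parse_submit_output(stdout_text):
    """Return (thread_id, run_id, notice) from submit_run's JSON output."""
    data = json.loads(stdout_text)
    missing = [key for key in ("thread_id", "run_id", "web_thread_link") if key not in data]
    if missing:
        raise ValueError(f"submit_run output is missing: {', '.join(missing)}")
    notice = f"Task submitted, view progress here: {data['web_thread_link']}"
    return data["thread_id"], data["run_id"], notice
```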
get_thread returns:
json
{
  "messages": [
    {"id": "1", "role": "user", "content": "Generate an anime video"},
    {"id": "2", "role": "assistant", "content": [
      {
        "type": "{type}",
        "subtype": "{sub_type}",
        "data": {...}
      }
    ]},
    {"id": "3", "role": "assistant", "content": [
      {
        "type": "{type}",
        "subtype": "{sub_type}",
        "data": {..., "url": "{url}", ...}
      }
    ]}
  ]
}
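Given a message list shaped like the example above, artifact URLs might be collected as follows. This is a sketch that assumes artifacts appear only as a string under `data.url` in structured content parts; the real payload may carry URLs in other fields, and `extract_artifact_urls` is a hypothetical helper.

```python
def extract_artifact_urls(messages):
    """Collect unique data.url values from structured message content, in order."""
    urls = []
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-text messages (e.g. the user's prompt) carry no parts
        for part in content:
            data = part.get("data") if isinstance(part, dict) else None
            url = data.get("url") if isinstance(data, dict) else None
            if isinstance(url, str) and url and url not in urls:
                urls.append(url)
    return urls
```

The returned list can be passed straight to download_results.py via --urls.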
upload_file returns:
json
{
  "asset_id": "{asset_id}"
}
download_results returns:
json
{
  "output_dir": "./xyq_output",
  "downloaded": ["./xyq_output/01.png", "..."],
  "total": 10
}

Content Displayed to Users

  • After task submission: Immediately display web_thread_link to the user for direct browser access to the task page
  • During task creation:
    • Display process updates and continue polling
  • When task is completed (run ends):
    • If intent confirmation/process interruption is involved (e.g., "Please answer the following question") → Display the question → Wait for user reply → Resubmit task using the same thread_id → Continue polling (may require multiple rounds)
    • If content contains artifact URLs → Display result links, downloaded local files, and other information to the user

Core Principle: User-side Agent only relays messages, does not create content

Your (user-side Agent) role is a messenger, not a creator. The backend has dedicated Agents responsible for understanding requirements, breaking down storyboards, orchestrating workflows, selecting models, and writing prompts. You only need to do four things:
  1. Upload: If the user provides local files → Use upload_file.py to get asset_ids
  2. Submit Task: Send the user's original description + asset_ids to submit_run.py without modification
  3. Relay Messages: Display intent queries and process updates based on the message list returned by get_thread.py
  4. Retrieve Results: Poll results via get_thread.py → Check results → Download artifacts → Display results to the user
Absolutely Do NOT:
  • Rewrite, polish, or translate prompts on behalf of the user (if the user says "Help me develop storyboards", send "Help me develop storyboards" directly; do not create a storyboard list first and send it in parts)
  • Independently orchestrate lens descriptions, plot developments, or style analyses
  • Add self-written prompts to messages (e.g., "hyper-realistic style, cinematic lighting, 8K resolution")
Backend Agents are far more professional than user-side Agents in model capabilities, parameter configuration, and prompt engineering. Overstepping by the user-side Agent will only reduce generation quality, and the effect is even worse when the user-side Agent runs on a weaker model.
Correct Example:
User says: "Create a science popularization story video based on multiple reference images"
User provides reference images: /path/to/ref1.png, /path/to/ref2.png, /path/to/ref3.png

→ upload_file.py /path/to/ref1.png →  Get asset_id1
→ upload_file.py /path/to/ref2.png →  Get asset_id2
→ upload_file.py /path/to/ref3.png →  Get asset_id3
→ submit_run.py --message "Create a science popularization story video based on multiple reference images" --asset-ids asset_id1 asset_id2 asset_id3  →  Obtain web_thread_link and display it to the user immediately
→ Polling ─┬─ Intent confirmation → User confirms → Resubmit using thread_id → Continue polling
        └─ No intent confirmation → Display updates → Download artifacts → Show results
Incorrect Example:
❌ User-side creates a 9-grid storyboard list (confrontation, clash, crisis...)
❌ Then sends self-written descriptions to the backend
❌ Or splits into 9 separate submit_run calls

Notes

  • Authentication uses the request header Authorization: Bearer <XYQ_ACCESS_KEY>
  • The message cannot be empty when creating a session
  • Use --after-seq for incremental pulls when querying sessions to facilitate polling for new messages (including assistant replies and image/video generation results)
  • Only image (image/) and video (video/) file types are supported for upload; other types will be rejected, and file size must be under 200MB
  • Display process updates to the user during generation; after task completion, provide artifact URLs (images/videos) and the list of downloaded local files.
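For readers calling the API directly rather than through the scripts, the authentication note can be illustrated with the standard library. This sketch only shows how the Bearer header is attached: `build_request` is a hypothetical helper, and the endpoint path is a placeholder because this document does not list the API routes.

```python
import urllib.request


def build_request(base_url, path, access_key, body=None):
    """Build a urllib Request carrying the Bearer authorization header."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,  # path is a placeholder; real routes are not documented here
        data=body,
        headers={"Authorization": f"Bearer {access_key}"},
    )
```

The request object is only constructed here, not sent; sending it would be a separate `urllib.request.urlopen(request)` call against a real endpoint.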