Storytelling with genmedia

Use this skill when the user wants a sequence, not a single asset. Load references as needed:
  • references/shot-planning.md
  • references/workflows.md
  • references/examples.md
Load model-routing alongside this skill for default endpoint choices.
The goal is to produce clear story beats and executable genmedia runs. Avoid generic inspiration copy, fake dialogue, and em dashes.

Inputs to collect

Ask only when missing information affects execution.
  • Format: ad, short film, music video, documentary, tutorial, social story.
  • Duration and aspect ratio.
  • Number of shots or allowed range.
  • Main subject, character, product, or location.
  • Continuity anchors: character, product, wardrobe, environment, color.
  • Source media: first frame, reference image, product shot, audio track.
  • Audio needs: narration, music, sound design, transcript, no audio.
  • Preferred model or model family, if the user wants to make the quality, cost, speed, audio, or multi-shot tradeoff themselves.
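A minimal example of a collected brief; every value below is hypothetical:
text
Format: 30-second product ad, 9:16.
Shots: 6, roughly 5 seconds each.
Subject: handmade ceramic mug in a studio kitchen.
Continuity anchors: same mug glaze, warm morning light, oak countertop.
Source media: product shot (mug.png).
Audio: licensed music track, no narration.
Model preference: highest quality; cost is secondary.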

Genmedia workflow

  1. Start from routed endpoint IDs.
    bash
    genmedia models --endpoint_id bytedance/seedance-2.0/text-to-video --json
    genmedia models --endpoint_id bytedance/seedance-2.0/image-to-video --json
    genmedia models --endpoint_id bytedance/seedance-2.0/reference-to-video --json
    genmedia models --endpoint_id fal-ai/kling-video/v3/pro/text-to-video --json
    genmedia models --endpoint_id alibaba/happy-horse/text-to-video --json
    genmedia models --endpoint_id veed/fabric-1.0 --json
    Use text search only as a fallback for discovering endpoints when a needed sequence control is not supported:
    bash
    genmedia models "first frame last frame video generation" --json
    genmedia docs "multi shot video generation" --json
  2. Inspect the schema and pricing before planning exact payloads.
    bash
    genmedia schema <endpoint_id> --json
    genmedia pricing <endpoint_id> --json
  3. Upload references.
    bash
    genmedia upload ./first-frame.png --json
    genmedia upload ./character.png --json
    genmedia upload ./product.png --json
    genmedia upload ./voiceover.wav --json
  4. Choose the sequence route.
    • Highest quality video: start with Seedance 2.0 endpoints from model-routing.
    • Native multi-prompt: use if schema has shot arrays, prompt lists, or timeline fields.
    • First/last frame: use for controlled transitions between key frames.
    • Image-to-video per shot: use for maximum continuity from approved stills.
    • Manual per-shot generation: use when the model only supports one prompt.
    • Audio-first: generate or upload audio, then plan visual shot lengths.
    • Lip-sync or talking avatar: use Fabric 1.0 or Creatify Aurora from model-routing.
  5. Run long jobs async and download every result with a unique template.
    bash
    genmedia run <endpoint_id> \
      --prompt "<shot or sequence prompt>" \
      --async \
      --json
    
    genmedia status <endpoint_id> <request_id> \
      --download "./outputs/story/{request_id}_{index}.{ext}" \
      --json
  6. Return a shot table with endpoint, request ID, prompt summary, local path, and any continuity issues (an example table follows this list). Genmedia downloads clips; it does not replace a timeline editor unless the chosen model returns a complete stitched video.
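A minimal sketch of the returned shot table; the endpoints match the routing above, while the request IDs, prompts, and paths are hypothetical:
text
Shot  Endpoint                                Request ID  Duration  Prompt summary               Local path                       Continuity notes
1     bytedance/seedance-2.0/image-to-video   req_a1b2    5s        Hook: mug steams at dawn     ./outputs/story/req_a1b2_0.mp4   none
2     bytedance/seedance-2.0/image-to-video   req_c3d4    5s        Setup: potter at the wheel   ./outputs/story/req_c3d4_0.mp4   glaze tone drifts slightly, flag for reshoot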

Shot planning

Plan every sequence as beats first:
  1. Hook: immediate visual reason to keep watching.
  2. Setup: who, what, where, and why it matters.
  3. Development: movement, discovery, proof, or escalation.
  4. Turn: reveal, transformation, result, or emotional change.
  5. Close: final image, product memory, CTA-safe frame, or unresolved mood.
For each shot, write:
  • Shot number and duration.
  • Story purpose.
  • Visual prompt.
  • Continuity anchor.
  • Input reference, if any.
  • Genmedia endpoint.
  • Expected output path.
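One filled-in entry, assuming a hypothetical product ad; the reference file, endpoint choice, and path are illustrative only:
text
Shot 1, 5s.
Story purpose: hook.
Visual prompt: steam rises from a glazed ceramic mug on an oak counter at dawn.
Continuity anchor: same mug glaze and warm morning light carried through every shot.
Input reference: mug.png (uploaded product shot).
Genmedia endpoint: bytedance/seedance-2.0/image-to-video.
Expected output path: ./outputs/story/shot-01.mp4.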

Prompt build order

Use this structure for each shot:
text
SHOT [number], [duration]:
[story purpose]. [subject and action]. [location and time]. [camera framing].
[camera movement]. [lighting and color]. [continuity anchor]. [transition or
relationship to previous shot].
Keep one shot to one clear action unless the selected model supports multi-shot or timeline prompting.
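A filled example of this structure, with hypothetical shot content:
text
SHOT 3, 5s:
Development: prove the product in use. A potter lifts the glazed mug and pours coffee.
Sunlit studio kitchen, early morning. Medium close-up on the mug. Slow push-in.
Warm golden light, muted earth tones. Same mug glaze and oak counter as shot 1.
Continues directly from the final frame of shot 2.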

Model routing

  • Highest quality video: bytedance/seedance-2.0/text-to-video, bytedance/seedance-2.0/image-to-video, or bytedance/seedance-2.0/reference-to-video.
  • Fast or lower-cost video: xai/grok-imagine-video/text-to-video or xai/grok-imagine-video/image-to-video.
  • Multi-shot sequence: Seedance 2.0 first, then fal-ai/kling-video/v3/pro/text-to-video, then fal-ai/kling-video/v3/pro/image-to-video, then alibaba/happy-horse/text-to-video or alibaba/happy-horse/image-to-video.
  • Text-heavy keyframes, boards, UI frames, posters, or infographics: openai/gpt-image-2 at quality=high.
  • Talking avatar, native audio, or lip-sync: veed/fabric-1.0, veed/fabric-1.0/text, or fal-ai/creatify/aurora.
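A hedged sketch of applying this routing to a multi-shot sequence. It reuses only the commands shown in the workflow above; the fallback order follows the list, and the prompt text and request ID are placeholders:
bash
# Confirm the preferred endpoint actually supports the sequence controls you need.
genmedia schema bytedance/seedance-2.0/text-to-video --json
genmedia pricing bytedance/seedance-2.0/text-to-video --json

# If the schema lacks shot arrays or timeline fields, check the next route in the list.
genmedia schema fal-ai/kling-video/v3/pro/text-to-video --json

# Run the chosen endpoint asynchronously, then download with a unique template.
genmedia run bytedance/seedance-2.0/text-to-video \
  --prompt "SHOT 1, 5s: ..." \
  --async \
  --json
genmedia status bytedance/seedance-2.0/text-to-video <request_id> \
  --download "./outputs/story/{request_id}_{index}.{ext}" \
  --json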

Quality bar

Before returning:
  • Shot order has a clear narrative function.
  • The first shot is strong enough for the platform.
  • Continuity anchors are repeated without bloating every prompt.
  • Camera motion is varied but not random.
  • Durations add up to the requested runtime (for example, six 5-second shots for a 30-second request).
  • Async request IDs and downloaded files are recorded.
  • The model's actual schema, not assumptions, drove the final command.