screenwriter

Screenwriter Skill

Overview

This skill transforms creative concepts into professional screenplay documents optimized for AI-powered video production pipelines. It bridges the gap between raw story ideas and production-ready scripts by generating structured, visual-rich narratives in industry-standard screenplay format.

Pipeline Position: diverse-content-gen → screenwriter → imagine → arch-v

Key Capabilities:

Convert raw ideas into structured scene-by-scene narratives
Generate rich visual descriptions optimized for image generation
Apply professional screenplay formatting (sluglines, action lines, dialogue)
Output XML-tagged markdown for easy parsing
Optimize pacing for 5-10 minute short films (8-15 scenes typical)

Core Workflow

1. Analyze Input Concept

Extract key story beats from raw ideas
Identify characters, locations, emotional arc
Determine story structure (beginning, middle, end)

2. Generate Scene Breakdown

Convert story beats into discrete scenes
Establish scene count (aim for 8-15 scenes for 5-10 min films)
Define scene purpose and emotional progression

3. Write Professional Screenplay

Apply industry-standard formatting
Write visual-rich action lines
Include dialogue when narratively essential
Maintain consistent character descriptions

4. Output XML-Tagged Markdown

Wrap each scene in XML tags with metadata
Include scene numbers, locations, key visuals
Format for easy pipeline parsing

Screenplay Format Standards

Scene Structure (Master Scene Heading)

Slugline Format:

INT/EXT. LOCATION - TIME

Components:

INT/EXT: Interior or Exterior
LOCATION: Specific place (be descriptive but concise)
TIME: DAY, NIGHT, DAWN, DUSK, CONTINUOUS, or a more specific lighting cue such as GOLDEN HOUR

Examples:

EXT. WASTELAND - DAWN
INT. ABANDONED SUBWAY STATION - NIGHT
EXT. ROOFTOP GARDEN - GOLDEN HOUR

Guidelines:

Always use ALL CAPS for sluglines
Use hyphens to separate elements
Be specific with locations (aids visual generation)
Time should suggest lighting/mood
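
The slugline rules above can be checked mechanically. The following is a minimal sketch, assuming the `INT/EXT. LOCATION - TIME` layout shown in this section; the `Slugline` type, the regular expression, and `parse_slugline` are hypothetical names introduced for illustration and are not part of the skill itself.

```python
import re
from typing import NamedTuple


class Slugline(NamedTuple):
    setting: str   # "INT" or "EXT"
    location: str  # e.g. "ABANDONED SUBWAY STATION"
    time: str      # e.g. "NIGHT" or "GOLDEN HOUR"


# Assumed layout: INT or EXT, a period, the location, " - ", then the time.
SLUGLINE_RE = re.compile(
    r"^(?P<setting>INT|EXT)\. (?P<location>.+?) - (?P<time>[A-Z ]+)$"
)


def parse_slugline(line: str) -> Slugline:
    """Split a slugline into its parts, or raise ValueError if malformed."""
    text = line.strip()
    if text != text.upper():
        raise ValueError(f"slugline must be ALL CAPS: {line!r}")
    match = SLUGLINE_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid slugline: {line!r}")
    return Slugline(match["setting"], match["location"], match["time"])


parsed = parse_slugline("INT. ABANDONED SUBWAY STATION - NIGHT")
print(parsed.location)  # ABANDONED SUBWAY STATION
```

Because the time component may only contain letters and spaces, a location that itself contains a hyphen still parses, with the last ` - ` treated as the separator.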

Action Lines (Visual Description)

Purpose: Describe what the audience sees on screen. This is CRITICAL for image generation.

Visual-Rich Writing Principles:

Show, Don't Tell: Write what's visible, not internal thoughts
Sensory Details: Include lighting, atmosphere, textures, colors
Present Tense: Always write in present tense
Active Voice: Use strong, active verbs
Specific Props: Name objects that matter visually
Atmosphere: Set mood through environmental details

Example - Weak:

A robot walks through the city. It's sad.

Example - Strong:

A BOXY ROBOT (Unit-7, weathered chrome with a single blue optical sensor) rolls through fog-shrouded streets. Neon signs flicker overhead, casting pink and cyan reflections on wet pavement. The robot's movements are slow, deliberate—almost hesitant.

Visual Enhancement Checklist:

Lighting described (natural/artificial, quality, color)
Atmosphere/mood established (fog, rain, dust, clarity)
Character appearance detailed (first appearance only)
Props/objects specified (important visual elements)
Composition suggested (without technical camera direction)
Colors/textures mentioned when relevant

Character Introduction

First Appearance - Detailed:

SARAH (28, sharp eyes, wearing a weathered leather jacket over faded jeans) enters the frame. Her dark hair is pulled back, revealing a small scar above her left eyebrow.

Subsequent Appearances - Brief:

Sarah checks her watch.

Guidelines:

Character names in ALL CAPS on first appearance only in action lines (dialogue cues are always ALL CAPS)
Include: age (if relevant), key physical traits, wardrobe
Focus on visual identifiers for consistent image generation
Avoid excessive detail—just enough for visual consistency

Dialogue (Use Sparingly)

Format:

CHARACTER NAME
(parenthetical - optional)
Dialogue goes here.

Guidelines for Short Films:

Use dialogue ONLY when essential to story
Favor visual storytelling over talking
Keep lines concise (max 3-4 lines per block)
Parentheticals only for critical tone/action
Character names centered, ALL CAPS

Example:

UNIT-7
(robotic voice, soft)
Organic life form detected.
Probability of survival: low.
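
A dialogue block in the layout given under Format (name, optional parenthetical, then the lines) can be assembled with a small helper. This is a sketch only: `format_dialogue` and `MAX_LINES` are hypothetical names, centering of the character name is left to whatever renders the script, and the 4-line cap is one reading of the "max 3-4 lines per block" guideline.

```python
from typing import List, Optional

MAX_LINES = 4  # assumed cap, taken from the upper end of "max 3-4 lines"


def format_dialogue(
    name: str, lines: List[str], parenthetical: Optional[str] = None
) -> str:
    """Return a dialogue block: NAME, optional (parenthetical), then lines."""
    if not lines:
        raise ValueError("a dialogue block needs at least one line")
    if len(lines) > MAX_LINES:
        raise ValueError(f"dialogue block has {len(lines)} lines, max {MAX_LINES}")
    block = [name.upper()]  # character cue is always ALL CAPS
    if parenthetical:
        block.append(f"({parenthetical})")
    block.extend(lines)
    return "\n".join(block)


print(format_dialogue(
    "Unit-7",
    ["Organic life form detected.", "Probability of survival: low."],
    parenthetical="robotic voice, soft",
))
```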

Transitions (Minimal Use)

Common Transitions:

`FADE IN:` - Opening of screenplay only
`CUT TO:` - Scene change (usually implied, use for emphasis)
`SMASH CUT TO:` - Abrupt, jarring transition
`DISSOLVE TO:` - Passage of time
`FADE OUT.` - End of screenplay

Modern Best Practice: Most transitions are IMPLIED. Use sparingly, only for specific narrative effect.

XML Output Format

Scene Tag Structure

Each scene wrapped in XML with metadata for pipeline processing:

```xml
<scene number="1" duration="30-45s">
  <slugline>EXT. WASTELAND - DAWN</slugline>
  <location>Wasteland</location>
  <time>Dawn</time>
  <characters>Unit-7</characters>
  <mood>desolate, lonely</mood>
  <key_visuals>
    <visual>post-apocalyptic wasteland with ruined skyscrapers</visual>
    <visual>boxy robot with single blue optical sensor</visual>
    <visual>dust and smog atmosphere, weak pale sun</visual>
  </key_visuals>
  <action>
Gray dust covers everything. Skeletal remains of skyscrapers pierce the horizon. The sun, pale and weak, struggles through thick smog.

A ROBOT (Unit-7, boxy frame with single blue optical sensor) rolls across cracked asphalt. Its treads leave marks in the dust, the only sign of life.

The robot stops at a pile of rubble, extending a mechanical arm to sort through debris. Methodical. Purposeful. Lonely.
  </action>
</scene>
```
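
A downstream stage could read scenes like the one above with Python's standard `xml.etree.ElementTree`. The sketch below is illustrative rather than the pipeline's actual parser: `extract_scenes`, `parse_scene`, and the shortened `SAMPLE` document are hypothetical, and it assumes each `<scene>` block is well-formed XML embedded in otherwise free-form markdown.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: scenes are not nested, so a non-greedy match finds each block.
SCENE_RE = re.compile(r"<scene\b.*?</scene>", re.DOTALL)

SAMPLE = """# Draft screenplay

<scene number="1" duration="30-45s">
  <slugline>EXT. WASTELAND - DAWN</slugline>
  <characters>Unit-7</characters>
  <mood>desolate, lonely</mood>
  <key_visuals>
    <visual>boxy robot with single blue optical sensor</visual>
    <visual>dust and smog atmosphere, weak pale sun</visual>
  </key_visuals>
  <action>
A ROBOT rolls across cracked asphalt.
  </action>
</scene>
"""


def extract_scenes(markdown: str) -> list:
    """Return the raw XML text of every <scene> block in the document."""
    return SCENE_RE.findall(markdown)


def parse_scene(xml_text: str) -> dict:
    """Turn one <scene> block into a plain dictionary."""
    root = ET.fromstring(xml_text)
    names = root.findtext("characters", default="")
    return {
        "number": int(root.attrib["number"]),
        "duration": root.attrib["duration"],
        "slugline": root.findtext("slugline", default="").strip(),
        "characters": [n.strip() for n in names.split(",") if n.strip()],
        "mood": root.findtext("mood", default="").strip(),
        "key_visuals": [v.text.strip() for v in root.iter("visual") if v.text],
        "action": root.findtext("action", default="").strip(),
    }


scenes = [parse_scene(block) for block in extract_scenes(SAMPLE)]
print(scenes[0]["slugline"])  # EXT. WASTELAND - DAWN
```

One caveat with a strict XML parser: action or dialogue text containing a literal `&` or `<` has to be escaped when the scene is written, otherwise `ET.fromstring` raises a `ParseError`.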

Metadata Fields

`number`: Scene sequence number (1, 2, 3...)
`duration`: Estimated screen time (for 5-10 min total)
`slugline`: Master scene heading
`location`: Extracted location name
`time`: Time of day
`characters`: Comma-separated character list
`mood`: Emotional tone/atmosphere
`key_visuals`: Array of specific visual elements for image generation
`action`: The full action/description text
`dialogue` (optional): Character dialogue if present
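
To show how these fields fit together, here is one possible way to emit a scene element using the standard library. `build_scene` is a hypothetical helper; placing `number` and `duration` as attributes and the rest as child elements follows the scene example in this document, and building through `ElementTree` means special characters are escaped automatically.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Sequence


def build_scene(
    number: int,
    duration: str,
    slugline: str,
    location: str,
    time: str,
    characters: Sequence[str],
    mood: str,
    key_visuals: Sequence[str],
    action: str,
    dialogue: Optional[str] = None,
) -> str:
    """Serialize one scene; `dialogue` is omitted entirely when not given."""
    scene = ET.Element("scene", {"number": str(number), "duration": duration})
    simple_fields = [
        ("slugline", slugline),
        ("location", location),
        ("time", time),
        ("characters", ", ".join(characters)),  # comma-separated list
        ("mood", mood),
    ]
    for tag, text in simple_fields:
        ET.SubElement(scene, tag).text = text
    visuals = ET.SubElement(scene, "key_visuals")
    for item in key_visuals:
        ET.SubElement(visuals, "visual").text = item
    ET.SubElement(scene, "action").text = action
    if dialogue is not None:
        ET.SubElement(scene, "dialogue").text = dialogue
    return ET.tostring(scene, encoding="unicode")


xml_text = build_scene(
    1, "30-45s", "EXT. WASTELAND - DAWN", "Wasteland", "Dawn",
    ["Unit-7"], "desolate, lonely",
    ["ruined skyscrapers", "boxy robot", "weak pale sun"],
    "Dust & smog drift past.",
)
print(xml_text)
```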

Short Film Structure (5-10 Minutes)

Scene Count Guidelines

5 minutes: 6-10 scenes
7 minutes: 10-12 scenes
10 minutes: 12-15 scenes

Average: ~30-60 seconds per scene
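
Per-scene estimates can be summed to check a draft against the 5-10 minute target. The sketch below assumes durations are written as in the scene example (`"30-45s"`, or a single value such as `"45s"`); the function names and the overlap rule in `fits_target` are illustrative choices, not a defined part of the pipeline.

```python
import re
from typing import Iterable, Tuple

# Assumed format: "<low>-<high>s" or "<value>s", in whole seconds.
DURATION_RE = re.compile(r"^(\d+)(?:-(\d+))?s$")


def parse_duration(text: str) -> Tuple[int, int]:
    """Return (low, high) seconds for a duration string."""
    match = DURATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) is not None else low
    if high < low:
        raise ValueError(f"duration range is reversed: {text!r}")
    return low, high


def total_runtime(durations: Iterable[str]) -> Tuple[int, int]:
    """Sum the low and high estimates across all scenes."""
    pairs = [parse_duration(d) for d in durations]
    return sum(p[0] for p in pairs), sum(p[1] for p in pairs)


def fits_target(runtime: Tuple[int, int], min_s: int = 300, max_s: int = 600) -> bool:
    """True when the estimated range overlaps the 5-10 minute window."""
    low, high = runtime
    return low <= max_s and high >= min_s


runtime = total_runtime(["30-45s"] * 10)
print(runtime, fits_target(runtime))  # (300, 450) True
```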

Three-Act Structure (Compressed)

Act 1 - Setup (20-25%): 2-3 scenes

Establish world, character, situation
Inciting incident

Act 2 - Confrontation (50-60%): 4-8 scenes

Development, obstacles, rising tension
Midpoint twist or escalation

Act 3 - Resolution (20-25%): 2-3 scenes

Climax and resolution
Emotional payoff
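
As a rough worked example of the split above, a total scene count can be divided across the three acts. This is only one possible allocation: `allocate_acts` is a hypothetical helper that gives Acts 1 and 3 about a quarter of the scenes each, clamped to the 2-3 scene range, and assigns the remainder to Act 2.

```python
from typing import Tuple


def allocate_acts(scene_count: int) -> Tuple[int, int, int]:
    """Return (act1, act2, act3) scene counts for a short film."""
    if scene_count < 6:
        raise ValueError("expected at least 6 scenes")
    # Roughly a quarter of the scenes, kept within the 2-3 scene range.
    edge = max(2, min(3, scene_count // 4))
    return edge, scene_count - 2 * edge, edge


for total in (8, 10, 12):
    print(total, allocate_acts(total))
# 8 (2, 4, 2)
# 10 (2, 6, 2)
# 12 (3, 6, 3)
```

At the extremes the result drifts from the stated ranges: 6 scenes gives Act 2 only a third of the film, and 15 scenes gives Act 2 nine scenes, so those cases are better adjusted by hand.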

Pacing Tips

Open strong: Hook audience in first 10-15 seconds
Visual variety: Alternate between wide/close, action/stillness
Emotional beats: Each scene should shift emotional state
Build tension: Escalate stakes scene-by-scene
Satisfying end: Clear resolution, even if bittersweet

Best Practices

For Pipeline Integration

Consistent naming: Use same character names throughout
Rich visuals: Every scene needs 3-5 key_visuals for image generation
Parseable format: Maintain strict XML structure
Duration estimates: Help pipeline plan total video length
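
Two of these checks, the 3-5 `key_visuals` rule and consistent character naming, lend themselves to a simple lint pass. The sketch below assumes scenes have already been loaded into dictionaries with `number`, `key_visuals`, and `characters` keys; that shape and the name `lint_scenes` are assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def lint_scenes(scenes: List[dict]) -> List[str]:
    """Return human-readable warnings for the pipeline-integration rules."""
    warnings: List[str] = []
    spellings: Dict[str, Set[str]] = defaultdict(set)
    for scene in scenes:
        count = len(scene["key_visuals"])
        if not 3 <= count <= 5:
            warnings.append(
                f"scene {scene['number']}: {count} key_visuals (expected 3-5)"
            )
        for name in scene["characters"]:
            # Names that differ only in case or punctuation share a key.
            key = re.sub(r"[^a-z0-9]", "", name.lower())
            spellings[key].add(name)
    for variants in spellings.values():
        if len(variants) > 1:
            warnings.append(
                "inconsistent character name: " + ", ".join(sorted(variants))
            )
    return warnings


draft = [
    {"number": 1, "key_visuals": ["a", "b", "c"], "characters": ["Unit-7"]},
    {"number": 2, "key_visuals": ["a", "b"], "characters": ["UNIT 7"]},
]
for warning in lint_scenes(draft):
    print(warning)
```

This only catches spelling variants of the same name; a character referred to by two genuinely different names (for example a nickname) would need a manual alias list.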

For Quality Output

Visual storytelling: Show emotions through actions, not dialogue
Specific details: "weathered chrome" beats "old metal"
Atmospheric writing: Set mood through environment
Lean prose: Each word should serve the image

Common Pitfalls to Avoid

❌ Vague descriptions: "A person walks" → ✅ "A weathered woman in her 50s trudges through snow"
❌ Telling emotions: "She feels sad" → ✅ "Tears streak her dusty cheeks"
❌ Camera directions: "CLOSE UP ON" → ✅ "The crack in the glass spreads"
❌ Over-dialogue: Short films need visual storytelling
❌ Inconsistent character names: Stick to ONE name per character
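
The camera-direction pitfall is the easiest of these to flag automatically. A minimal sketch follows; the phrase list is an assumed starting set of common camera directions, not an exhaustive or official one, and `find_camera_directions` is a hypothetical name.

```python
import re
from typing import List

# Assumed, non-exhaustive list of camera-direction phrases to flag.
CAMERA_RE = re.compile(
    r"\b(CLOSE[ -]?UP ON|ANGLE ON|ZOOM IN|PAN TO)\b", re.IGNORECASE
)


def find_camera_directions(action: str) -> List[str]:
    """Return camera-direction phrases found in an action line, uppercased."""
    return [match.upper() for match in CAMERA_RE.findall(action)]


print(find_camera_directions("CLOSE UP ON the cracked glass."))
print(find_camera_directions("The crack in the glass spreads."))
```

Vague descriptions and told emotions are harder to detect with patterns alone and are better left to review.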

Additional Resources

Pipeline Integration Guide

For detailed guidance on metadata standards, visual optimization, and integration with imagine/arch-v:

references/pipeline-integration.md

Advanced Techniques

For sophisticated screenwriting techniques, camera movement hints, and pacing optimization:

references/advanced-techniques.md
