
AlayaRenderer — Generative World Renderer

Skill by ara.so — Daily 2026 Skills collection.
AlayaRenderer is a two-stage framework for high-quality video rendering:
  1. Inverse Renderer (RGB → G-buffers): Extracts albedo, normal, depth, roughness, and metallic maps from RGB video using a fine-tuned Cosmos-Transfer1-DiffusionRenderer 7B model.
  2. Game Editing (G-buffers + Text → Stylized RGB): Synthesizes photorealistic, stylized RGB video from G-buffer inputs using a fine-tuned Wan2.1 1.3B model via DiffSynth-Studio.
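
The two stages compose into a single data-flow contract: stage 1 produces a fixed set of five G-buffer channels that stage 2 consumes alongside a text prompt. A minimal Python sketch of that contract, with `run_inverse_renderer` and `run_game_editing` as hypothetical stand-ins for the real model calls:

```python
# Hypothetical stand-ins for the two model stages; real inference lives in
# inverse_renderer/ and game_editing/.
GBUFFER_CHANNELS = ("albedo", "normal", "depth", "roughness", "metallic")

def run_inverse_renderer(rgb_video: str) -> dict:
    """Stage 1 (stub): RGB video -> one output stream per G-buffer channel."""
    return {ch: f"{rgb_video}:{ch}" for ch in GBUFFER_CHANNELS}

def run_game_editing(gbuffers: dict, prompt: str) -> str:
    """Stage 2 (stub): G-buffers + text prompt -> stylized RGB video."""
    missing = [ch for ch in GBUFFER_CHANNELS if ch not in gbuffers]
    if missing:
        raise ValueError(f"missing G-buffer channels: {missing}")
    return f"stylized video ({prompt})"

gbuffers = run_inverse_renderer("gameplay.mp4")
stylized = run_game_editing(gbuffers, "snowy winter scene")
```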


Installation

Clone the Repository

```bash
git clone --recurse-submodules https://github.com/ShandaAI/AlayaRenderer.git
cd AlayaRenderer
```

Important: use `--recurse-submodules` — DiffSynth-Studio is a git submodule required for Game Editing.

Two Separate Conda Environments (Recommended)

The two models have conflicting dependencies. Use separate environments:

Environment 1: Inverse Renderer

```bash
conda create -n inverse_renderer python=3.10 -y
conda activate inverse_renderer
cd inverse_renderer
```

Follow inverse_renderer/ instructions for Cosmos-Transfer1 setup

Environment 2: Game Editing

```bash
conda create -n game_editing python=3.10 -y
conda activate game_editing
cd game_editing
```

Follow DiffSynth-Studio setup instructions


---

Model Weights

| Model | Base Model | Size | HuggingFace Link |
|---|---|---|---|
| Inverse Renderer | Cosmos-Transfer1-DiffusionRenderer 7B | ~7B params | `Brian9999/world_inverse_renderer` |
| Game Editing | Wan2.1 1.3B | ~1.3B params | `Brian9999/stylerenderer` |

Download and Place Weights

Inverse Renderer — replace the base checkpoint

```bash
huggingface-cli download Brian9999/world_inverse_renderer \
    --local-dir inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B
```

Game Editing — place in game_editing models directory

```bash
mkdir -p game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer
huggingface-cli download Brian9999/stylerenderer \
    --local-dir game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer
```

---

Inverse Renderer Usage

The inverse renderer decomposes an RGB video into 5 G-buffer channels: albedo, normal, depth, roughness, metallic.

Setup

```bash
cd inverse_renderer
```

Follow Cosmos-Transfer1-DiffusionRenderer environment setup

Ensure checkpoint is at:

```
inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B/
```


Inference

Refer to the `inverse_renderer/` subdirectory for the full inference script. The general pattern follows Cosmos-Transfer1-DiffusionRenderer conventions:

```python
# inverse_renderer/run_inverse.py (typical pattern)
import torch
from pathlib import Path

# Input: path to RGB video
input_video = "path/to/rgb_video.mp4"
output_dir = "outputs/gbuffers/"
```

The model outputs 5 synchronized channels:

- albedo (diffuse color)
- normal (surface orientation)
- depth (scene geometry)
- roughness (surface roughness)
- metallic (metallic property)
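
Because downstream conditioning consumes the channels together, the five frame sequences must stay frame-aligned. A small sketch (dummy data, not the repository's API) that groups per-channel sequences into per-frame tuples and flags any drift:

```python
CHANNELS = ("albedo", "normal", "depth", "roughness", "metallic")

def synchronize(gbuffers: dict) -> list:
    """Zip the five per-channel frame sequences into per-frame tuples,
    failing loudly if any channel has a different length."""
    lengths = {ch: len(gbuffers[ch]) for ch in CHANNELS}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"channel lengths differ: {lengths}")
    return list(zip(*(gbuffers[ch] for ch in CHANNELS)))

# Dummy stand-in frames; real inputs are per-frame image buffers.
dummy = {ch: [f"{ch}_{i}" for i in range(3)] for ch in CHANNELS}
frames = synchronize(dummy)
```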


---


Game Editing Usage

Quick Start — CLI Inference

```bash
cd game_editing

CUDA_VISIBLE_DEVICES=0 python \
    examples/wanvideo/model_inference/inference_gbuffer_caption.py \
    --checkpoint models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors \
    --gpu 0 \
    --style snowy_winter \
    --prompt "the scene is set in a frozen, snow-covered environment under cold, pale winter light with falling snowflakes, creating a silent and ethereal winter wonderland atmosphere." \
    --gbuffer_dir test_dataset \
    --save_dir outputs/ \
    --num_frames 81 \
    --height 480 \
    --width 832
```
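
To render the same G-buffers under several presets, the invocation can be built programmatically. In this sketch the script and checkpoint paths mirror the quick-start command above, the prompts are placeholders, and the commands are printed rather than executed:

```python
import shlex

# Paths and flags mirror the quick-start command above.
SCRIPT = "examples/wanvideo/model_inference/inference_gbuffer_caption.py"
CHECKPOINT = "models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors"

def build_cmd(style, prompt, gbuffer_dir="test_dataset", save_dir="outputs",
              num_frames=81, height=480, width=832, gpu=0):
    """Assemble the CLI argument list for one style preset."""
    return ["python", SCRIPT,
            "--checkpoint", CHECKPOINT,
            "--gpu", str(gpu),
            "--style", style,
            "--prompt", prompt,
            "--gbuffer_dir", gbuffer_dir,
            "--save_dir", f"{save_dir}/{style}",
            "--num_frames", str(num_frames),
            "--height", str(height),
            "--width", str(width)]

for style in ("snowy_winter", "rainy", "night", "sunset"):
    print(shlex.join(build_cmd(style, f"placeholder prompt for {style}")))
```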

CLI Parameters

| Parameter | Description | Example |
|---|---|---|
| `--checkpoint` | Path to fine-tuned `.safetensors` weights | `models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors` |
| `--gpu` | GPU device index | `0` |
| `--style` | Named style preset | `snowy_winter`, `rainy`, `night`, `sunset` |
| `--prompt` | Text description of target lighting/atmosphere | See examples below |
| `--gbuffer_dir` | Directory containing G-buffer input frames/video | `test_dataset` |
| `--save_dir` | Output directory for rendered video | `outputs/` |
| `--num_frames` | Number of frames to generate (must be `8n+1`) | `81` |
| `--height` | Output height in pixels | `480` |
| `--width` | Output width in pixels | `832` |

G-buffer Directory Structure

```
test_dataset/
├── albedo/
│   ├── frame_0000.png
│   ├── frame_0001.png
│   └── ...
├── normal/
│   ├── frame_0000.png
│   └── ...
├── depth/
│   ├── frame_0000.png
│   └── ...
├── roughness/
│   ├── frame_0000.png
│   └── ...
└── metallic/
    ├── frame_0000.png
    └── ...
```
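
Before running inference it is worth confirming the layout above is complete and frame-aligned. A small validator (an illustration, not part of the repository) that checks each channel directory and compares per-channel frame counts:

```python
from pathlib import Path

CHANNELS = ("albedo", "normal", "depth", "roughness", "metallic")

def check_gbuffer_dir(root):
    """Return per-channel frame counts for a test_dataset-style directory,
    raising if a channel is missing or the channels are out of sync."""
    root = Path(root)
    counts = {}
    for ch in CHANNELS:
        ch_dir = root / ch
        if not ch_dir.is_dir():
            raise FileNotFoundError(f"missing channel directory: {ch_dir}")
        counts[ch] = len(list(ch_dir.glob("frame_*.png")))
    if len(set(counts.values())) != 1:
        raise ValueError(f"frame counts differ across channels: {counts}")
    return counts
```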

Style Prompt Examples


Cyberpunk night scene

```bash
--style night \
--prompt "neon-lit urban environment at night with rain-slicked streets reflecting colorful neon signs, creating a cyberpunk noir atmosphere"
```

Golden hour / sunset

```bash
--style sunset \
--prompt "warm golden hour lighting with long shadows and a glowing amber sky, soft cinematic atmosphere"
```

Rainy urban

```bash
--style rainy \
--prompt "overcast rainy day with wet surfaces, soft diffuse lighting, and atmospheric fog creating a moody cinematic look"
```

Fantasy / stylized

```bash
--style fantasy \
--prompt "magical forest environment with bioluminescent plants, ethereal blue-green lighting, and mystical particle effects"
```

Foggy morning

```bash
--style foggy \
--prompt "early morning dense fog with soft diffused light creating a mysterious and quiet atmosphere"
```

Multi-GPU Inference


Run on specific GPU

```bash
CUDA_VISIBLE_DEVICES=1 python \
    examples/wanvideo/model_inference/inference_gbuffer_caption.py \
    --checkpoint models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors \
    --gpu 1 \
    --style rainy \
    --prompt "heavy rainfall with dark storm clouds and dramatic lightning in the distance" \
    --gbuffer_dir my_gbuffers \
    --save_dir outputs/rainy_scene \
    --num_frames 81 --height 480 --width 832
```

---

Full Pipeline: RGB Video → Stylized Output


Step 1: Extract G-buffers from RGB video (Inverse Renderer env)

```bash
conda activate inverse_renderer
cd inverse_renderer
python run_inverse.py \
    --input path/to/gameplay_video.mp4 \
    --output_dir ../game_editing/test_dataset/
```

Step 2: Apply game editing style (Game Editing env)

```bash
conda activate game_editing
cd ../game_editing
CUDA_VISIBLE_DEVICES=0 python \
    examples/wanvideo/model_inference/inference_gbuffer_caption.py \
    --checkpoint models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors \
    --gpu 0 \
    --style snowy_winter \
    --prompt "frozen tundra with blizzard conditions, pale blue-white lighting and drifting snow" \
    --gbuffer_dir test_dataset \
    --save_dir outputs/final_render \
    --num_frames 81 --height 480 --width 832
```

---

Dataset Overview

The AlayaRenderer dataset (release pending) features:

- 4M+ frames at 720p / 30 FPS
- 6 synchronized channels: RGB + albedo, normal, depth, metallic, roughness
- 40 hours from Cyberpunk 2077 and Black Myth: Wukong
- Average clip length: 8 minutes, up to 53 minutes continuous
- Weather variants: sunny, rainy, foggy, night, sunset
- Motion blur variant via sub-frame interpolation


Architecture Summary

```
RGB Video Input
┌─────────────────────────────────────┐
│  Inverse Renderer                   │
│  (Cosmos-Transfer1 7B fine-tuned)   │
│  RGB → [albedo, normal, depth,      │
│          roughness, metallic]       │
└─────────────────┬───────────────────┘
                  │  G-buffers
┌─────────────────────────────────────┐
│  Game Editing                       │
│  (Wan2.1 1.3B fine-tuned)           │
│  G-buffers + Text Prompt            │
│  → Stylized RGB Video               │
└─────────────────────────────────────┘
```


Troubleshooting

Submodule not found / DiffSynth-Studio missing


If you cloned without `--recurse-submodules`:

```bash
git submodule update --init --recursive
```

CUDA Out of Memory

- Reduce `--num_frames` (try `41` instead of `81`)
- Reduce resolution: `--height 320 --width 576`
- Pin the process to a free GPU with `CUDA_VISIBLE_DEVICES=0` and make sure no other processes are using it

`num_frames` must follow the `8n+1` pattern


Valid values: `9, 17, 25, 33, 41, 49, 57, 65, 73, 81`

Valid

```bash
--num_frames 81   # 8*10 + 1 ✓
--num_frames 41   # 8*5 + 1 ✓
```

Invalid

```bash
--num_frames 80   # ✗
--num_frames 60   # ✗
```
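
The constraint can be checked, and a bad value rounded down to the nearest valid one, with a couple of lines; the floor of `9` is an assumption taken from the valid-value list above:

```python
def is_valid_num_frames(n: int) -> bool:
    """True when n has the form 8k + 1 for k >= 1 (9, 17, ..., 81, ...)."""
    return n >= 9 and (n - 1) % 8 == 0

def nearest_valid_num_frames(n: int) -> int:
    """Round n down to the closest valid frame count (never below 9)."""
    return max(9, ((n - 1) // 8) * 8 + 1)
```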

Checkpoint not found


Verify checkpoint placement

```bash
ls game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors
ls inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B/
```
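
The same check can be scripted; the paths below mirror the "Download and Place Weights" steps, and `repo_root` is assumed to be the clone root:

```python
from pathlib import Path

# Expected weight locations from the "Download and Place Weights" steps.
EXPECTED_WEIGHTS = [
    "game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors",
    "inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B",
]

def missing_weights(repo_root="."):
    """Return the expected weight paths that are absent under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_WEIGHTS if not (root / p).exists()]
```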

Version conflicts between models

Always use the two separate conda environments (`inverse_renderer` and `game_editing`). Do not install both models' dependencies in one environment.


Citation

```bibtex
@article{huang2026generativeworldrenderer,
    title={Generative World Renderer},
    author={Zheng-Hui Huang and Zhixiang Wang and Jiaming Tan and Ruihan Yu and Yidan Zhang and Bo Zheng and Yu-Lun Liu and Yung-Yu Chuang and Kaipeng Zhang},
    journal={arXiv preprint arXiv:2604.02329},
    year={2026}
}
```