alayarenderer-generative-world
AlayaRenderer: Generative World Renderer
Skill by ara.so — Daily 2026 Skills collection.
AlayaRenderer is a two-stage framework for high-quality video rendering:
- Inverse Renderer (RGB → G-buffers): Extracts albedo, normal, depth, roughness, and metallic maps from RGB video using a fine-tuned Cosmos-Transfer1-DiffusionRenderer 7B model.
- Game Editing (G-buffers + Text → Stylized RGB): Synthesizes photorealistic, stylized RGB video from G-buffer inputs using a fine-tuned Wan2.1 1.3B model via DiffSynth-Studio.
Installation
Clone the Repository
bash
git clone --recurse-submodules https://github.com/ShandaAI/AlayaRenderer.git
cd AlayaRenderer
Important: Use `--recurse-submodules`. DiffSynth-Studio is a git submodule required for Game Editing.
Two Separate Conda Environments (Recommended)
The two models have conflicting dependencies. Use separate environments:
bash
# Run each block from the repository root.
# Environment 1: Inverse Renderer
conda create -n inverse_renderer python=3.10 -y
conda activate inverse_renderer
cd inverse_renderer
# Follow inverse_renderer/ instructions for Cosmos-Transfer1 setup

# Environment 2: Game Editing
conda create -n game_editing python=3.10 -y
conda activate game_editing
cd game_editing
# Follow DiffSynth-Studio setup instructions
---
Model Weights
| Model | Base Model | Size | HuggingFace Link |
|---|---|---|---|
| Inverse Renderer | Cosmos-Transfer1-DiffusionRenderer 7B | ~7B params | Brian9999/world_inverse_renderer |
| Game Editing | Wan2.1 1.3B | ~1.3B params | Brian9999/stylerenderer |
Download and Place Weights
bash
# Inverse Renderer: replace the base checkpoint
huggingface-cli download Brian9999/world_inverse_renderer \
  --local-dir inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B

# Game Editing: place in game_editing models directory
mkdir -p game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer
huggingface-cli download Brian9999/stylerenderer \
  --local-dir game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer
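The two downloads can also be scripted from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the script is run from the repository root; `WEIGHTS` and `download_weights` are hypothetical names introduced here for illustration, while the repository IDs and target folders are copied from the commands in this section.

```python
# Repository IDs and target folders copied from the CLI commands in this section.
WEIGHTS = {
    "Brian9999/world_inverse_renderer":
        "inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B",
    "Brian9999/stylerenderer":
        "game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer",
}


def download_weights(weights=WEIGHTS):
    """Download each repository into its target folder (hypothetical helper)."""
    # Imported lazily so the mapping can be inspected without the package installed.
    from huggingface_hub import snapshot_download

    for repo_id, local_dir in weights.items():
        snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_weights()` starts both downloads, so it is deliberately not invoked in the snippet itself.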
---
Inverse Renderer Usage
The inverse renderer decomposes an RGB video into 5 G-buffer channels: albedo, normal, depth, roughness, metallic.
Setup
bash
cd inverse_renderer
# Follow Cosmos-Transfer1-DiffusionRenderer environment setup
# Ensure checkpoint is at:
# inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B/
Inference
Refer to the `inverse_renderer/` subdirectory for the full inference script. The general pattern follows Cosmos-Transfer1-DiffusionRenderer conventions:
python
# inverse_renderer/run_inverse.py (typical pattern)
import torch
from pathlib import Path

# Input: path to RGB video
input_video = "path/to/rgb_video.mp4"
output_dir = "outputs/gbuffers/"

# The model outputs 5 synchronized channels:
# - albedo (diffuse color)
# - normal (surface orientation)
# - depth (scene geometry)
# - roughness (surface roughness)
# - metallic (metallic property)
---
Game Editing Usage
Quick Start — CLI Inference
bash
cd game_editing
CUDA_VISIBLE_DEVICES=0 python \
  examples/wanvideo/model_inference/inference_gbuffer_caption.py \
  --checkpoint models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors \
  --gpu 0 \
  --style snowy_winter \
  --prompt "the scene is set in a frozen, snow-covered environment under cold, pale winter light with falling snowflakes, creating a silent and ethereal winter wonderland atmosphere." \
  --gbuffer_dir test_dataset \
  --save_dir outputs/ \
  --num_frames 81 \
  --height 480 \
  --width 832
CLI Parameters
| Parameter | Description | Example |
|---|---|---|
| `--checkpoint` | Path to the fine-tuned `model.safetensors` checkpoint | `models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors` |
| `--gpu` | GPU device index | `0` |
| `--style` | Named style preset | `snowy_winter` |
| `--prompt` | Text description of target lighting/atmosphere | See examples below |
| `--gbuffer_dir` | Directory containing G-buffer input frames/video | `test_dataset` |
| `--save_dir` | Output directory for rendered video | `outputs/` |
| `--num_frames` | Number of frames to generate (must be `8n+1`) | `81` |
| `--height` | Output height in pixels | `480` |
| `--width` | Output width in pixels | `832` |
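When several renders are scripted, the same flags can be assembled in Python instead of being typed by hand. This is a minimal sketch, assuming the script path and flag names shown in the Quick Start command; `build_inference_command` is a hypothetical helper introduced here, not part of the repository.

```python
import shlex

# Script path copied from the Quick Start command.
SCRIPT = "examples/wanvideo/model_inference/inference_gbuffer_caption.py"


def build_inference_command(checkpoint, style, prompt, gbuffer_dir, save_dir,
                            gpu=0, num_frames=81, height=480, width=832):
    """Return the argv list for one Game Editing run (hypothetical helper)."""
    if num_frames % 8 != 1:
        raise ValueError(f"num_frames must be of the form 8n+1, got {num_frames}")
    return [
        "python", SCRIPT,
        "--checkpoint", str(checkpoint),
        "--gpu", str(gpu),
        "--style", style,
        "--prompt", prompt,
        "--gbuffer_dir", str(gbuffer_dir),
        "--save_dir", str(save_dir),
        "--num_frames", str(num_frames),
        "--height", str(height),
        "--width", str(width),
    ]


cmd = build_inference_command(
    checkpoint="models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors",
    style="sunset",
    prompt="warm golden hour lighting with long shadows and a glowing amber sky",
    gbuffer_dir="test_dataset",
    save_dir="outputs/",
)
print(shlex.join(cmd))  # shell-quoted form of the command
```

To actually launch the render, the list can be passed to `subprocess.run(cmd, cwd="game_editing", check=True)` from the repository root.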
G-buffer Directory Structure
test_dataset/
├── albedo/
│   ├── frame_0000.png
│   ├── frame_0001.png
│   └── ...
├── normal/
│   ├── frame_0000.png
│   └── ...
├── depth/
│   ├── frame_0000.png
│   └── ...
├── roughness/
│   ├── frame_0000.png
│   └── ...
└── metallic/
    ├── frame_0000.png
    └── ...
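Before launching a render it can help to confirm that a G-buffer directory matches this layout. The check below is a minimal sketch: `check_gbuffer_dir` is a hypothetical helper, and the requirement that every channel holds the same number of frames is an assumption based on the channels being described as synchronized.

```python
from pathlib import Path

# The five channel folders from the directory layout above.
CHANNELS = ("albedo", "normal", "depth", "roughness", "metallic")


def check_gbuffer_dir(root):
    """Return a list of problems found under `root`; an empty list means the layout looks fine."""
    root = Path(root)
    problems = []
    counts = {}
    for channel in CHANNELS:
        folder = root / channel
        if not folder.is_dir():
            problems.append(f"missing folder: {channel}/")
            continue
        counts[channel] = len(list(folder.glob("frame_*.png")))
        if counts[channel] == 0:
            problems.append(f"no frame_*.png files in {channel}/")
    # Assumption: synchronized channels should contain the same number of frames.
    if len(set(counts.values())) > 1:
        problems.append(f"frame counts differ across channels: {counts}")
    return problems


for problem in check_gbuffer_dir("test_dataset"):
    print(problem)
```

An empty result only says the folder names and frame counts line up; it does not validate image contents or resolution.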
Style Prompt Examples
bash
# Cyberpunk night scene
--style night \
  --prompt "neon-lit urban environment at night with rain-slicked streets reflecting colorful neon signs, creating a cyberpunk noir atmosphere"

# Golden hour / sunset
--style sunset \
  --prompt "warm golden hour lighting with long shadows and a glowing amber sky, soft cinematic atmosphere"

# Rainy urban
--style rainy \
  --prompt "overcast rainy day with wet surfaces, soft diffuse lighting, and atmospheric fog creating a moody cinematic look"

# Fantasy / stylized
--style fantasy \
  --prompt "magical forest environment with bioluminescent plants, ethereal blue-green lighting, and mystical particle effects"

# Foggy morning
--style foggy \
  --prompt "early morning dense fog with soft diffused light creating a mysterious and quiet atmosphere"
Multi-GPU Inference
bash
# Run on specific GPU
CUDA_VISIBLE_DEVICES=1 python \
  examples/wanvideo/model_inference/inference_gbuffer_caption.py \
  --checkpoint models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors \
  --gpu 1 \
  --style rainy \
  --prompt "heavy rainfall with dark storm clouds and dramatic lightning in the distance" \
  --gbuffer_dir my_gbuffers \
  --save_dir outputs/rainy_scene \
  --num_frames 81 --height 480 --width 832
---
Full Pipeline: RGB Video → Stylized Output
bash
# Step 1: Extract G-buffers from RGB video (Inverse Renderer env)
conda activate inverse_renderer
cd inverse_renderer
python run_inverse.py \
  --input path/to/gameplay_video.mp4 \
  --output_dir ../game_editing/test_dataset/

# Step 2: Apply game editing style (Game Editing env)
conda activate game_editing
cd ../game_editing
CUDA_VISIBLE_DEVICES=0 python \
  examples/wanvideo/model_inference/inference_gbuffer_caption.py \
  --checkpoint models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors \
  --gpu 0 \
  --style snowy_winter \
  --prompt "frozen tundra with blizzard conditions, pale blue-white lighting and drifting snow" \
  --gbuffer_dir test_dataset \
  --save_dir outputs/final_render \
  --num_frames 81 --height 480 --width 832
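Because the two stages live in different conda environments, a single driver script has to switch environments between them. `conda activate` generally expects an initialized interactive shell, so the sketch below uses `conda run -n <env>` instead. It is a minimal example under stated assumptions: `pipeline_steps` is a hypothetical helper, and `run_inverse.py` with its `--input` / `--output_dir` flags is taken from the typical pattern shown in this document rather than verified against the repository.

```python
import shlex

SCRIPT = "examples/wanvideo/model_inference/inference_gbuffer_caption.py"
CHECKPOINT = "models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors"


def pipeline_steps(rgb_video, style, prompt, gbuffer_dir="test_dataset",
                   save_dir="outputs/final_render"):
    """Return (working_dir, argv) pairs for the two stages, in execution order."""
    extract = ("inverse_renderer", [
        "conda", "run", "-n", "inverse_renderer",
        "python", "run_inverse.py",
        "--input", rgb_video,  # resolved relative to inverse_renderer/
        "--output_dir", f"../game_editing/{gbuffer_dir}/",
    ])
    stylize = ("game_editing", [
        "conda", "run", "-n", "game_editing",
        "python", SCRIPT,
        "--checkpoint", CHECKPOINT,
        "--gpu", "0",
        "--style", style,
        "--prompt", prompt,
        "--gbuffer_dir", gbuffer_dir,
        "--save_dir", save_dir,
        "--num_frames", "81", "--height", "480", "--width", "832",
    ])
    return [extract, stylize]


steps = pipeline_steps(
    rgb_video="path/to/gameplay_video.mp4",
    style="snowy_winter",
    prompt="frozen tundra with blizzard conditions, pale blue-white lighting and drifting snow",
)
for workdir, argv in steps:
    print(f"[{workdir}] {shlex.join(argv)}")
    # To execute for real: subprocess.run(argv, cwd=workdir, check=True)
```

Returning the commands instead of running them keeps the sketch side-effect free; with `check=True` in `subprocess.run`, a failure in the extraction stage would stop the script before stylization starts.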
---
Online Demos
| Demo | URL |
|---|---|
| Game Editing Demo | https://huggingface.co/spaces/Brian9999/game-editing |
| Project Page | https://alaya-studio.github.io/renderer/ |
Dataset Overview
The AlayaRenderer dataset (release pending) features:
- 4M+ frames at 720p / 30 FPS
- 6 synchronized channels: RGB + albedo, normal, depth, metallic, roughness
- 40 hours from Cyberpunk 2077 and Black Myth: Wukong
- Average clip length: 8 minutes, up to 53 minutes continuous
- Weather variants: sunny, rainy, foggy, night, sunset
- Motion blur variant via sub-frame interpolation
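The headline figures are consistent with each other, as a quick calculation shows. The clip count below is a derived estimate from the stated average length, not a number given by the source, and it ignores how weather and motion-blur variants are counted.

```python
HOURS = 40            # total footage, from the list above
FPS = 30              # frame rate, from the list above
AVG_CLIP_MINUTES = 8  # average clip length, from the list above

total_frames = HOURS * 3600 * FPS
approx_clips = HOURS * 60 // AVG_CLIP_MINUTES

print(f"{total_frames:,} frames")  # 4,320,000 frames, matching "4M+"
print(f"~{approx_clips} clips")    # ~300 clips (estimate)
```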
Architecture Summary
RGB Video Input
                  │
                  ▼
┌─────────────────────────────────────┐
│         Inverse Renderer            │
│  (Cosmos-Transfer1 7B fine-tuned)   │
│  RGB → [albedo, normal, depth,      │
│         roughness, metallic]        │
└─────────────────┬───────────────────┘
                  │ G-buffers
                  ▼
┌─────────────────────────────────────┐
│           Game Editing              │
│    (Wan2.1 1.3B fine-tuned)         │
│  G-buffers + Text Prompt            │
│  → Stylized RGB Video               │
└─────────────────────────────────────┘
Troubleshooting
Submodule not found / DiffSynth-Studio missing
bash
# If cloned without --recurse-submodules:
git submodule update --init --recursive
CUDA Out of Memory
- Reduce `--num_frames` (try 41 instead of 81)
- Reduce resolution: `--height 320 --width 576`
- Ensure no other processes are using the GPU, and pin the run to a free device: `CUDA_VISIBLE_DEVICES=0`
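To get a feel for how much each suggestion shrinks the workload, the raw pixel volume (frames × height × width) can be compared against the default 81 × 480 × 832 setting. This is only a rough proxy under a simplifying assumption: actual GPU memory also depends on fixed model weights and on model internals, so the real savings will differ.

```python
def pixel_volume(num_frames, height, width):
    """Frames x height x width: a crude proxy for per-run activation size."""
    return num_frames * height * width


default = pixel_volume(81, 480, 832)
fewer_frames = pixel_volume(41, 480, 832)
lower_res = pixel_volume(81, 320, 576)
both = pixel_volume(41, 320, 576)

for label, volume in [("41 frames", fewer_frames),
                      ("320x576", lower_res),
                      ("both", both)]:
    print(f"{label}: {volume / default:.2f} of the default pixel volume")
# 41 frames: 0.51, 320x576: 0.46, both: 0.23
```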
num_frames must follow the 8n+1 pattern
Valid values: 9, 17, 25, 33, 41, 49, 57, 65, 73, 81
bash
# Valid
--num_frames 81   # 8*10 + 1 ✓
--num_frames 41   # 8*5 + 1 ✓
# Invalid
--num_frames 80   # ✗
--num_frames 60   # ✗
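The 8n+1 rule is easy to check or enforce in a wrapper script. The helpers below are a minimal sketch with hypothetical names; the bounds of 9 and 81 are assumptions taken from the smallest and largest values listed in this section, not documented limits of the model.

```python
def is_valid_num_frames(n):
    """True when n has the form 8k+1 with k >= 1 (9, 17, 25, ...)."""
    return n >= 9 and n % 8 == 1


def round_down_num_frames(n, maximum=81):
    """Largest 8k+1 value not above n, after clamping n to [9, maximum]."""
    n = max(9, min(n, maximum))
    return (n - 1) // 8 * 8 + 1


for requested in (80, 60, 41):
    print(requested, "->", round_down_num_frames(requested))
# 80 -> 73
# 60 -> 57
# 41 -> 41
```

Rounding down rather than up keeps the adjusted value within whatever memory budget the original request was chosen for.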
Checkpoint not found
bash
# Verify checkpoint placement
ls game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors
ls inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B/
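The same placement check can run at the start of a Python driver script so that a missing checkpoint fails early with a clear message. This is a minimal sketch, assuming it is run from the repository root; `EXPECTED` and `missing_checkpoints` are hypothetical names, and the paths are the two locations listed in this section.

```python
from pathlib import Path

# Expected locations, relative to the repository root.
EXPECTED = {
    "Game Editing": Path("game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors"),
    "Inverse Renderer": Path("inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B"),
}


def missing_checkpoints(repo_root="."):
    """Return the names of models whose checkpoint path does not exist."""
    root = Path(repo_root)
    return [name for name, rel in EXPECTED.items() if not (root / rel).exists()]


for name in missing_checkpoints():
    print(f"checkpoint missing for {name}: {EXPECTED[name]}")
```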
Version conflicts between models
Always use the two separate conda environments (`inverse_renderer` and `game_editing`). Do not install both models' dependencies in one environment.
Citation
bibtex
@article{huang2026generativeworldrenderer,
  title={Generative World Renderer},
  author={Zheng-Hui Huang and Zhixiang Wang and Jiaming Tan and Ruihan Yu and Yidan Zhang and Bo Zheng and Yu-Lun Liu and Yung-Yu Chuang and Kaipeng Zhang},
  journal={arXiv preprint arXiv:2604.02329},
  year={2026}
}