RunPod Cloud GPU

Run open-source AI models on cloud GPUs via RunPod serverless. Pay-per-second, no minimums.

Setup

```bash
# 1. Create account at https://runpod.io

# 2. Add API key to .env
echo "RUNPOD_API_KEY=your_key_here" >> .env

# 3. Deploy any tool with --setup
python tools/image_edit.py --setup
python tools/upscale.py --setup
python tools/dewatermark.py --setup
python tools/sadtalker.py --setup
python tools/qwen3_tts.py --setup
```

Each `--setup` command:
1. Creates a RunPod **template** from the Docker image
2. Creates a serverless **endpoint** with an appropriate GPU
3. Saves the endpoint ID to `.env` (e.g. `RUNPOD_QWEN_EDIT_ENDPOINT_ID`)
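As a concrete illustration, `.env` after the full setup might look like the following. The endpoint ID values are placeholders; yours will be whatever IDs RunPod returns.

```bash
# .env (illustrative): created and extended by the setup commands above
RUNPOD_API_KEY=your_key_here
RUNPOD_QWEN_EDIT_ENDPOINT_ID=abc123placeholder
RUNPOD_UPSCALE_ENDPOINT_ID=def456placeholder
RUNPOD_DEWATERMARK_ENDPOINT_ID=ghi789placeholder
RUNPOD_SADTALKER_ENDPOINT_ID=jkl012placeholder
RUNPOD_QWEN3_TTS_ENDPOINT_ID=mno345placeholder
```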

Available Images

All images are public on GHCR — no authentication needed.
| Tool | Docker Image | GPU | VRAM | Typical Cost |
| --- | --- | --- | --- | --- |
| image_edit | `ghcr.io/conalmullan/video-toolkit-qwen-edit:latest` | A6000/L40S | 48GB+ | ~$0.05-0.15/job |
| upscale | `ghcr.io/conalmullan/video-toolkit-realesrgan:latest` | RTX 3090/4090 | 24GB | ~$0.01-0.05/job |
| dewatermark | `ghcr.io/conalmullan/video-toolkit-propainter:latest` | RTX 3090/4090 | 24GB | ~$0.05-0.30/job |
| sadtalker | `ghcr.io/conalmullan/video-toolkit-sadtalker:latest` | RTX 4090 | 24GB | ~$0.05-0.15/job |
| qwen3_tts | `ghcr.io/conalmullan/video-toolkit-qwen3-tts:latest` | ADA 24GB | 24GB | ~$0.01-0.05/job |

Total monthly cost: Rarely exceeds $10 even with heavy use.

How It Works

All tools follow the same pattern:
Local CLI → Upload input to cloud storage → RunPod API → Poll for result → Download output
  1. File transfer: Tools use Cloudflare R2 when configured (`R2_ACCOUNT_ID`, `R2_ACCESS_KEY_ID`, `R2_SECRET_ACCESS_KEY`, `R2_BUCKET_NAME`), falling back to free upload services (sketched below)
  2. RunPod API: Tools call the `/run` endpoint, then poll `/status/{job_id}` until complete (sketched below)
  3. Cold vs warm start: First request after idle spins up a worker (~30-90s). Subsequent requests are fast (~5-15s)
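The file-transfer step is equivalent to an S3-style upload against the R2 endpoint. A minimal sketch using the AWS CLI, assuming the R2 variables from `.env` are exported; the bucket key and filenames are illustrative, and the tools do the equivalent internally:

```bash
# Point the AWS CLI at Cloudflare R2 using the credentials from .env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_ACCESS_KEY"

# Upload the input file (bucket key and file name are illustrative)
aws s3 cp input.mp4 "s3://${R2_BUCKET_NAME}/inputs/input.mp4" \
  --endpoint-url "https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com"

# Generate a time-limited URL the worker can download the input from
aws s3 presign "s3://${R2_BUCKET_NAME}/inputs/input.mp4" --expires-in 3600 \
  --endpoint-url "https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com"
```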
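The RunPod API step looks roughly like this with curl. The endpoint variable and the input payload are illustrative; each tool builds its own payload for its worker.

```bash
ENDPOINT_ID="$RUNPOD_UPSCALE_ENDPOINT_ID"   # any endpoint ID from .env

# Submit the job and capture the job ID
JOB_ID=$(curl -s -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/run" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"input": {"image_url": "https://example.com/input.png"}}' | jq -r '.id')

# Poll until the job leaves IN_QUEUE / IN_PROGRESS
while true; do
  RESPONSE=$(curl -s "https://api.runpod.ai/v2/${ENDPOINT_ID}/status/${JOB_ID}" \
    -H "Authorization: Bearer ${RUNPOD_API_KEY}")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  case "$STATUS" in
    COMPLETED|FAILED|CANCELLED|TIMED_OUT) break ;;
  esac
  sleep 5
done

# On success, the tool downloads whatever the worker returned in .output
echo "$RESPONSE" | jq '.output'
```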

Endpoint Management

Workers

workersMin: 0    — Scale to zero when idle (no cost)
workersMax: 1    — Max concurrent jobs (increase for throughput)
idleTimeout: 5   — Seconds before worker scales down
Across all endpoints, you share a total worker pool based on your RunPod plan. If you hit limits, reduce `workersMax` on endpoints you're not actively using.

Checking Endpoint Status

Each tool stores its endpoint ID in `.env`:

| Tool | Env Var |
| --- | --- |
| image_edit | `RUNPOD_QWEN_EDIT_ENDPOINT_ID` |
| upscale | `RUNPOD_UPSCALE_ENDPOINT_ID` |
| dewatermark | `RUNPOD_DEWATERMARK_ENDPOINT_ID` |
| sadtalker | `RUNPOD_SADTALKER_ENDPOINT_ID` |
| qwen3_tts | `RUNPOD_QWEN3_TTS_ENDPOINT_ID` |
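To see whether an endpoint currently has workers and queued jobs, you can query its `/health` route; a minimal sketch, using the upscale endpoint ID as an example:

```bash
# Substitute any endpoint ID variable from the table above
curl -s "https://api.runpod.ai/v2/${RUNPOD_UPSCALE_ENDPOINT_ID}/health" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}" | jq
```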

Disabling an Endpoint

To free worker slots without deleting the endpoint, set `workersMax=0` via the RunPod dashboard or GraphQL API.
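A rough sketch of doing this from the command line via the GraphQL API. The `saveEndpoint` mutation and its input fields are assumptions about RunPod's GraphQL schema; verify against the current API reference, or just use the dashboard.

```bash
# Assumed mutation and fields; the endpoint ID value is a placeholder
curl -s "https://api.runpod.io/graphql?api_key=${RUNPOD_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"query":"mutation { saveEndpoint(input: {id: \"YOUR_ENDPOINT_ID\", workersMax: 0}) { id workersMax } }"}'
```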

Troubleshooting

Force Image Pull

When you push a new Docker image version, RunPod may still use the cached old one. To force a pull:
  1. Update the template's `imageName` to use `@sha256:DIGEST` notation
  2. Wait for the worker to restart
  3. Revert to the `:latest` tag after confirming
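One way to look up the digest, as a sketch; this assumes Docker Buildx is installed and the qwen-edit image is just an example:

```bash
# Print the manifest digest of the pushed image
docker buildx imagetools inspect ghcr.io/conalmullan/video-toolkit-qwen-edit:latest

# Then set the template's imageName to the pinned form, e.g.
# ghcr.io/conalmullan/video-toolkit-qwen-edit@sha256:<digest printed above>
```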

Cold Start Too Slow

  • qwen3-tts: ~70s cold start, ~7s warm
  • sadtalker: ~60s cold start, ~10s warm
  • image_edit: ~90s cold start, ~15s warm
If cold starts are a problem, set `workersMin: 1` (costs money when idle).

Job Fails with OOM

The model needs more VRAM than the GPU provides. Options:
  • Use a larger GPU tier
  • For dewatermark: reduce `--resize-ratio` (default 0.5 for safety)
  • For image_edit: reduce `--steps`
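For example, re-running with lower settings might look like this. The input-file arguments and the specific values are illustrative assumptions; check each tool's `--help` for its exact usage and defaults.

```bash
python tools/dewatermark.py input.mp4 --resize-ratio 0.4   # input path is illustrative
python tools/image_edit.py input.png --steps 20            # input path and value are illustrative
```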

"No workers available"

"无可用工作节点"

You've hit your plan's concurrent worker limit. Either:
  • Wait for a running job to finish
  • Set `workersMax=0` on endpoints you're not using
  • Upgrade your RunPod plan

Docker Images

All Dockerfiles live in `docker/runpod-*/`. Images use `runpod/pytorch` as the base to share layers across tools.

Building for RunPod (from an Apple Silicon Mac):

```bash
docker buildx build --platform linux/amd64 -t ghcr.io/conalmullan/video-toolkit-<name>:latest docker/runpod-<name>/
docker push ghcr.io/conalmullan/video-toolkit-<name>:latest
```

GHCR packages default to private — you must manually make them public for RunPod to pull them. Go to GitHub > Packages > Package Settings > Change Visibility.

Cost Optimization

  • Keep `workersMin: 0` on all endpoints (scale to zero)
  • Only deploy endpoints you actively need
  • Use `workersMax=0` to disable idle endpoints without deleting them
  • Qwen3-TTS is significantly cheaper than ElevenLabs for voiceovers
  • Check the RunPod dashboard for usage and billing