# RunPod Cloud GPU
Run open-source AI models on cloud GPUs via RunPod serverless. Pay-per-second, no minimums.
## Setup

```bash
# 1. Create account at https://runpod.io

# 2. Add API key to .env
echo "RUNPOD_API_KEY=your_key_here" >> .env

# 3. Deploy any tool with --setup
python tools/image_edit.py --setup
python tools/upscale.py --setup
python tools/dewatermark.py --setup
python tools/sadtalker.py --setup
python tools/qwen3_tts.py --setup
```
Each `--setup` command:

1. Creates a RunPod **template** from the Docker image
2. Creates a serverless **endpoint** with an appropriate GPU
3. Saves the endpoint ID to `.env` (e.g. `RUNPOD_QWEN_EDIT_ENDPOINT_ID`)

## Available Images
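After a successful `--setup`, `.env` holds the API key plus one endpoint ID per deployed tool. A hypothetical sketch of the result (only `RUNPOD_QWEN_EDIT_ENDPOINT_ID` is named in this guide; the ID value is a placeholder):

```
RUNPOD_API_KEY=your_key_here
RUNPOD_QWEN_EDIT_ENDPOINT_ID=your_endpoint_id
```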
All images are public on GHCR — no authentication needed.
| Tool | Docker Image | GPU | VRAM | Typical Cost |
|---|---|---|---|---|
| image_edit | | A6000/L40S | 48GB+ | ~$0.05-0.15/job |
| upscale | | RTX 3090/4090 | 24GB | ~$0.01-0.05/job |
| dewatermark | | RTX 3090/4090 | 24GB | ~$0.05-0.30/job |
| sadtalker | | RTX 4090 | 24GB | ~$0.05-0.15/job |
| qwen3_tts | | ADA 24GB | 24GB | ~$0.01-0.05/job |
Total monthly cost: Rarely exceeds $10 even with heavy use.
## How It Works
All tools follow the same pattern:

```
Local CLI → Upload input to cloud storage → RunPod API → Poll for result → Download output
```

- File transfer: Tools use Cloudflare R2 when configured (`R2_BUCKET_NAME`, `R2_ACCOUNT_ID`, `R2_ACCESS_KEY_ID`, `R2_SECRET_ACCESS_KEY`), falling back to free upload services
- RunPod API: Tools call the `/run` endpoint, then poll `/status/{job_id}` until complete
- Cold vs warm start: First request after idle spins up a worker (~30-90s). Subsequent requests are fast (~5-15s)
## Endpoint Management

### Workers
- `workersMin: 0` — Scale to zero when idle (no cost)
- `workersMax: 1` — Max concurrent jobs (increase for throughput)
- `idleTimeout: 5` — Seconds before a worker scales down

Across all endpoints, you share a total worker pool based on your RunPod plan. If you hit limits, reduce `workersMax` on endpoints you're not actively using.

### Checking Endpoint Status
Each tool stores its endpoint ID in `.env`:

| Tool | Env Var |
|---|---|
| image_edit | |
| upscale | |
| dewatermark | |
| sadtalker | |
| qwen3_tts | |
### Disabling an Endpoint

To free worker slots without deleting the endpoint, set `workersMax=0` via the RunPod dashboard or GraphQL API.
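A sketch of doing this programmatically against RunPod's GraphQL API. The `saveEndpoint` mutation name and its input fields here are assumptions based on RunPod's GraphQL schema; verify them against the current schema before relying on this:

```python
import json
import urllib.request


def build_scale_mutation(endpoint_id: str, workers_max: int) -> dict:
    """Build a GraphQL payload that sets workersMax on one endpoint.

    NOTE: the mutation name and input type are assumed, not confirmed here.
    """
    query = """
    mutation SetWorkersMax($input: EndpointInput!) {
      saveEndpoint(input: $input) { id workersMax }
    }
    """
    return {
        "query": query,
        "variables": {"input": {"id": endpoint_id, "workersMax": workers_max}},
    }


def set_workers_max(endpoint_id: str, workers_max: int, api_key: str) -> dict:
    """POST the mutation to RunPod's GraphQL endpoint."""
    payload = build_scale_mutation(endpoint_id, workers_max)
    req = urllib.request.Request(
        f"https://api.runpod.io/graphql?api_key={api_key}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Setting `workers_max` back to a positive number re-enables the endpoint without redeploying.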
## Troubleshooting

### Force Image Pull
When you push a new Docker image version, RunPod may still use the cached old one. To force a pull:
- Update the template's `imageName` to use `@sha256:DIGEST` notation
- Wait for the worker to restart
- Revert to the `:latest` tag after confirming
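One way to find the digest for the pinned `@sha256:DIGEST` form is to ask the GHCR registry directly. A sketch using the anonymous token flow (which works here because the packages are public); the image name is an example substitution of one tool name into the `video-toolkit-<name>` pattern, and the header names follow the Docker/OCI registry HTTP API:

```python
import json
import urllib.request


def ghcr_refs(image: str, tag: str = "latest") -> tuple[str, str]:
    """Return (token URL, manifest URL) for a public GHCR image,
    e.g. image = "conalmullan/video-toolkit-upscale"."""
    token_url = f"https://ghcr.io/token?scope=repository:{image}:pull"
    manifest_url = f"https://ghcr.io/v2/{image}/manifests/{tag}"
    return token_url, manifest_url


def image_digest(image: str, tag: str = "latest") -> str:
    """Fetch the Docker-Content-Digest header for image:tag."""
    token_url, manifest_url = ghcr_refs(image, tag)
    with urllib.request.urlopen(token_url) as resp:
        token = json.load(resp)["token"]
    req = urllib.request.Request(
        manifest_url,
        method="HEAD",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.oci.image.index.v1+json, "
                      "application/vnd.docker.distribution.manifest.v2+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.headers["Docker-Content-Digest"]
```

`docker buildx imagetools inspect <image>:latest` prints the same digest if you prefer the CLI.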
### Cold Start Too Slow
- qwen3-tts: ~70s cold start, ~7s warm
- sadtalker: ~60s cold start, ~10s warm
- image_edit: ~90s cold start, ~15s warm
If cold starts are a problem, set `workersMin: 1` (costs money when idle).

### Job Fails with OOM
The model needs more VRAM than the GPU provides. Options:

- Use a larger GPU tier
- For dewatermark: reduce `--resize-ratio` (default 0.5 for safety)
- For image_edit: reduce `--steps`
### "No workers available"
You've hit your plan's concurrent worker limit. Either:

- Wait for a running job to finish
- Set `workersMax=0` on endpoints you're not using
- Upgrade your RunPod plan
## Docker Images
All Dockerfiles live in `docker/runpod-*/`. Images use `runpod/pytorch` as the base to share layers across tools.

Building for RunPod (from an Apple Silicon Mac):

```bash
docker buildx build --platform linux/amd64 -t ghcr.io/conalmullan/video-toolkit-<name>:latest docker/runpod-<name>/
docker push ghcr.io/conalmullan/video-toolkit-<name>:latest
```

GHCR packages default to private — you must manually make them public for RunPod to pull them. Go to GitHub > Packages > Package Settings > Change Visibility.
## Cost Optimization
- Keep `workersMin: 0` on all endpoints (scale to zero)
- Only deploy endpoints you actively need
- Use `workersMax=0` to disable idle endpoints without deleting them
- Qwen3-TTS is significantly cheaper than ElevenLabs for voiceovers
- Check the RunPod dashboard for usage and billing