publish-models
Docs
- Cog reference: https://cog.run/llms.txt
- `cog push` reference: https://cog.run/cli#cog-push
- cog-safe-push: https://github.com/replicate/cog-safe-push
- Model CI template: https://github.com/replicate/model-ci-template
- Continuous deployment guide: https://replicate.com/docs/guides/continuous-model-deployment
When to use this skill
- You have a working Cog project (see build-models if you don't yet).
- You want to publish a private or public model on Replicate.
- You're releasing a new version of an existing model and want to avoid breaking changes.
- You're setting up CI/CD for model releases.
Prerequisites
- Cog installed and logged in against r8.im (`cog login`, or `echo $TOKEN | cog login --token-stdin`).
- A model created at replicate.com/{owner}/{name} via the API, web UI, or the r8-model CLI.
- `REPLICATE_API_TOKEN` set in your environment.
Plain cog push
The simplest path. Build and upload a new version:

```bash
cog push r8.im/owner/my-model
```

Or set `image: r8.im/owner/my-model` in `cog.yaml` and run a bare `cog push`. Useful flags:

- `--separate-weights` — store weights in a separate layer; faster cold boots and pushes for models with > 1GB of weights.
- `--x-fast` — faster pushes during iteration (skips some validation).
- `--secret id=hf,src=$HOME/.hf_token` — pass build-time secrets without baking them into image history.
cog-safe-push (recommended for any model with users)
Install:

```bash
pip install git+https://github.com/replicate/cog-safe-push.git
```

Required env vars:

- `REPLICATE_API_TOKEN`
- `ANTHROPIC_API_KEY` (Claude judges output similarity for stochastic models)
Basic usage:
```bash
cog-safe-push --test-hardware=gpu-l40s owner/my-model
```

This will:

- Lint `predict.py` with ruff.
- Create a private test model `owner/my-model-test` if missing.
- Push the local Cog model to the test model.
- Lint the schema (descriptions, defaults, etc.).
- Check schema compatibility against the live `owner/my-model` version.
- Run prediction comparisons between live and test versions.
- Fuzz the test model with AI-generated inputs.
- If everything passes, push to `owner/my-model`.
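The schema-compatibility check is the step that most often blocks a push. The core rule: adding an optional input is safe; removing an input, or adding a required one without a default, breaks existing callers. A minimal sketch of that rule in Python (illustrative only, not cog-safe-push's actual implementation; `breaking_changes` and the schema shape are assumptions):

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """Compare two input schemas of shape {name: {"required": bool}}.
    Return human-readable reasons the new schema would break old callers."""
    problems = []
    for name in old:
        if name not in new:
            problems.append(f"input '{name}' was removed")
    for name, spec in new.items():
        if name not in old and spec.get("required", False):
            problems.append(f"new input '{name}' is required but has no default")
    return problems

old = {"prompt": {"required": True}, "seed": {"required": False}}
ok_new = {**old, "guidance": {"required": False}}  # new optional input: compatible
bad_new = {"prompt": {"required": True}, "steps": {"required": True}}  # drops seed, adds required steps

assert breaking_changes(old, ok_new) == []
assert len(breaking_changes(old, bad_new)) == 2
```

The real tool works from the generated OpenAPI schema, so it also catches subtler breaks (type changes, tightened enums), but the add-optional/never-remove intuition above covers most cases.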
cog-safe-push.yaml schema
Drop a `cog-safe-push.yaml` in your project root (or `cog-safe-push-configs/<variant>.yaml` for multi-model repos). All five test-case checker types in one example:

```yaml
model: owner/my-model
test_model: owner/my-model-test
test_hardware: gpu-l40s
predict:
  compare_outputs: false # set false for stochastic models
  predict_timeout: 600
  test_cases:
    - inputs:
        prompt: "a serene mountain landscape"
      match_prompt: "a landscape photo of mountains" # AI-judged via Claude
    - inputs:
        prompt: "a cat"
      match_url: "https://example.com/reference-cat.png" # binary/image match
    - inputs:
        prompt: ""
      error_contains: "prompt cannot be empty" # negative test
    - inputs:
        mode: "json"
      jq_query: '.confidence > 0.8 and .status == "success"' # JSON output
    - inputs:
        prompt: "echo this"
      exact_string: "echo this" # exact string match
  fuzz:
    fixed_inputs:
      seed: 42
    disabled_inputs:
      - debug
    iterations: 10
    prompt: "Generate creative and diverse prompts"
train: # if your model has a trainer
  destination: owner/my-model-trained
  destination_hardware: gpu-l40s
  train_timeout: 1800
  test_cases:
    - inputs:
        input_images: "https://.../training.zip"
        steps: 10
deployment: # auto-create or update on push
  name: my-model
  owner: owner
  hardware: gpu-l40s
parallel: 4
fast_push: false
ignore_schema_compatibility: false
official_model: owner/my-model # for proxy/wrapper models, see below
```

Test case checkers are mutually exclusive: pick exactly one of `match_prompt`, `match_url`, `error_contains`, `jq_query`, or `exact_string` per case. Use `compare_outputs: false` for any stochastic model (diffusion, LLMs); the default `true` is brittle.
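Because the five checkers are mutually exclusive, a config with two checkers on one case fails in CI rather than locally. A small sketch of the exclusivity rule, handy for linting configs before pushing (hypothetical helper, not part of cog-safe-push):

```python
CHECKERS = {"match_prompt", "match_url", "error_contains", "jq_query", "exact_string"}

def validate_test_case(case: dict) -> None:
    """Each test case needs an 'inputs' key plus exactly one checker key."""
    if "inputs" not in case:
        raise ValueError("test case missing 'inputs'")
    chosen = CHECKERS & case.keys()
    if len(chosen) != 1:
        raise ValueError(f"need exactly one checker, got {sorted(chosen)}")

validate_test_case({"inputs": {"prompt": "a cat"}, "match_prompt": "a photo of a cat"})  # ok
try:
    validate_test_case({"inputs": {}, "exact_string": "x", "jq_query": ".y"})
except ValueError as e:
    print(e)  # need exactly one checker, got ['exact_string', 'jq_query']
```

Running something like this over every case in `cog-safe-push-configs/` turns a slow CI failure into an instant local one.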
CI/CD: GitHub Actions
Two paths, depending on how much glue you want.
Path A: roll your own
`.github/workflows/push.yaml`:

```yaml
name: Push to Replicate
on:
  workflow_dispatch:
    inputs:
      no_push:
        type: boolean
        default: false
jobs:
  push:
    runs-on: ubuntu-latest-4-cores # builds need disk + cores
    steps:
      - uses: actions/checkout@v4
      - uses: jlumbroso/free-disk-space@v1.3.1
        with:
          tool-cache: false
          docker-images: false
      - uses: replicate/setup-cog@v2
        with:
          token: ${{ secrets.REPLICATE_API_TOKEN }}
      - run: pip install git+https://github.com/replicate/cog-safe-push.git
      - env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          REPLICATE_API_TOKEN: ${{ secrets.REPLICATE_API_TOKEN }}
        run: |
          cog-safe-push -vv ${{ inputs.no_push && '--no-push' || '' }}
```

Add a `concurrency:` block so PR builds cancel each other while main-branch pushes queue:

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
```

Path B: reusable workflow from model-ci-template
For Replicate-style multi-model repos, drop in:
`.github/workflows/ci.yaml`:

```yaml
name: CI
on:
  pull_request: { branches: [main] }
  push: { branches: [main] }
  workflow_dispatch:
    inputs:
      models: { type: string, default: "all" }
      ignore_schema_checks: { type: boolean, default: false }
      cog_version: { type: string, default: "latest" }
      test_only: { type: boolean, default: false }
jobs:
  ci:
    uses: replicate/model-ci-template/.github/workflows/template.yaml@main
    with:
      trigger_type: ${{ github.event_name }}
      models: ${{ inputs.models || 'all' }}
      ignore_schema_checks: ${{ inputs.ignore_schema_checks || false }}
      cog_version: ${{ inputs.cog_version || 'latest' }}
      test_only: ${{ inputs.test_only || false }}
    secrets: inherit
```

The reusable workflow expects:

- `cog-safe-push-configs/<model>.yaml` — one per model variant.
- `script/select-model` — bash file with `if/elif [[ "$MODEL" == "..." ]]` blocks listing valid model names.
- Secrets: `COG_TOKEN`, `REPLICATE_API_TOKEN`, `ANTHROPIC_API_KEY`.

Multi-model matrix pushes
Pattern from replicate/cog-flux: one repo, N variants, push them in parallel.

```yaml
jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set.outputs.matrix }}
    steps:
      - id: set
        run: |
          if [ "${{ inputs.models }}" = "all" ]; then
            echo 'matrix={"model":["schnell","dev","krea-dev"]}' >> "$GITHUB_OUTPUT"
          else
            list=$(echo "${{ inputs.models }}" | jq -Rc 'split(",")')
            echo "matrix={\"model\":$list}" >> "$GITHUB_OUTPUT"
          fi
  push:
    needs: prepare
    runs-on: ubuntu-latest-4-cores
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.prepare.outputs.matrix) }}
    steps:
      - uses: actions/checkout@v4
      - run: ./script/select.sh ${{ matrix.model }} # produces cog.yaml from a template
      - run: cog-safe-push --config cog-safe-push-configs/${{ matrix.model }}.yaml -vv
```
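The prepare job is just string-to-JSON glue: "all" expands to the full variant list, anything else is treated as a comma-separated subset. The same logic in Python terms (`build_matrix` and `ALL_MODELS` are illustrative names, not part of any workflow API):

```python
import json

ALL_MODELS = ["schnell", "dev", "krea-dev"]

def build_matrix(models_input: str) -> str:
    """Mirror the prepare step: 'all' -> every variant, else split a comma list."""
    models = ALL_MODELS if models_input == "all" else models_input.split(",")
    return json.dumps({"model": models})

assert build_matrix("all") == '{"model": ["schnell", "dev", "krea-dev"]}'
assert build_matrix("dev,schnell") == '{"model": ["dev", "schnell"]}'
```

Keeping the expansion in one `prepare` job means the `push` job's matrix is identical whether the trigger was manual (with a subset) or a push to main (all variants).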
Two-pass push for proxy / official models
When you maintain a proxy that wraps a third-party API, you push to a private wrapper first, then update the public-facing official model card. Pattern from replicate/cog-official-template:

```bash
./script/write-api-key   # bake API key into config
cog-safe-push --config cog-safe-push-configs/${MODEL}.yaml -vv
./script/delete-api-key  # strip the key
cog-safe-push --push-official-model --config cog-safe-push-configs/${MODEL}.yaml -vv
```

Set `official_model: owner/name` in the config so `--push-official-model` knows where to publish.

Deployments
Add a `deployment` block to `cog-safe-push.yaml` to create or update a Replicate deployment automatically on each push:

```yaml
deployment:
  name: my-model
  owner: owner
  hardware: gpu-l40s
```

Scaling defaults: CPU deployments scale 1-20 instances, GPU deployments scale 0-2. Adjust manually via the API or web UI when needed.
Monitoring published models
Run an hourly canary that exercises the registry path. Pattern from replicate/cog-pagerduty-check:

```yaml
name: Hourly cog push check
on:
  schedule:
    - cron: "0 * * * *"
  workflow_dispatch:
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - run: |
          # generate a tiny model with a unique uuid, push it, run a prediction
          # by digest, fail loudly if anything breaks.
          ./script/canary.sh
```

Worth doing for any production-critical model, especially when revenue depends on the registry being up.
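The key property of the canary is that every run pushes under a fresh, unique name, so a failure can only mean the push path itself broke, never a collision with earlier state. The naming step, sketched (the prefix and helper name are hypothetical; the push and prediction themselves need real credentials and live in the script):

```python
import uuid

def canary_model_name(prefix: str = "cog-canary") -> str:
    """Unique per run, so hourly checks never collide with earlier pushes."""
    return f"{prefix}-{uuid.uuid4().hex[:12]}"

name = canary_model_name()
assert name.startswith("cog-canary-") and len(name) == len("cog-canary-") + 12
```

Remember to clean up (or let a scheduled job delete) old canary models, or the hourly cadence will accumulate hundreds of throwaway models per month.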
Guidelines
- Don't break schema compatibility unless you mean to. cog-safe-push catches it; `--ignore-schema-compatibility` is the opt-out.
- Pin `test_hardware` so test pushes are reproducible.
- Use `--no-push` for dry runs in PR CI; full push on merge to main or on version tags.
- Push from CI rather than laptops once you have users.
- Use `compare_outputs: false` for stochastic models. Use `match_prompt:` for image/video outputs (VLM judgment), `match_url:` for binary outputs you control, `jq_query:` for JSON, `error_contains:` for negative tests.
- Never commit `REPLICATE_API_TOKEN` or `ANTHROPIC_API_KEY`. Use repo secrets.
- For models with weights > 1GB, push with `--separate-weights`.
Production references
- https://github.com/replicate/cog-safe-push — the tool itself, plus its config schema.
- https://github.com/replicate/model-ci-template — reusable GitHub Actions workflow.
- https://github.com/replicate/cog-official-template — proxy/official model template.
- https://github.com/replicate/cog-flux/blob/main/.github/workflows/push.yaml — matrix push across FLUX variants.
- https://github.com/replicate/cog-comfyui/blob/main/.github/workflows/ci.yaml — ComfyUI model CI with custom-node install step.
- https://github.com/replicate/cog-pagerduty-check — hourly canary pattern.