Docs
- Reference docs: https://replicate.com/docs/llms.txt
- HTTP API schema: https://api.replicate.com/openapi.json
- MCP server: https://mcp.replicate.com
- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.
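As a minimal sketch using only the standard library, a Markdown docs request can be built like this:

```python
import urllib.request

def docs_request(url: str) -> urllib.request.Request:
    """Build a request for a Replicate docs page, asking for a Markdown response."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})

# Usage: urllib.request.urlopen(docs_request("https://replicate.com/docs"))
```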
Workflow
Here's a common workflow for using Replicate's API to run a model:
- Choose the right model - Search with the API or ask the user
- Get model metadata - Fetch model input and output schema via API
- Create prediction - POST to /v1/predictions
- Poll for results - GET prediction until status is "succeeded"
- Return output - Usually URLs to generated content
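The create-and-poll steps above can be sketched with the standard library alone. The polling helper takes a `fetch` callable so the loop is independent of HTTP details; the version string and input names in any real call are placeholders you must look up per model:

```python
import json
import time
import urllib.request

API = "https://api.replicate.com/v1"

def create_prediction(token, version, model_input):
    """POST /v1/predictions and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API}/predictions",
        data=json.dumps({"version": version, "input": model_input}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def poll_until_done(fetch, poll_interval=1.0, timeout=300.0):
    """Call fetch() until the prediction reaches a terminal status.

    fetch should GET /v1/predictions/{id} and return the parsed JSON;
    terminal statuses are "succeeded", "failed", and "canceled".
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        prediction = fetch()
        if prediction["status"] in ("succeeded", "failed", "canceled"):
            return prediction
        time.sleep(poll_interval)
    raise TimeoutError("prediction did not finish before the timeout")
```

Separating the loop from the transport also makes it easy to test without touching the network.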
Choosing models
- Use the search and collections APIs to find and compare the best models. Do not list all the models via API, as it's basically a firehose.
- Collections are curated by Replicate staff, so they're vetted.
- Official models are in the "official" collection.
- Use official models because they:
- are always running
- have stable API interfaces
- have predictable output pricing
- are maintained by Replicate staff
- If you must use a community model, be aware that it can take a long time to boot.
- You can create always-on deployments of community models, but you pay for model uptime.
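A hedged sketch of reading the "official" collection via `GET /v1/collections/{slug}`; the response field names (`models`, `owner`, `name`) are assumptions based on the API's documented shape, so verify them against the OpenAPI schema linked above:

```python
import json
import urllib.request

API = "https://api.replicate.com/v1"

def get_collection(token, slug):
    """GET /v1/collections/{slug} and return the parsed JSON, models included."""
    req = urllib.request.Request(
        f"{API}/collections/{slug}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def model_names(collection):
    """Extract "owner/name" identifiers from a collection response."""
    return [f'{m["owner"]}/{m["name"]}' for m in collection.get("models", [])]

# Usage: model_names(get_collection(token, "official"))
```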
Running models
Models take time to run. There are three ways to run a model via API and get its output:
- Create a prediction, store its id from the response, and poll until completion.
- Set the `Prefer: wait` header when creating a prediction for a blocking synchronous response. Only recommended for very fast models.
- Set an HTTPS webhook URL when creating a prediction, and Replicate will POST to that URL when the prediction completes.
Follow these guidelines when running models:
- Use the "POST /v1/predictions" endpoint, as it supports both official and community models.
- Every model has its own OpenAPI schema. Always fetch and check model schemas to make sure you're setting valid inputs.
- Use HTTPS URLs for file inputs whenever possible. You can also send base64-encoded files, but they should be avoided.
- Fire off multiple predictions concurrently. Don't wait for one to finish before starting the next.
- Output file URLs expire after 1 hour, so back them up if you need to keep them, using a service like Cloudflare R2.
- Webhooks are a good mechanism for receiving and storing prediction output.
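To tie the blocking and webhook options together, here is a sketch that builds a `POST /v1/predictions` request with either option; the `webhook_events_filter` value is an assumption based on the API's documented event names, so check the OpenAPI schema before relying on it:

```python
API = "https://api.replicate.com/v1"

def prediction_request(token, version, model_input, webhook=None, wait=False):
    """Build (url, body, headers) for POST /v1/predictions.

    - webhook: HTTPS URL that Replicate will POST to when the prediction completes.
    - wait: set the "Prefer: wait" header for a blocking synchronous response
      (only sensible for very fast models).
    """
    body = {"version": version, "input": model_input}
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if webhook:
        body["webhook"] = webhook
        body["webhook_events_filter"] = ["completed"]  # assumption: notify only on completion
    if wait:
        headers["Prefer"] = "wait"
    return f"{API}/predictions", body, headers
```

Because this only builds the request, the same function works whether you then poll, block, or let the webhook deliver the result.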