venice-api-overview

Venice API Overview

Venice.ai is an OpenAI-compatible inference platform for text, image, audio, video, and embeddings. One API — two ways to pay: a traditional API key (Pro account), or a wallet (x402, USDC on Base, no account required).

Use when

  • You're writing code against api.venice.ai for the first time.
  • You need to decide between API-key and x402/wallet authentication.
  • You want a quick map of which endpoint to call for which task.
  • You need to understand the common response headers (X-Balance-Remaining, PAYMENT-REQUIRED, etc.).

Base URL

All endpoints live under:
https://api.venice.ai/api/v1
The OpenAPI spec is distributed at outerface/swagger.yaml (current version 20260420.235001).

Authentication (pick one per request)

| Scheme | Header | Best for |
| --- | --- | --- |
| BearerAuth | Authorization: Bearer <VENICE_API_KEY> | Server-side apps, dashboards, usage analytics, bundled credits |
| siwx (x402) | X-Sign-In-With-X: <base64 SIWE JSON> | No account, pay-as-you-go with USDC on Base, serverless / agents |

Every inference endpoint accepts either scheme; see venice-auth.

Bearer

bash
curl https://api.venice.ai/api/v1/models \
  -H "Authorization: Bearer $VENICE_API_KEY"

x402 / SIWE (one-liner via the SDK)

typescript
import { VeniceClient } from 'venice-x402-client'

const v = new VeniceClient(process.env.WALLET_KEY)
await v.models.list()
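
Since each request carries exactly one of the two schemes, it can help to model the choice as a tagged union. The sketch below is only an illustration: the type and function names are hypothetical, and it assumes the base64 SIWE payload has already been produced by a wallet library. Only the two header names come from the authentication table.

```typescript
// Hypothetical helper: names are illustrative, header names come from the
// authentication table in this document.
type VeniceAuth =
  | { scheme: 'bearer'; apiKey: string }
  | { scheme: 'siwx'; signedPayload: string } // base64 SIWE JSON, built elsewhere

function authHeaders(auth: VeniceAuth): Record<string, string> {
  switch (auth.scheme) {
    case 'bearer':
      // Pro account: traditional API key
      return { Authorization: `Bearer ${auth.apiKey}` }
    case 'siwx':
      // x402 wallet: signed Sign-In-With-X payload
      return { 'X-Sign-In-With-X': auth.signedPayload }
  }
}
```

Returning a plain header record keeps the helper independent of any HTTP client, so the result can be spread into whatever request options the caller already uses.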

Endpoint map

Inference

| Category | Endpoints | Skill |
| --- | --- | --- |
| Chat | POST /chat/completions | venice-chat |
| Responses (Alpha) | POST /responses | venice-responses |
| Embeddings | POST /embeddings | venice-embeddings |
| Image gen | POST /image/generate, POST /images/generations, GET /image/styles | venice-image-generate |
| Image edit | POST /image/edit, POST /image/multi-edit, POST /image/upscale, POST /image/background-remove | venice-image-edit |
| TTS | POST /audio/speech | venice-audio-speech |
| STT | POST /audio/transcriptions | venice-audio-transcription |
| Music (async) | POST /audio/quote, /audio/queue, /audio/retrieve, /audio/complete | venice-audio-music |
| Video (async) | POST /video/quote, /video/queue, /video/retrieve, /video/complete, /video/transcriptions | venice-video |
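
One way to keep this map close to the code is a small lookup table that joins the base URL with a path. This is a sketch rather than part of any SDK: the constant and function names are hypothetical, and only a few rows of the table are included. The base URL and paths are copied from this document.

```typescript
// Base URL and paths are taken from this document; names are hypothetical.
const BASE_URL = 'https://api.venice.ai/api/v1'

const INFERENCE_ENDPOINTS = {
  chat: { method: 'POST', path: '/chat/completions' },
  responses: { method: 'POST', path: '/responses' },
  embeddings: { method: 'POST', path: '/embeddings' },
  imageGenerate: { method: 'POST', path: '/image/generate' },
  imageStyles: { method: 'GET', path: '/image/styles' },
  speech: { method: 'POST', path: '/audio/speech' },
  transcription: { method: 'POST', path: '/audio/transcriptions' },
} as const

type InferenceTask = keyof typeof INFERENCE_ENDPOINTS

// Resolve a task name to the HTTP method and absolute URL to call.
function endpointFor(task: InferenceTask): { method: string; url: string } {
  const { method, path } = INFERENCE_ENDPOINTS[task]
  return { method, url: `${BASE_URL}${path}` }
}
```

Because `InferenceTask` is derived from the table, a misspelled task name fails at compile time instead of producing a 404 at runtime.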

Catalog

| Category | Endpoints | Skill |
| --- | --- | --- |
| Models | GET /models, /models/traits, /models/compatibility_mapping | venice-models |
| Characters | GET /characters, /characters/{slug}, /characters/{slug}/reviews | venice-characters |

Account, billing, wallet

| Category | Endpoints | Skill |
| --- | --- | --- |
| API keys | GET, POST … | … |
| Billing | GET /billing/balance, /billing/usage, /billing/usage-analytics | venice-billing |
| x402 wallet | GET /x402/balance/{wallet}, POST /x402/top-up, GET /x402/transactions/{wallet} | venice-x402 |

Utility

| Category | Endpoints | Skill |
| --- | --- | --- |
| Crypto RPC proxy | GET /crypto/rpc/networks, POST /crypto/rpc/{network} | venice-crypto-rpc |
| Augment | POST /augment/text-parser, /augment/scrape, /augment/search | venice-augment |

Response headers to watch

| Header | When | Meaning |
| --- | --- | --- |
| X-Balance-Remaining | x402 inference success | USDC credits left, e.g. "4.230000" |
| X-RateLimit-Limit-* / X-RateLimit-Remaining-* | all inference | your current per-minute/day caps |
| PAYMENT-REQUIRED | 402 on x402 inference | base64 JSON with top-up + SIWX challenge (x402 v2) |
| Content-Encoding | 200 when client sent Accept-Encoding: gzip, br | compression (embeddings, chat) |
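
A client might gather these headers in one place after each response. The following is a minimal sketch under a few assumptions: the function and interface names are hypothetical, PAYMENT-REQUIRED is assumed to use standard base64 around ASCII-safe JSON, and the decoded payload is left as `unknown` because its fields are not described here.

```typescript
// Hypothetical helper; header names come from the table in this document.
interface VeniceHeaderInfo {
  balanceUsdc: number | null // parsed X-Balance-Remaining, if present
  rateLimits: Record<string, string> // every x-ratelimit-* header, lowercased
  paymentRequired: unknown // decoded PAYMENT-REQUIRED JSON, or null
}

function readVeniceHeaders(raw: Record<string, string>): VeniceHeaderInfo {
  // HTTP header names are case-insensitive, so normalize to lowercase first.
  const headers: Record<string, string | undefined> = {}
  for (const [name, value] of Object.entries(raw)) {
    headers[name.toLowerCase()] = value
  }

  const rateLimits: Record<string, string> = {}
  for (const [name, value] of Object.entries(headers)) {
    if (name.startsWith('x-ratelimit-') && value !== undefined) {
      rateLimits[name] = value
    }
  }

  const balance = headers['x-balance-remaining']
  const payment = headers['payment-required']
  return {
    balanceUsdc: balance === undefined ? null : Number(balance),
    rateLimits,
    // atob is a global in browsers and in Node.js 16+ (assumes standard base64)
    paymentRequired: payment === undefined ? null : JSON.parse(atob(payment)),
  }
}
```

Taking a plain record rather than a `Headers` object keeps the function usable with any HTTP client; with `fetch`, `Object.fromEntries(response.headers.entries())` produces a suitable input.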

Pricing model at a glance

  • Pricing is dynamic per request, metered in USD.
  • Paid inference endpoints in the spec carry an x-payment-info block with min and max bounds in USD (typically min: 0.001, max: 10.00; higher for bulk video/audio). Read-only discovery routes like GET /models, /models/traits, and /models/compatibility_mapping do not.
  • Pro (Bearer) accounts draw from DIEM (staked credits), USD balance, and bundled credits in priority order.
  • x402 (wallet) users draw from a prepaid USDC credit balance on Base.
  • The authoritative per-model price is model_spec.pricing on GET /models, when present (see venice-models). Video models omit it; use /video/quote for video pricing.
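
The "when present" caveat on model_spec.pricing is easy to overlook, so a lookup helper can make the missing case explicit. This sketch assumes a model entry with an `id` and a `model_spec` object. Only the `model_spec.pricing` field name comes from the text above, the function name is hypothetical, and the inner structure of `pricing` is left as `unknown` because it is not described here.

```typescript
// Hypothetical shapes: only `model_spec.pricing` is named in this document.
interface ModelEntry {
  id: string
  model_spec: { pricing?: unknown }
}

// Returns the model's pricing block, or null when the model omits it.
// For video models a null result means: ask /video/quote instead.
function findPricing(models: ModelEntry[], modelId: string): unknown {
  const model = models.find((entry) => entry.id === modelId)
  if (model === undefined) {
    throw new Error(`Unknown model: ${modelId}`)
  }
  return model.model_spec.pricing ?? null
}
```

Throwing for an unknown ID while returning null for absent pricing keeps "no such model" and "priced elsewhere" as two distinct outcomes for the caller.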

Standard error shape

Every error body follows one of:
json
{ "error": "Human-readable message" }
or, for 400 validation errors:
json
{ "error": "...", "details": { "fieldName": { "_errors": ["Field is required"] } } }
A 402 on x402 adds structured topUpInstructions and siwxChallenge. See venice-errors for the full table and retry strategy.
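
Both shapes can be handled by one small function that flattens an error body into readable lines. The sketch below uses hypothetical names, assumes `details` is one level deep as in the example above, and ignores any deeper nesting.

```typescript
// Mirrors the two error shapes shown above; names are hypothetical.
interface VeniceErrorBody {
  error: string
  details?: Record<string, { _errors?: string[] }>
}

// Turn an error body into one line per message, prefixing field-level
// validation messages with the field name.
function describeError(body: VeniceErrorBody): string[] {
  const lines = [body.error]
  for (const [field, info] of Object.entries(body.details ?? {})) {
    for (const message of info._errors ?? []) {
      lines.push(`${field}: ${message}`)
    }
  }
  return lines
}
```

For the validation example above, this yields the top-level message followed by `fieldName: Field is required`.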

OpenAI compatibility — what works and what doesn't

  • Drop-in: /chat/completions, /embeddings, /images/generations, /audio/speech, /audio/transcriptions, /models.
  • Ignored but accepted for compat: user, store.
  • Venice-only extensions live under:
    • venice_parameters (chat completions)
    • venice_parameters is rejected on /responses; use headers / native fields instead
  • Model feature suffixes (e.g. zai-org-glm-5-1:enable_web_search=on, kimi-k2-6:strip_thinking_response=true&disable_thinking=true) flip venice_parameters via the model ID; see venice-chat.
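
The suffix format in the two examples above (a colon, then `key=value` pairs joined by `&`) can be generated rather than typed by hand. This is a sketch inferred from those examples only. The function name is hypothetical, and it assumes values need no URL-encoding, which venice-chat may or may not confirm.

```typescript
// Build "<model>:<key>=<value>&<key>=<value>" as in the examples above.
// Hypothetical helper; the format is inferred from this document's examples.
function withFeatureSuffix(
  modelId: string,
  features: Record<string, string | boolean>,
): string {
  const pairs = Object.entries(features).map(
    ([key, value]) => `${key}=${String(value)}`,
  )
  // No features requested: return the model ID unchanged.
  return pairs.length === 0 ? modelId : `${modelId}:${pairs.join('&')}`
}
```

For example, `withFeatureSuffix('zai-org-glm-5-1', { enable_web_search: 'on' })` reproduces the first model ID shown above.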

Versioning

  • info.version in swagger.yaml is a timestamp (YYYYMMDD.HHMMSS). There is no /v2; features roll forward on the single /api/v1 surface and are guarded by:
    • Alpha/Beta tags in endpoint descriptions (e.g. /responses, Billing).
    • x-guidance / model capability flags on /models.
  • Always check the model's model_spec.capabilities from GET /models for feature flags (supportsWebSearch, supportsReasoning, supportsE2EE, supportsXSearch, supportsMultipleImages, supportsFunctionCalling, supportsAudioInput, supportsVideoInput, …) before relying on a feature.
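
Both checks above are mechanical enough to sketch in code. The helpers below are hypothetical and rest on two assumptions not stated in this document: that the version timestamp is in UTC, and that model_spec.capabilities is a flat map of boolean flags.

```typescript
// Parse "YYYYMMDD.HHMMSS" into a Date. Treating it as UTC is an assumption.
function parseSpecVersion(version: string): Date {
  const match = /^(\d{4})(\d{2})(\d{2})\.(\d{2})(\d{2})(\d{2})$/.exec(version)
  if (match === null) {
    throw new Error(`Unexpected version format: ${version}`)
  }
  const [year, month, day, hour, minute, second] = match.slice(1).map(Number)
  // Date.UTC expects a zero-based month.
  return new Date(Date.UTC(year, month - 1, day, hour, minute, second))
}

// Hypothetical shape: assumes capabilities is a flat map of boolean flags.
interface ModelWithCapabilities {
  id: string
  model_spec: { capabilities?: Record<string, boolean> }
}

// Return the required flags that the model does not advertise as true.
function missingCapabilities(
  model: ModelWithCapabilities,
  required: string[],
): string[] {
  const capabilities = model.model_spec.capabilities ?? {}
  return required.filter((flag) => capabilities[flag] !== true)
}
```

An empty result from `missingCapabilities` means every required flag is set. Anything else lists what to avoid or which model to swap out before sending the request.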

Fast start checklist

  1. Read venice-auth and choose Bearer vs x402.
  2. GET /models: pick a model and note its model_spec.constraints and model_spec.pricing.
  3. Wire up one happy-path call from the matching skill.
  4. Add error handling using venice-errors (402, 422, 429).
  5. Hook up observability via X-Balance-Remaining / /billing/usage / /x402/transactions.
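
For step 4, a first pass at error handling might simply map each status code to a next action. The mapping below follows general HTTP conventions and is an assumption, as are the function names and the backoff parameters. venice-errors remains the authoritative source for the retry strategy.

```typescript
type NextStep = 'ok' | 'top-up' | 'fix-request' | 'retry-later' | 'fail'

// Map a response status to a coarse next action (assumed mapping).
function nextStepFor(status: number): NextStep {
  if (status >= 200 && status < 300) return 'ok'
  switch (status) {
    case 402:
      return 'top-up' // payment required: balance or credits exhausted
    case 400:
    case 422:
      return 'fix-request' // validation problem: retrying unchanged will not help
    case 429:
      return 'retry-later' // rate limited: wait before retrying
    default:
      return 'fail'
  }
}

// Exponential backoff with a cap; attempt 0 waits baseMs.
// The default values are illustrative, not taken from venice-errors.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt)
}
```

Separating the classification from the delay calculation lets the retry loop itself stay small: only `retry-later` needs a sleep, and `top-up` can hand off to the x402 or billing flow.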