agentbox-inference

LLM Inference

Paid OpenAI-compatible chat completions API at https://inference.x402.agentbox.fyi. Costs $0.001-$0.003 USDC per call via x402 on Solana. Use the x_payment tool for all requests.

Endpoint

Chat Completions

Generate a chat completion from a supported model.
x_payment({
  "url": "https://inference.x402.agentbox.fyi/v1/chat/completions",
  "method": "POST",
  "body": "{\"model\": \"moonshotai/kimi-k2.5\", \"messages\": [{\"role\": \"user\", \"content\": \"Explain x402 in one sentence\"}]}"
})
Body Parameters:

| Param | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | Model ID (see Models & Pricing) |
| messages | array | yes | Array of {role, content} objects |
| max_tokens | integer | no | Maximum tokens to generate |
| temperature | number | no | Sampling temperature (0-2) |
| top_p | number | no | Nucleus sampling (0-1) |

Message roles: system, user, assistant
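
Because `body` is passed as a JSON-encoded string, escaping the quotes by hand is easy to get wrong. The following is a minimal sketch, assuming the caller assembles the `x_payment` arguments in Python before invoking the tool. `build_chat_request` is a hypothetical helper name, not part of the tool, and its range checks only mirror the parameter table above.

```python
import json

ENDPOINT = "https://inference.x402.agentbox.fyi/v1/chat/completions"
VALID_ROLES = {"system", "user", "assistant"}


def build_chat_request(model, messages, max_tokens=None, temperature=None, top_p=None):
    """Return the argument dict for x_payment (hypothetical helper)."""
    for message in messages:
        if message.get("role") not in VALID_ROLES:
            raise ValueError(f"unknown role: {message.get('role')!r}")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    body = {"model": model, "messages": messages}
    # Optional parameters are only sent when the caller sets them.
    optional = (("max_tokens", max_tokens), ("temperature", temperature), ("top_p", top_p))
    for name, value in optional:
        if value is not None:
            body[name] = value

    # json.dumps produces the escaped string that the inline examples write by hand.
    return {"url": ENDPOINT, "method": "POST", "body": json.dumps(body)}


request = build_chat_request(
    "moonshotai/kimi-k2.5",
    [{"role": "user", "content": "Explain x402 in one sentence"}],
    max_tokens=200,
)
print(request["body"])
```

The returned dict has the same three keys as the `x_payment` calls in this document, so it can be passed to the tool unchanged.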

Models & Pricing

| Model | Cost/call | Best for |
| --- | --- | --- |
| moonshotai/kimi-k2.5 | $0.003 | High-quality output, large context (262K) |
| minimax/minimax-m2.5 | $0.002 | Balanced quality/cost |
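
Since pricing is a flat amount per call, a budget can be estimated by multiplying call counts by the listed prices. The sketch below copies the two prices from the table above. `estimate_cost` and the call counts are illustrative assumptions, and `Decimal` is used only to avoid floating-point rounding on small USDC amounts.

```python
from decimal import Decimal

# Per-call prices in USDC, copied from the pricing table.
PRICE_PER_CALL = {
    "moonshotai/kimi-k2.5": Decimal("0.003"),
    "minimax/minimax-m2.5": Decimal("0.002"),
}


def estimate_cost(calls_by_model):
    """Sum flat per-call prices for a mapping of model ID to number of calls."""
    total = Decimal("0")
    for model, calls in calls_by_model.items():
        total += PRICE_PER_CALL[model] * calls
    return total


# Hypothetical workload: 200 calls to the larger model, 1000 to the cheaper one.
budget = estimate_cost({"moonshotai/kimi-k2.5": 200, "minimax/minimax-m2.5": 1000})
print(budget)  # 0.003 * 200 + 0.002 * 1000 = 2.600
```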

Usage Patterns

Simple question

x_payment({
  "url": "https://inference.x402.agentbox.fyi/v1/chat/completions",
  "method": "POST",
  "body": "{\"model\": \"moonshotai/kimi-k2.5\", \"messages\": [{\"role\": \"user\", \"content\": \"What is the x402 protocol?\"}]}"
})

With system prompt and parameters

x_payment({
  "url": "https://inference.x402.agentbox.fyi/v1/chat/completions",
  "method": "POST",
  "body": "{\"model\": \"moonshotai/kimi-k2.5\", \"messages\": [{\"role\": \"system\", \"content\": \"You are a concise technical writer.\"}, {\"role\": \"user\", \"content\": \"Write a summary of Solana's transaction model\"}], \"max_tokens\": 500, \"temperature\": 0.7}"
})

Response Format

Standard OpenAI chat completion response:
{
  "id": "gen-...",
  "object": "chat.completion",
  "model": "moonshotai/kimi-k2.5",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "..." },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 42,
    "total_tokens": 54
  }
}
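
To use the result, a caller usually needs only the assistant text, the finish reason, and the token usage. The sketch below assumes the response body is available as a JSON string. How `x_payment` actually hands the response back is not specified here, and `extract_reply` is a hypothetical helper.

```python
import json


def extract_reply(raw_response):
    """Return (text, finish_reason, total_tokens) from a chat completion JSON string."""
    data = json.loads(raw_response)
    choice = data["choices"][0]
    usage = data.get("usage", {})
    return (
        choice["message"]["content"],
        choice.get("finish_reason"),
        usage.get("total_tokens"),
    )


# Sample payload shaped like the response shown above.
sample = json.dumps({
    "id": "gen-...",
    "object": "chat.completion",
    "model": "moonshotai/kimi-k2.5",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 12, "completion_tokens": 42, "total_tokens": 54},
})

text, finish_reason, total_tokens = extract_reply(sample)
print(finish_reason, total_tokens)
```

In the OpenAI format, a `finish_reason` of `length` indicates that the output stopped at the token limit. If this service follows the same convention, checking that field is a simple way to detect truncated answers when `max_tokens` is set.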

Errors

| HTTP | Meaning |
| --- | --- |
| 400 | Invalid request (check model name and messages format) |
| 402 | Payment required (handled automatically by x_payment) |
| 502 | Upstream provider error |
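
One way to act on these codes is to retry only the upstream failure (502) and return everything else to the caller, since a 400 will not succeed until the request is fixed. This is a sketch under assumptions: `send` is a hypothetical callable that wraps the `x_payment` call and returns a `(status, body)` pair, and the retry policy is an illustration rather than documented behavior. Whether a failed call is billed is not stated here, so the attempt count is kept small.

```python
import time


def call_with_retry(send, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff while it returns HTTP 502."""
    status, body = None, None
    for attempt in range(attempts):
        status, body = send()
        if status != 502:
            # 2xx, 400, and anything else go straight back to the caller.
            return status, body
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body


# Demonstration with a fake sender that fails twice, then succeeds.
fake_responses = iter([(502, "upstream error"), (502, "upstream error"), (200, "{}")])
recorded_delays = []
final_status, final_body = call_with_retry(
    lambda: next(fake_responses),
    sleep=recorded_delays.append,  # record delays instead of actually sleeping
)
print(final_status, recorded_delays)
```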

Cost

Flat rate per model per call. Price is determined by the model field in the request body. Each call is independent: no sessions or state.
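
Because no state is kept between calls, a multi-turn conversation presumably has to resend the full message history on every request, as with other stateless chat completion APIs. The sketch below tracks that history on the client side. The `Conversation` class and its method names are hypothetical. Each turn is one paid call at the model's flat rate.

```python
import json


class Conversation:
    """Client-side message history for a stateless chat completions API."""

    def __init__(self, model, system_prompt=None):
        self.model = model
        self.messages = []
        self.calls = 0  # number of request bodies produced so far
        if system_prompt is not None:
            self.messages.append({"role": "system", "content": system_prompt})

    def next_body(self, user_text):
        """Append a user turn and return the JSON body string for the next call."""
        self.messages.append({"role": "user", "content": user_text})
        self.calls += 1
        return json.dumps({"model": self.model, "messages": self.messages})

    def record_reply(self, assistant_text):
        """Store the assistant's answer so the next call includes it."""
        self.messages.append({"role": "assistant", "content": assistant_text})


conversation = Conversation("minimax/minimax-m2.5", "You are a concise technical writer.")
first_body = conversation.next_body("What is the x402 protocol?")
conversation.record_reply("...")  # placeholder for the model's actual answer
second_body = conversation.next_body("How does it use Solana?")
print(len(json.loads(second_body)["messages"]), conversation.calls)
```

The second body carries four messages (system, user, assistant, user) and counts as a second call, so the cost of a conversation is the number of turns multiplied by the per-call price.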