# Z.ai API Skill
## Quick Reference

- Base URL: `https://api.z.ai/api/paas/v4`
- Coding Plan URL: `https://api.z.ai/api/coding/paas/v4`
- Auth: `Authorization: Bearer YOUR_API_KEY`
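For plain-HTTP clients, the base URL and auth header combine like this. A minimal sketch; the `/chat/completions` path follows the OpenAI-compatible convention and is an assumption here, so check the endpoint table for your actual route.

```python
# How the Quick Reference pieces combine into a raw HTTP request.
# The /chat/completions path is assumed (OpenAI-compatible convention).
BASE_URL = "https://api.z.ai/api/paas/v4"
API_KEY = "YOUR_API_KEY"

def build_request(path: str, payload: dict) -> dict:
    """Assemble the URL, auth header, and JSON body for a Z.ai call."""
    return {
        "url": f"{BASE_URL}{path}",
        "headers": {
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        "json": payload,
    }

req = build_request("/chat/completions",
                    {"model": "glm-4.7", "messages": []})
print(req["url"])  # https://api.z.ai/api/paas/v4/chat/completions
```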
## Core Endpoints
| Endpoint | Purpose |
|---|---|
| `/chat/completions` | Text/vision chat |
| `/images/generations` | Image generation |
| `/videos/generations` | Video generation (async) |
| `/audio/transcriptions` | Speech-to-text |
| `/web_search` | Web search |
| `/async-result/{id}` | Poll async tasks |
| `/agents` | Translation, slides, effects |
## Model Selection

Chat (pick by need):

- `glm-4.7` — Latest flagship, best quality, agentic coding
- `glm-4.7-flash` — Fast, high quality
- `glm-4.6` — Reliable general use
- `glm-4.5-flash` — Fastest, lower cost

Vision:

- `glm-4.6v` — Best multimodal (images, video, files)
- `glm-4.6v-flash` — Fast vision

Media:

- `glm-image` — High-quality images (HD, ~20s)
- `cogview-4-250304` — Fast images (~5-10s)
- `cogvideox-3` — Video, up to 4K, 5-10s
- `viduq1-text/image` — Vidu video generation
## Implementation Patterns

### Basic Chat

```python
from zai import ZaiClient

client = ZaiClient(api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)
```

### OpenAI SDK Compatibility
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ZAI_KEY",
    base_url="https://api.z.ai/api/paas/v4/"
)
```

Use exactly like the OpenAI SDK.

### Streaming
```python
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[...],
    stream=True
)
for chunk in response:
    # Some chunks (e.g. the final one) may carry no content delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
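When the full reply is needed as one string, the deltas can be accumulated as they arrive. A minimal sketch, assuming chunks shaped like the OpenAI streaming format (the `SimpleNamespace` objects below stand in for real SSE chunks):

```python
from types import SimpleNamespace as NS

def collect_stream(chunks) -> str:
    """Concatenate the content deltas of a streamed chat response."""
    parts = []
    for chunk in chunks:
        # Skip chunks with no choices or an empty/None content delta
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

# Stand-in chunks with the same attribute shape as SDK streaming chunks
fake = [NS(choices=[NS(delta=NS(content=c))]) for c in ["Hel", "lo", None]]
print(collect_stream(fake))  # Hello
```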
### Function Calling
```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"}
            },
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto"
)
```

Handle `tool_calls` in `response.choices[0].message.tool_calls`.
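After the model returns `tool_calls`, each call is executed locally and the result is sent back as a `tool` message. A sketch of that round trip, using plain dicts shaped like the SDK's tool-call objects; the local `get_weather` implementation is hypothetical:

```python
import json

def get_weather(city: str) -> dict:
    # Hypothetical local implementation of the declared tool
    return {"city": city, "temp_c": 21}

TOOL_IMPLS = {"get_weather": get_weather}

def run_tool_calls(messages: list, tool_calls: list) -> list:
    """Execute each requested tool and append its result as a 'tool' message."""
    for call in tool_calls:
        fn = TOOL_IMPLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return messages  # Send back via client.chat.completions.create(messages=...)

# Dict with the same shape as response.choices[0].message.tool_calls
calls = [{"id": "call_1",
          "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}}]
followup = run_tool_calls([], calls)
print(followup[0]["role"])  # tool
```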
### Vision (Images/Video/Files)
```python
response = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://..."}},
            {"type": "text", "text": "Describe this image"}
        ]
    }]
)
```

### Image Generation
```python
response = client.images.generate(
    model="glm-image",
    prompt="A serene mountain at sunset",
    size="1280x1280",
    quality="hd"
)
print(response.data[0].url)  # Expires in 30 days
```

### Video Generation (Async)
```python
import time

# Submit
response = client.videos.generate(
    model="cogvideox-3",
    prompt="A cat playing with yarn",
    size="1920x1080",
    duration=5
)
task_id = response.id

# Poll for result
while True:
    result = client.async_result.get(task_id)
    if result.task_status == "SUCCESS":
        print(result.video_result[0].url)
        break
    if result.task_status == "FAIL":
        raise RuntimeError("video generation failed")
    time.sleep(5)
```

### Web Search Integration
```python
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Latest AI news?"}],
    tools=[{
        "type": "web_search",
        "web_search": {
            "enable": True,
            "search_result": True
        }
    }]
)
```

Access `response.web_search` for sources.

### Thinking Mode (Chain-of-Thought)
```python
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[...],
    thinking={"type": "enabled"},
    stream=True  # Recommended with thinking
)
```

Access `reasoning_content` in the response.

## Key Parameters
| Parameter | Values | Notes |
|---|---|---|
| `temperature` | 0.0-1.0 | GLM-4.7: 1.0, GLM-4.5: 0.6 default |
| `top_p` | 0.01-1.0 | Default ~0.95 |
| `max_tokens` | varies | GLM-4.7: 128K, GLM-4.5: 96K max |
| `stream` | bool | Enable SSE streaming |
| `response_format` | `{"type": "json_object"}` | Force JSON output |
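Assembled into one request, the parameters above look like this. The `response_format` value uses the standard OpenAI-compatible JSON-mode shape, which is an assumption to verify against the Z.ai docs:

```python
# Request payload combining the parameters from the table above.
payload = {
    "model": "glm-4.7",
    "messages": [{"role": "user", "content": "List three colors as JSON"}],
    "temperature": 1.0,    # GLM-4.7 default
    "top_p": 0.95,         # default ~0.95
    "max_tokens": 1024,
    "stream": False,       # True enables SSE streaming
    "response_format": {"type": "json_object"},  # force JSON output (assumed shape)
}
print(payload["response_format"]["type"])  # json_object
```

With the OpenAI-compatible client this is passed as `client.chat.completions.create(**payload)`.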
## Error Handling

- 429: Rate limited — implement exponential backoff
- 401: Bad API key — verify credentials
- `sensitive`: Content filtered — modify input
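The exponential backoff recommended for 429s can be sketched as below; `RateLimitError` is a stand-in name for whatever your client raises on HTTP 429, so adapt the `except` clause to the real exception type:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's HTTP 429 error type (name is an assumption)."""

def with_backoff(call, max_retries=5, base=1.0):
    """Retry `call` on rate limits with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            # Sleep base, 2*base, 4*base, ... seconds, plus random jitter
            time.sleep(base * 2 ** attempt + random.random() * base)
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

Wrap calls like `with_backoff(lambda: client.chat.completions.create(...))`.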
```python
finish_reason = response.choices[0].finish_reason
if finish_reason == "tool_calls":
    pass  # Execute the function and continue the conversation
elif finish_reason == "length":
    pass  # Increase max_tokens or truncate the input
elif finish_reason == "sensitive":
    pass  # Content was filtered
```

## Reference Files
For detailed API specifications, consult:

- `references/chat-completions.md` — Full chat API, parameters, models
- `references/tools-and-functions.md` — Function calling, web search, retrieval
- `references/media-generation.md` — Image, video, audio APIs
- `references/agents.md` — Translation, slides, effects agents
- `references/error-codes.md` — Error handling, rate limits