Troubleshooting
Common issues, gotchas, and their solutions when working with the Gladia API.
SDK-first diagnostics: first verify the user is on the official SDK, which handles many issues (polling, reconnection, retries) automatically. See sdk-integration for setup and policy.
When to Use
- User encounters errors (401, 403, 429, invalid format, timeout) when calling the Gladia API
- Unexpected behavior: poor transcription quality, missing words, wrong language
- WebSocket disconnections, polling failures, or session hangs
- Billing confusion (multi-channel, concurrency limits, plan restrictions)
- Verifying that an integration is correctly configured before going to production
When NOT to use: For initial SDK setup and configuration, use sdk-integration. For feature-specific guidance (options, parameters, response structure), use pre-recorded-transcription or live-transcription.
References
Consult these resources as needed:
- ../sdk-integration/SKILL.md -- SDK setup, client config (retry, timeouts), and SDK vs raw API decision guide
- ../sdk-integration/references/sdk-versions.md -- Current SDK versions (auto-synced by CI)
- ../pre-recorded-transcription/SKILL.md -- Pre-recorded config options and limits
- ../live-transcription/SKILL.md -- Live session config and WebSocket event handling
Authentication Errors
401 Unauthorized
- Cause: Invalid or missing API key
- Fix (SDK): Verify the key is passed via the `apiKey` (JS) / `api_key` (Python) constructor option, or set the `GLADIA_API_KEY` environment variable
- Fix (raw REST): Verify the key is passed in the `x-gladia-key` header
- Check: Go to app.gladia.io → API keys → verify the key is active
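For raw REST calls, the header check can be made explicit before any request is sent. The following is a minimal sketch: `buildGladiaHeaders` is a hypothetical helper name (not part of the Gladia SDK), and only the `x-gladia-key` header name comes from this guide.

```typescript
// Hypothetical helper for raw REST calls; not part of the Gladia SDK.
// Only the "x-gladia-key" header name is taken from this guide.
function buildGladiaHeaders(apiKey: string | undefined): Record<string, string> {
  const key = (apiKey ?? "").trim();
  if (key.length === 0) {
    // Fail fast locally instead of waiting for a 401 from the API.
    throw new Error("Missing API key: set GLADIA_API_KEY or pass the key explicitly");
  }
  return {
    "x-gladia-key": key,
    "Content-Type": "application/json",
  };
}
```

Passing the value read from the `GLADIA_API_KEY` environment variable into this helper surfaces a missing or whitespace-only key before the first request.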
403 Forbidden
- Cause: Key doesn't have access to the requested feature or region
- Fix: Check your plan tier; some features are plan-restricted
Rate Limiting and Concurrency
429 Too Many Requests
Concurrency limits by plan:
| Plan | Pre-recorded | Live | Notes |
|---|---|---|---|
| Free | 3 | 1 | 10 hrs/month total |
| Starter | 25 | 30 | — |
| Growth | 25 | 30 | — |
| Enterprise | Unlimited | Unlimited | — |
Fix: Wait for in-progress jobs to complete before starting new ones, or upgrade your plan.
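One way to wait for in-progress jobs is a small client-side gate that refuses to start a job once the plan limit is reached. This is a sketch under assumptions: `ConcurrencyGate` is a hypothetical class, and the limit you pass in should come from the table above for your plan.

```typescript
// Hypothetical in-process gate that caps how many jobs are in flight at once.
class ConcurrencyGate {
  private inFlight = 0;
  private readonly limit: number;

  constructor(limit: number) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new Error("limit must be a positive integer");
    }
    this.limit = limit;
  }

  // Reserves a slot and returns true if one is free; false means "wait and retry".
  tryAcquire(): boolean {
    if (this.inFlight >= this.limit) return false;
    this.inFlight += 1;
    return true;
  }

  // Call when a job finishes (success or failure) to free its slot.
  release(): void {
    if (this.inFlight > 0) this.inFlight -= 1;
  }
}
```

A caller would check `tryAcquire()` before submitting a job and call `release()` once the result (or an error) arrives. This only tracks jobs started by the current process, so it does not protect against other clients sharing the same key.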
Common Gotchas
1. Code switching without language list
Problem: Enabling `code_switching: true` with an empty `languages` array causes evaluation across 100+ languages and frequent misdetections.
Fix: Always provide 3-5 expected languages:
```json
{
  "language_config": {
    "languages": ["en", "fr", "es"],
    "code_switching": true
  }
}
```
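A pre-flight check can catch this gotcha before the request is sent. The sketch below assumes the `language_config` shape shown in this section; `checkLanguageConfig` is a hypothetical helper, not an SDK function.

```typescript
// Shape of the language_config object used in this guide.
interface LanguageConfig {
  languages: string[];
  code_switching: boolean;
}

// Hypothetical check: returns human-readable warnings, or an empty array if the config looks fine.
function checkLanguageConfig(config: LanguageConfig): string[] {
  const warnings: string[] = [];
  const count = config.languages.length;
  if (config.code_switching && count === 0) {
    warnings.push("code_switching is enabled with an empty languages array");
  } else if (config.code_switching && (count < 3 || count > 5)) {
    warnings.push(`this guide recommends 3-5 expected languages, got ${count}`);
  }
  return warnings;
}
```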
2. Custom vocabulary intensity too high
Problem: `intensity` values above 0.6 cause false positives where unrelated words get replaced by vocabulary entries.
Fix: Keep intensity at 0.4-0.6. Use `pronunciations` for better recognition instead of raising intensity:
```json
{
  "vocabulary": [
    { "value": "Gladia", "pronunciations": ["gla-dee-ah"], "intensity": 0.5 }
  ]
}
```
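The recommended range can be checked over a whole vocabulary list before submitting it. This is an illustrative sketch: `findRiskyVocabulary` is a hypothetical helper, and entries without an explicit `intensity` are skipped because their default is not stated in this guide.

```typescript
// Entry shape as used in the vocabulary example in this section.
interface VocabularyEntry {
  value: string;
  pronunciations?: string[];
  intensity?: number;
}

// Returns the entries whose explicit intensity falls outside the 0.4-0.6 range.
function findRiskyVocabulary(entries: VocabularyEntry[]): VocabularyEntry[] {
  return entries.filter(
    (e) => e.intensity !== undefined && (e.intensity < 0.4 || e.intensity > 0.6),
  );
}
```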
3. Audio exceeding duration limits silently
Problem: Pre-recorded files over 135 minutes may fail without a clear error message.
Fix: Split long audio into chunks of ~60 minutes before uploading. For enterprise (4h15 limit), contact support.
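The chunk boundaries for a long file can be planned with simple arithmetic. The sketch below only computes time ranges; `planChunks` is a hypothetical helper, and the actual cutting (with an audio tool of your choice, ideally at silences) is left out.

```typescript
interface Chunk {
  startSec: number;
  endSec: number;
}

// Splits [0, totalSec) into consecutive chunks of at most chunkSec seconds (default: 60 minutes).
function planChunks(totalSec: number, chunkSec: number = 60 * 60): Chunk[] {
  if (totalSec < 0 || chunkSec <= 0) {
    throw new Error("totalSec must be >= 0 and chunkSec must be > 0");
  }
  const chunks: Chunk[] = [];
  for (let start = 0; start < totalSec; start += chunkSec) {
    chunks.push({ startSec: start, endSec: Math.min(start + chunkSec, totalSec) });
  }
  return chunks;
}
```

For a 150-minute file (9000 seconds), this yields three chunks: 0-3600, 3600-7200, and 7200-9000.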
4. Multi-channel billing surprise
Problem: Sending 2-channel (stereo) audio is billed as 2x the duration.
Fix: Merge to mono if you don't need per-channel speaker identification. Only use multi-channel intentionally for distinct audio sources.
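The billing effect is easy to estimate up front. The sketch below generalizes the 2-channel case to N channels, which is an assumption (this guide only states the stereo case); `billedSeconds` is a hypothetical helper.

```typescript
// Estimated billed duration, assuming each channel is billed for the full audio length.
// This guide only confirms the 2-channel case; N channels is an extrapolation.
function billedSeconds(durationSec: number, channels: number): number {
  if (durationSec < 0 || !Number.isInteger(channels) || channels < 1) {
    throw new Error("durationSec must be >= 0 and channels a positive integer");
  }
  return durationSec * channels;
}
```

A 10-minute stereo file would be estimated at `billedSeconds(600, 2)`, that is 1200 seconds, versus 600 seconds after merging to mono.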
5. WebSocket disconnection without recovery
Problem: If the WebSocket drops, creating a new session loses context.
Fix (SDK, recommended): The SDK handles reconnection automatically with configurable `wsRetry`. No action needed if using the SDK.
Fix (raw WebSocket): Reconnect to the same WebSocket URL to resume the session. Do NOT call `/v2/live` again.
6. Polling without backoff
Problem: Rapidly polling `/v2/pre-recorded/:id` wastes requests and may trigger rate limits.
Fix (SDK, recommended): The SDK handles polling automatically. Use `transcribe()`, which includes built-in backoff, or configure `poll()` directly:
```typescript
const result = await client.preRecorded().transcribe(audio, options, {
  interval: 5000, // 5 seconds between polls
});
```
Fix (raw REST): Implement exponential backoff (start at 3s, max 30s), or use webhooks/callbacks instead.
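For the raw REST case, the delay schedule can be sketched as a pure function. `backoffDelayMs` is a hypothetical helper; the doubling factor is an assumption, while the 3 s start and 30 s cap come from the fix above.

```typescript
// Delay before the Nth poll attempt (0-based), in milliseconds.
// Doubles from baseMs on each attempt and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 3000, maxMs: number = 30000): number {
  if (!Number.isInteger(attempt) || attempt < 0) {
    throw new Error("attempt must be a non-negative integer");
  }
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```

With the defaults, attempts 0 through 4 wait 3, 6, 12, 24, and 30 seconds. A polling loop would sleep for `backoffDelayMs(attempt)` between requests and stop once the job reports a terminal status.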
7. Forgetting to stop recording
Problem: Leaving a WebSocket open without sending `stop_recording` keeps the session hanging until the 3-hour timeout.
Fix: Always explicitly call `session.stopRecording()` (or `session.stop_recording()` in Python) when done. Implement cleanup in error handlers.
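Cleanup in error handlers is commonly expressed with `try`/`finally`. The sketch below uses a hypothetical `StoppableSession` interface that models only the `stopRecording()` call named above; the real SDK session type and whether its methods are asynchronous are not covered here.

```typescript
// Hypothetical minimal view of a live session; only stopRecording() is named in this guide.
interface StoppableSession {
  stopRecording(): void;
}

// Runs `work` and always stops the session afterwards, even if `work` throws.
function withSession<T>(session: StoppableSession, work: (s: StoppableSession) => T): T {
  try {
    return work(session);
  } finally {
    session.stopRecording();
  }
}
```

If the session methods are asynchronous, the same shape applies with an `async` function and `await` inside the `try` and `finally` blocks.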
8. Partial transcripts not appearing
Problem: Real-time results come only as final transcripts by default.
Fix: Enable partial transcripts in session config:
```json
{
  "messages_config": {
    "receive_partial_transcripts": true
  }
}
```
9. Expecting diarization in live mode
Problem: Speaker diarization is only available for pre-recorded transcription.
Fix: For live multi-speaker scenarios, use multi-channel audio with one speaker per channel and track by channel number.
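Tracking by channel number can be sketched as a simple grouping step. The `ChannelUtterance` shape is an assumption made for illustration (a channel index plus text); check the live-transcription reference for the actual message fields.

```typescript
// Assumed shape: each final transcript carries the index of the channel it came from.
interface ChannelUtterance {
  channel: number;
  text: string;
}

// Groups utterance texts by channel, treating each channel as one speaker.
function groupByChannel(utterances: ChannelUtterance[]): Map<number, string[]> {
  const bySpeaker = new Map<number, string[]>();
  for (const u of utterances) {
    const lines = bySpeaker.get(u.channel) ?? [];
    lines.push(u.text);
    bySpeaker.set(u.channel, lines);
  }
  return bySpeaker;
}
```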
10. PII redaction in live mode
Problem: `pii_redaction: true` is silently ignored in live transcription.
Fix: PII redaction only works for pre-recorded. For live compliance needs, implement client-side redaction on the transcript text.
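Client-side redaction can start as a text filter applied to each transcript before it is stored or displayed. The patterns below are illustrative assumptions that cover only simple email addresses and phone-like digit runs; real compliance work needs a vetted PII detector.

```typescript
// Illustrative patterns only: simple emails and phone-like digit sequences.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

// Replaces matches with placeholder tokens; everything else is left untouched.
function redactTranscript(text: string): string {
  return text.replace(EMAIL_PATTERN, "[EMAIL]").replace(PHONE_PATTERN, "[PHONE]");
}
```

For example, `redactTranscript("Mail jane.doe@example.com or call +1 415 555 0100 now")` returns `"Mail [EMAIL] or call [PHONE] now"`.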
Audio Format Issues
"Invalid audio format" error
- Verify `encoding`, `sample_rate`, `bit_depth`, `channels` match your actual audio stream
- Common mismatch: sending MP3 while declaring `wav/pcm`
- For pre-recorded: format is auto-detected from the file; this error is live-specific
- For supported formats and size limits, see pre-recorded-transcription and live-transcription
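The first check in this list can be automated by comparing the declared session config with what the capture pipeline actually produces. This is a sketch: `findFormatMismatches` is a hypothetical helper, and the `actual` values are assumed to come from your own audio source or a probing tool.

```typescript
// The four live audio format fields named in this section.
interface AudioFormat {
  encoding: string;
  sample_rate: number;
  bit_depth: number;
  channels: number;
}

// Returns the names of fields where the declared config differs from the real stream.
function findFormatMismatches(declared: AudioFormat, actual: AudioFormat): string[] {
  const fields: (keyof AudioFormat)[] = ["encoding", "sample_rate", "bit_depth", "channels"];
  return fields.filter((f) => declared[f] !== actual[f]);
}
```

An empty result means the declared config matches the stream; any returned field name points at the setting to fix before opening the session.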
Transcription Quality Issues
Poor accuracy
- Check audio quality: Background noise, low volume, or heavy compression degrades results
- Enable audio enhancer: `pre_processing.audio_enhancer: true` (live)
- Specify languages: Always provide expected languages rather than relying on auto-detection
- Use custom vocabulary: For domain-specific terms (medical, legal, technical)
- Check sample rate: Higher sample rates (16kHz+) give better results than 8kHz
Wrong language detected
- Provide explicit `languages` list
- If multi-language, enable `code_switching` with 3-5 expected languages
- For single-language content, specify exactly one language and disable code switching
Missing words or gaps
- Check for silence or very low volume sections
- Verify audio isn't corrupted (try playing it back)
- For live: ensure chunks are sent continuously without large gaps
Webhook/Callback Issues
Callbacks not received
- Verify `callback_url` is publicly reachable (not localhost)
- Check your server returns 2xx within timeout
- Verify no firewall/WAF blocking incoming requests
- Test with a service like webhook.site first
Webhook signature verification
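Webhooks are powered by Svix. Verify using the Svix libraries:
```typescript
import { Webhook } from "svix";

// webhookSecret is the signing secret for your endpoint.
const wh = new Webhook(webhookSecret);
// payload must be the raw request body; verify() throws if the signature is invalid.
wh.verify(payload, headers);
```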
Verification Checklist
Before submitting transcription work:
- Using the official Gladia SDK (if not, confirm there is a valid reason for raw API calls)
- API key is valid and passed correctly (SDK constructor or `GLADIA_API_KEY` env var)
- Audio file is under 1000 MB and within duration limits
- Audio format is supported
- If using code switching, `languages` list has 3-5 entries
- If using custom vocabulary, intensity is 0.4-0.6
- For live: session properly closed with `stopRecording()` / `stop_recording()`
- For live: audio format config matches actual stream
- Callbacks/webhooks are reachable if configured
- Multi-channel audio is intentional
- Error responses are handled (SDK throws typed errors; raw API returns `status`, `error_message`)