Troubleshooting

Common issues, gotchas, and their solutions when working with the Gladia API.
SDK-first diagnostics: first verify the user is on the official SDK — many issues (polling, reconnection, retries) are solved automatically. See sdk-integration for setup and policy.

When to Use

  • User encounters errors (401, 403, 429, invalid format, timeout) when calling the Gladia API
  • Unexpected behavior: poor transcription quality, missing words, wrong language
  • WebSocket disconnections, polling failures, or session hangs
  • Billing confusion (multi-channel, concurrency limits, plan restrictions)
  • Verifying that an integration is correctly configured before going to production
When NOT to use: For initial SDK setup and configuration, use sdk-integration. For feature-specific guidance (options, parameters, response structure), use pre-recorded-transcription or live-transcription.

References

Consult these resources as needed:
  • ../sdk-integration/SKILL.md -- SDK setup, client config (retry, timeouts), and SDK vs raw API decision guide
  • ../sdk-integration/references/sdk-versions.md -- Current SDK versions (auto-synced by CI)
  • ../pre-recorded-transcription/SKILL.md -- Pre-recorded config options and limits
  • ../live-transcription/SKILL.md -- Live session config and WebSocket event handling

Authentication Errors

401 Unauthorized

  • Cause: Invalid or missing API key
  • Fix (SDK): Verify the key is passed via the apiKey (JS) / api_key (Python) constructor option, or set the GLADIA_API_KEY environment variable
  • Fix (raw REST): Verify the key is passed in the x-gladia-key header
  • Check: Go to app.gladia.io → API keys → verify the key is active
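For raw REST calls, the key lookup and header construction described above can be sketched as follows. The helper names and the env parameter are illustrative and not part of any SDK; only the x-gladia-key header name and the GLADIA_API_KEY variable come from the notes above.

```typescript
// Hypothetical helper: prefer an explicitly passed key, then fall back to a
// GLADIA_API_KEY entry in the supplied environment map.
function resolveApiKey(
  explicitKey: string | undefined,
  env: Record<string, string | undefined>,
): string {
  const key = (explicitKey || env["GLADIA_API_KEY"] || "").trim();
  if (key === "") {
    // Failing early gives a clearer message than a 401 from the API.
    throw new Error("Missing Gladia API key: pass it explicitly or set GLADIA_API_KEY");
  }
  return key;
}

// Build the headers for a raw REST request (illustrative helper).
function gladiaHeaders(apiKey: string): Record<string, string> {
  return { "x-gladia-key": apiKey, "Content-Type": "application/json" };
}
```

In Node, the environment map would typically be process.env, as in resolveApiKey(undefined, process.env).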

403 Forbidden

  • Cause: Key doesn't have access to the requested feature or region
  • Fix: Check your plan tier; some features are plan-restricted

Rate Limiting and Concurrency

429 Too Many Requests

Concurrency limits by plan:
| Plan | Pre-recorded | Live | Notes |
|---|---|---|---|
| Free | 3 | 1 | 10 hrs/month total |
| Starter | 25 | 30 | |
| Growth | 25 | 30 | |
| Enterprise | Unlimited | Unlimited | |
Fix: Wait for in-progress jobs to complete before starting new ones, or upgrade your plan.
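One way to "wait for in-progress jobs" on the client side is to cap the number of jobs in flight at your plan's limit. The class below is a minimal, generic sketch; ConcurrencyLimiter is a hypothetical name and nothing here calls the Gladia API.

```typescript
// Minimal promise-based limiter: at most `limit` tasks run at the same time.
class ConcurrencyLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new Error("limit must be a positive integer");
    }
    this.limit = limit;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // Park until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // the slot is transferred, so `active` stays unchanged
      } else {
        this.active--;
      }
    }
  }
}
```

On the Free plan, for example, new ConcurrencyLimiter(3) would wrap each pre-recorded job submission so that a fourth job only starts once one of the first three has finished.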

Common Gotchas

1. Code switching without language list

Problem: Enabling code_switching: true with an empty languages array causes evaluation across 100+ languages and frequent misdetections.
Fix: Always provide 3-5 expected languages:
json
{
  "language_config": {
    "languages": ["en", "fr", "es"],
    "code_switching": true
  }
}
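A small pre-flight check can catch this before a request is sent. The sketch below mirrors the language_config shape shown above; checkLanguageConfig is a hypothetical helper, and the 3-5 range is taken from the guidance in this section rather than from any API-enforced rule.

```typescript
interface LanguageConfig {
  languages?: string[];
  code_switching?: boolean;
}

// Returns human-readable warnings; an empty list means nothing looked wrong.
function checkLanguageConfig(config: LanguageConfig): string[] {
  const warnings: string[] = [];
  if (!config.code_switching) return warnings;
  const count = config.languages?.length ?? 0;
  if (count === 0) {
    warnings.push("code_switching is on but languages is empty");
  } else if (count < 3 || count > 5) {
    warnings.push(`code_switching is on with ${count} language(s); 3-5 is recommended`);
  }
  return warnings;
}
```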

2. Custom vocabulary intensity too high

Problem: intensity values above 0.6 cause false positives where unrelated words get replaced by vocabulary entries.
Fix: Keep intensity at 0.4-0.6. Use pronunciations for better recognition instead of raising intensity:
json
{
  "vocabulary": [
    { "value": "Gladia", "pronunciations": ["gla-dee-ah"], "intensity": 0.5 }
  ]
}
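To audit an existing vocabulary list against the 0.4-0.6 guidance, a filter like the following may help. The entry shape copies the example above; riskyVocabulary is a hypothetical helper and the bounds are the recommended range from this section, not API limits.

```typescript
interface VocabularyEntry {
  value: string;
  pronunciations?: string[];
  intensity?: number;
}

// Returns the entries whose intensity falls outside the recommended range.
function riskyVocabulary(
  entries: VocabularyEntry[],
  min = 0.4,
  max = 0.6,
): VocabularyEntry[] {
  return entries.filter(
    (e) => e.intensity !== undefined && (e.intensity < min || e.intensity > max),
  );
}
```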

3. Audio exceeding duration limits silently

Problem: Pre-recorded files over 135 minutes may fail without a clear error message.
Fix: Split long audio into chunks of ~60 minutes before uploading. For enterprise (4h15 limit), contact support.
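The chunk boundaries for a long recording can be planned with simple arithmetic before handing them to whatever audio tool does the cutting. This is a sketch only: planChunks is a hypothetical helper, it splits at fixed time offsets (which can cut through a word), and the 60-minute default is the target suggested above.

```typescript
interface Chunk {
  startSec: number;
  endSec: number;
}

// Split a recording into equal chunks no longer than maxChunkSec each.
function planChunks(totalSec: number, maxChunkSec = 60 * 60): Chunk[] {
  if (!(totalSec > 0) || !(maxChunkSec > 0)) {
    throw new Error("durations must be positive");
  }
  const count = Math.ceil(totalSec / maxChunkSec);
  const size = totalSec / count; // even split avoids a tiny trailing chunk
  const chunks: Chunk[] = [];
  for (let i = 0; i < count; i++) {
    const endSec = i === count - 1 ? totalSec : (i + 1) * size;
    chunks.push({ startSec: i * size, endSec });
  }
  return chunks;
}
```

For a 150-minute file, planChunks(150 * 60) yields three 50-minute chunks, each under both the 60-minute target and the 135-minute limit.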

4. Multi-channel billing surprise

Problem: Sending 2-channel (stereo) audio is billed as 2x the duration.
Fix: Merge to mono if you don't need per-channel speaker identification. Only use multi-channel intentionally for distinct audio sources.
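The billing effect is easy to estimate up front. The sketch below assumes billing scales linearly with channel count, which generalizes the "2 channels = 2x duration" rule above and may not hold for every plan; billedSeconds is a hypothetical helper.

```typescript
// Assumption: billed duration = audio duration x number of channels.
function billedSeconds(durationSec: number, channels: number): number {
  if (durationSec < 0 || !Number.isInteger(channels) || channels < 1) {
    throw new Error("invalid duration or channel count");
  }
  return durationSec * channels;
}
```

Under that assumption a 30-minute stereo file, billedSeconds(1800, 2), is billed as 3600 seconds, while the same file merged to mono is billed as 1800.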

5. WebSocket disconnection without recovery

Problem: If the WebSocket drops, creating a new session loses context.
Fix (SDK, recommended): The SDK handles reconnection automatically with configurable wsRetry. No action needed if using the SDK.
Fix (raw WebSocket): Reconnect to the same WebSocket URL to resume the session. Do NOT call /v2/live again.

6. Polling without backoff

Problem: Rapidly polling /v2/pre-recorded/:id wastes requests and may trigger rate limits.
Fix (SDK, recommended): The SDK handles polling automatically. Use transcribe() which includes built-in backoff, or configure poll() directly:
typescript
const result = await client.preRecorded().transcribe(audio, options, {
  interval: 5000, // 5 seconds between polls
});
Fix (raw REST): Implement exponential backoff (start at 3s, max 30s), or use webhooks/callbacks instead.
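For the raw REST path, the backoff schedule (start at 3s, cap at 30s) can be sketched as below. Both functions are hypothetical helpers: check stands in for whatever request fetches the job status, and sleep is injected so the loop can be tested without real timers.

```typescript
// Delays double from startMs until they reach maxMs, then stay there.
function backoffDelaysMs(
  attempts: number,
  startMs = 3000,
  maxMs = 30000,
  factor = 2,
): number[] {
  const delays: number[] = [];
  let delay = startMs;
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(delay, maxMs));
    delay = Math.min(delay * factor, maxMs);
  }
  return delays;
}

// Poll until `check` returns a value, sleeping with backoff between tries.
async function pollWithBackoff<T>(
  check: () => Promise<T | undefined>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 20,
): Promise<T> {
  for (const delay of backoffDelaysMs(maxAttempts)) {
    const result = await check();
    if (result !== undefined) return result;
    await sleep(delay);
  }
  throw new Error(`Job not finished after ${maxAttempts} polls`);
}
```

With the defaults, backoffDelaysMs(6) gives 3000, 6000, 12000, 24000, 30000, 30000 milliseconds.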

7. Forgetting to stop recording

Problem: Leaving a WebSocket open without sending stop_recording keeps the session hanging until the 3-hour timeout.
Fix: Always explicitly call session.stopRecording() (or session.stop_recording() in Python) when done. Implement cleanup in error handlers.
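A try/finally wrapper is one way to make sure the stop call runs even when the streaming code throws. The StoppableSession interface below is an assumption that models only the stopRecording() method named above; the real SDK session type has more to it, and withLiveSession is a hypothetical helper.

```typescript
// Assumed minimal shape: only the method this section relies on.
interface StoppableSession {
  stopRecording(): void | Promise<void>;
}

// Run `work` against the session and always stop recording afterwards.
async function withLiveSession<S extends StoppableSession, T>(
  session: S,
  work: (session: S) => Promise<T>,
): Promise<T> {
  try {
    return await work(session);
  } finally {
    // Runs on success and on error, so the session never lingers.
    await session.stopRecording();
  }
}
```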

8. Partial transcripts not appearing

Problem: Real-time results come only as final transcripts by default.
Fix: Enable partial transcripts in session config:
json
{
  "messages_config": {
    "receive_partial_transcripts": true
  }
}

9. Expecting diarization in live mode

Problem: Speaker diarization is only available for pre-recorded transcription.
Fix: For live multi-speaker scenarios, use multi-channel audio with one speaker per channel and track by channel number.

10. PII redaction in live mode

Problem: pii_redaction: true is silently ignored in live transcription.
Fix: PII redaction only works for pre-recorded. For live compliance needs, implement client-side redaction on the transcript text.
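As a starting point for client-side redaction, the sketch below masks email addresses and phone-like digit runs in transcript text. The patterns are deliberately simple illustrations, not a compliance-grade detector: they will miss names, addresses, and many number formats, and redactPii is a hypothetical helper.

```typescript
// Illustrative patterns only; real PII detection needs far broader coverage.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

// Replace matches with fixed placeholders before storing or displaying text.
function redactPii(text: string): string {
  return text.replace(EMAIL_PATTERN, "[EMAIL]").replace(PHONE_PATTERN, "[PHONE]");
}
```

Applied to each final transcript as it arrives, this keeps the raw values out of logs and UI, although the unredacted audio and text have still been processed upstream.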

Audio Format Issues

"Invalid audio format" error

  • Verify encoding, sample_rate, bit_depth, channels match your actual audio stream
  • Common mismatch: sending MP3 while declaring wav/pcm
  • For pre-recorded: format is auto-detected from the file; this error is live-specific
  • For supported formats and size limits, see pre-recorded-transcription and live-transcription
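The MP3-declared-as-WAV mismatch can often be spotted by inspecting the first bytes of the stream. The sketch below checks the standard RIFF/WAVE header, the ID3 tag marker, and the MPEG audio frame sync bits; sniffContainer is a hypothetical helper, and headerless raw PCM will legitimately come back as "unknown".

```typescript
type Container = "wav" | "mp3" | "unknown";

// True if the ASCII `text` appears in `bytes` starting at `offset`.
function hasAscii(bytes: Uint8Array, offset: number, text: string): boolean {
  if (bytes.length < offset + text.length) return false;
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false;
  }
  return true;
}

// Guess the container from the leading bytes of a file or first chunk.
function sniffContainer(bytes: Uint8Array): Container {
  if (hasAscii(bytes, 0, "RIFF") && hasAscii(bytes, 8, "WAVE")) return "wav";
  if (hasAscii(bytes, 0, "ID3")) return "mp3"; // ID3v2 tag precedes the audio
  // MPEG audio frames start with 11 set bits (0xFF, then top 3 bits set).
  if (bytes.length >= 2 && bytes[0] === 0xff && (bytes[1] & 0xe0) === 0xe0) {
    return "mp3";
  }
  return "unknown";
}
```

If the session config declares wav/pcm and the first chunk sniffs as "mp3", either transcode the audio or correct the declared format before streaming.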

Transcription Quality Issues

Poor accuracy

  1. Check audio quality: Background noise, low volume, or heavy compression degrades results
  2. Enable audio enhancer: pre_processing.audio_enhancer: true (live)
  3. Specify languages: Always provide expected languages rather than relying on auto-detection
  4. Use custom vocabulary: For domain-specific terms (medical, legal, technical)
  5. Check sample rate: Higher sample rates (16kHz+) give better results than 8kHz

Wrong language detected

  1. Provide explicit languages list
  2. If multi-language, enable code_switching with 3-5 expected languages
  3. For single-language content, specify exactly one language and disable code switching

Missing words or gaps

  1. Check for silence or very low volume sections
  2. Verify audio isn't corrupted (try playing it back)
  3. For live: ensure chunks are sent continuously without large gaps

Webhook/Callback Issues

Callbacks not received

  1. Verify callback_url is publicly reachable (not localhost)
  2. Check that your server returns a 2xx status within the timeout
  3. Verify that no firewall/WAF is blocking incoming requests
  4. Test with a service like webhook.site first
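The first item in the list above can be partly automated with a static check on the URL itself. The sketch below flags localhost, loopback, and private IPv4 targets; callbackUrlProblem is a hypothetical helper, and it cannot detect firewalls, DNS problems, or slow handlers, which still need a real test request.

```typescript
// Returns a description of the problem, or null if the URL looks public.
function callbackUrlProblem(raw: string): string | null {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return "not a valid absolute URL";
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return "scheme must be http or https";
  }
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost") || host === "[::1]") {
    return "loopback host is not reachable from outside";
  }
  // Only apply the private-range test to literal IPv4 addresses.
  const isIpv4 = /^\d{1,3}(\.\d{1,3}){3}$/.test(host);
  const privateV4 = /^(0\.|10\.|127\.|169\.254\.|192\.168\.|172\.(1[6-9]|2\d|3[01])\.)/;
  if (isIpv4 && privateV4.test(host)) {
    return "private or loopback IPv4 address";
  }
  return null;
}
```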

Webhook signature verification

Webhooks are powered by Svix. Verify using the Svix libraries:
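typescript
import { Webhook } from "svix";
// payload must be the raw, unparsed request body; headers must carry
// svix-id, svix-timestamp, and svix-signature.
const wh = new Webhook(webhookSecret);
wh.verify(payload, headers); // throws if the signature does not match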

Verification Checklist

Before submitting transcription work:
  • Using the official Gladia SDK (if not, confirm there is a valid reason for raw API calls)
  • API key is valid and passed correctly (SDK constructor or GLADIA_API_KEY env var)
  • Audio file is under 1000 MB and within duration limits
  • Audio format is supported
  • If using code switching, languages list has 3-5 entries
  • If using custom vocabulary, intensity is 0.4-0.6
  • For live: session properly closed with stopRecording() / stop_recording()
  • For live: audio format config matches actual stream
  • Callbacks/webhooks are reachable if configured
  • Multi-channel audio is intentional
  • Error responses are handled (SDK throws typed errors; raw API returns status, error_message)

Support Resources
