deepgram-prod-checklist
Deepgram Production Checklist
Overview
Comprehensive checklist for deploying Deepgram integrations to production.
Pre-Deployment Checklist
API Configuration
- Production API key created and stored securely
- API key has appropriate scopes (minimal permissions)
- Key expiration set (recommended: 90 days)
- Fallback/backup key available
- Rate limits understood and planned for
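The key-expiration item can be backed by a small check that a scheduled job or health probe calls. The following is a minimal sketch, assuming the key's creation date is recorded alongside it in the secret manager; `expiryFromCreation`, `daysUntilExpiry`, `needsRotation`, and the 14-day warning window are hypothetical and not part of the Deepgram SDK.

```typescript
// Hypothetical helpers for tracking key age; none of these names come from the Deepgram SDK.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Expiry date for a key created at `createdAt`, using the recommended 90-day lifetime by default.
export function expiryFromCreation(createdAt: Date, ttlDays = 90): Date {
  return new Date(createdAt.getTime() + ttlDays * MS_PER_DAY);
}

// Whole days remaining until the key expires (negative once it has expired).
export function daysUntilExpiry(expiresAt: Date, now: Date = new Date()): number {
  return Math.floor((expiresAt.getTime() - now.getTime()) / MS_PER_DAY);
}

// True when the key is inside the warning window and should be rotated.
// The 14-day default is an assumption; pick a window that matches your rotation process.
export function needsRotation(expiresAt: Date, now: Date = new Date(), warnDays = 14): boolean {
  return daysUntilExpiry(expiresAt, now) <= warnDays;
}
```

A check like this could feed the key-monitoring alert that the runbook's "API Key Expiring" issue refers to.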
Error Handling
- All API errors caught and logged
- Retry logic implemented with exponential backoff
- Circuit breaker pattern in place
- Fallback behavior defined for API failures
- User-friendly error messages configured
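The retry item above can be sketched as a small wrapper around any async call. This is a minimal illustration under stated assumptions: `RetryOptions`, `backoffDelayMs`, and `withRetry` are hypothetical names, the delay doubles on each attempt up to a cap, and no jitter is applied (production code commonly adds some).

```typescript
// Sketch only: RetryOptions, backoffDelayMs and withRetry are illustrative names.
export interface RetryOptions {
  retries: number;      // maximum number of retries after the first attempt
  baseDelayMs: number;  // delay before the first retry
  maxDelayMs: number;   // upper bound on any single delay
  sleep?: (ms: number) => Promise<void>;    // injectable so tests do not wait
  isRetryable?: (err: unknown) => boolean;  // defaults to retrying every error
}

// Delay for a zero-based attempt number: base * 2^attempt, capped at maxDelayMs.
export function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

const defaultSleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs fn, retrying failed attempts with exponentially growing delays.
export async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  const isRetryable = opts.isRetryable ?? (() => true);
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up once the retry budget is spent or the error is not worth retrying.
      if (attempt >= opts.retries || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs));
    }
  }
}
```

With `retries: 3` and `baseDelayMs: 500`, the waits between attempts are 500 ms, 1000 ms, and 2000 ms. An `isRetryable` predicate that rejects authentication failures keeps the wrapper from retrying errors that cannot succeed.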
Performance
- Connection pooling configured
- Request timeouts set appropriately
- Concurrent request limits configured
- Audio preprocessing optimized
- Response caching implemented where applicable
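One way to enforce the concurrent request limit is an in-process limiter that queues work once a fixed number of calls are in flight. The sketch below is an assumption-laden illustration, not part of the Deepgram SDK: `createLimiter` is a hypothetical helper, and the right limit depends on your Deepgram plan.

```typescript
// Hypothetical in-process limiter; not part of the Deepgram SDK.
export function createLimiter(maxConcurrent: number) {
  let active = 0;                       // slots currently in use
  const waiters: Array<() => void> = []; // callers waiting for a slot, in FIFO order

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= maxConcurrent) {
      // Wait until a finishing task hands its slot over to us.
      await new Promise<void>((resolve) => waiters.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      const next = waiters.shift();
      if (next) {
        next(); // transfer the slot directly, so `active` stays unchanged
      } else {
        active--;
      }
    }
  };
}
```

Handing the slot directly to the next waiter, rather than decrementing and re-incrementing, prevents a new caller from slipping in between and pushing the in-flight count past the limit. A caller would wrap each transcription call, for example `limit(() => transcribeProduction(url))`.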
Security
- API keys stored in a secret manager (not hardcoded in source or in committed environment files)
- HTTPS enforced for all requests
- Input validation on audio URLs
- Sensitive data redaction configured
- Audit logging enabled
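The HTTPS and URL-validation items can be combined into a single check that runs before a URL is submitted for transcription. This is a minimal sketch: `validateAudioUrl`, the `UrlCheck` type, and the host allowlist are assumptions introduced for illustration, and the allowlist in particular may be stricter than your use case needs.

```typescript
// Hypothetical validation helper; the allowlist approach is an assumption, not a Deepgram requirement.
export type UrlCheck = { ok: true; url: URL } | { ok: false; reason: string };

export function validateAudioUrl(raw: string, allowedHosts: ReadonlySet<string>): UrlCheck {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: 'not a valid URL' };
  }
  // Enforce HTTPS so audio is never fetched over plaintext.
  if (url.protocol !== 'https:') {
    return { ok: false, reason: 'only https is allowed' };
  }
  // Reject credentials embedded in the URL; they would otherwise leak into logs.
  if (url.username || url.password) {
    return { ok: false, reason: 'embedded credentials are not allowed' };
  }
  // The URL parser lowercases hostnames, so allowlist entries should be lowercase.
  if (!allowedHosts.has(url.hostname)) {
    return { ok: false, reason: 'host is not on the allowlist' };
  }
  return { ok: true, url };
}
```

Returning a reason string rather than throwing makes it straightforward to log rejections and to map them to user-friendly error messages.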
Monitoring
- Health check endpoint implemented
- Metrics collection configured
- Alerting rules defined
- Dashboard created
- Log aggregation set up
Documentation
- API integration documented
- Runbooks created
- On-call procedures defined
- Escalation path established
Production Configuration
TypeScript Production Client
typescript
// lib/deepgram-production.ts
import { randomUUID } from 'node:crypto';
import { createClient, DeepgramClient } from '@deepgram/sdk';
import { getSecret } from './secrets';
import { metrics } from './metrics';
import { logger } from './logger';

interface ProductionConfig {
  timeout: number; // request timeout in milliseconds
  retries: number;
  model: string;
}

const config: ProductionConfig = {
  timeout: 30000,
  retries: 3,
  model: 'nova-2',
};

// Lazily created singleton so the secret is fetched only once per process.
let client: DeepgramClient | null = null;

export async function getProductionClient(): Promise<DeepgramClient> {
  if (client) return client;

  const apiKey = await getSecret('DEEPGRAM_API_KEY');
  client = createClient(apiKey, {
    global: {
      fetch: {
        options: {
          timeout: config.timeout,
        },
      },
    },
  });
  return client;
}

export async function transcribeProduction(
  audioUrl: string,
  options: { language?: string; callback?: string } = {}
) {
  const startTime = Date.now();
  const requestId = randomUUID();

  // Log only the sanitized URL so query-string tokens never reach the logs.
  logger.info('Starting transcription', { requestId, audioUrl: sanitize(audioUrl) });

  try {
    const deepgram = await getProductionClient();
    const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
      { url: audioUrl },
      {
        model: config.model,
        language: options.language || 'en',
        smart_format: true,
        punctuate: true,
        callback: options.callback,
      }
    );
    const duration = Date.now() - startTime;

    if (error) {
      // The throw is handled by the catch block below, which records the error metric.
      logger.error('Transcription failed', { requestId, error: error.message });
      throw new Error(error.message);
    }

    // Uses the recordTranscription helper exported by lib/metrics.ts.
    metrics.recordTranscription('success', duration);
    logger.info('Transcription complete', {
      requestId,
      deepgramRequestId: result.metadata?.request_id,
      duration,
    });
    return result;
  } catch (err) {
    metrics.recordTranscription('error', Date.now() - startTime);
    logger.error('Transcription exception', {
      requestId,
      error: err instanceof Error ? err.message : 'Unknown error',
    });
    throw err;
  }
}

// Strip the query string and fragment, which may carry signed-URL tokens.
function sanitize(url: string): string {
  try {
    const parsed = new URL(url);
    return `${parsed.protocol}//${parsed.host}${parsed.pathname}`;
  } catch {
    return '[invalid-url]';
  }
}
Health Check Endpoint
typescript
// routes/health.ts
import { getProductionClient } from '../lib/deepgram-production';

interface HealthStatus {
  status: 'healthy' | 'degraded' | 'unhealthy';
  timestamp: string;
  checks: {
    deepgram: {
      status: 'pass' | 'fail';
      latency?: number;
      message?: string;
    };
  };
}

export async function healthCheck(): Promise<HealthStatus> {
  const checks: HealthStatus['checks'] = {
    deepgram: { status: 'fail' },
  };

  // Test Deepgram API
  const startTime = Date.now();
  try {
    const client = await getProductionClient();
    const { error } = await client.manage.getProjects();
    checks.deepgram = {
      status: error ? 'fail' : 'pass',
      latency: Date.now() - startTime,
      message: error?.message,
    };
  } catch (err) {
    checks.deepgram = {
      status: 'fail',
      latency: Date.now() - startTime,
      message: err instanceof Error ? err.message : 'Unknown error',
    };
  }

  const allPassing = Object.values(checks).every(c => c.status === 'pass');
  const anyFailing = Object.values(checks).some(c => c.status === 'fail');

  return {
    status: allPassing ? 'healthy' : anyFailing ? 'unhealthy' : 'degraded',
    timestamp: new Date().toISOString(),
    checks,
  };
}
Production Metrics
typescript
// lib/metrics.ts
import { Counter, Histogram, Registry } from 'prom-client';

export const registry = new Registry();

export const transcriptionDuration = new Histogram({
  name: 'deepgram_transcription_duration_seconds',
  help: 'Duration of Deepgram transcription requests',
  labelNames: ['status', 'model'],
  buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60],
  registers: [registry],
});

export const transcriptionTotal = new Counter({
  name: 'deepgram_transcription_total',
  help: 'Total number of transcription requests',
  labelNames: ['status', 'error_code'],
  registers: [registry],
});

export const audioProcessedSeconds = new Counter({
  name: 'deepgram_audio_processed_seconds_total',
  help: 'Total seconds of audio processed',
  registers: [registry],
});

export const rateLimitHits = new Counter({
  name: 'deepgram_rate_limit_hits_total',
  help: 'Number of rate limit errors encountered',
  registers: [registry],
});

export const metrics = {
  // duration is in milliseconds; the histogram is recorded in seconds.
  recordTranscription(status: 'success' | 'error', duration: number, audioSeconds?: number) {
    transcriptionDuration.labels(status, 'nova-2').observe(duration / 1000);
    transcriptionTotal.labels(status, '').inc();
    if (audioSeconds) {
      audioProcessedSeconds.inc(audioSeconds);
    }
  },
  recordRateLimitHit() {
    rateLimitHits.inc();
  },
};
Alerting Configuration
yaml
# prometheus/alerts/deepgram.yml
groups:
  - name: deepgram
    rules:
      - alert: DeepgramHighErrorRate
        expr: |
          sum(rate(deepgram_transcription_total{status="error"}[5m]))
          / sum(rate(deepgram_transcription_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: High Deepgram error rate
          description: Error rate is above 5% for the last 5 minutes

      - alert: DeepgramHighLatency
        expr: |
          histogram_quantile(0.95,
            sum(rate(deepgram_transcription_duration_seconds_bucket[5m])) by (le)
          ) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: High Deepgram latency
          description: P95 latency is above 10 seconds

      - alert: DeepgramRateLimiting
        expr: increase(deepgram_rate_limit_hits_total[1h]) > 10
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: Deepgram rate limiting detected
          description: More than 10 rate limit hits in the last hour

      - alert: DeepgramDown
        expr: up{job="deepgram-health"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: Deepgram health check failing
          description: Health check has been failing for 2 minutes
Runbook Template
markdown
Deepgram Incident Runbook
Quick Reference
快速参考
- Deepgram Status Page: https://status.deepgram.com
- Console: https://console.deepgram.com
- Support: support@deepgram.com
Common Issues
Issue: High Error Rate
Symptoms: Error rate > 5%
Steps:
- Check Deepgram status page
- Review error logs for specific error codes
- If 429 errors: check rate limit configuration
- If 401 errors: verify API key validity
- If 500 errors: escalate to Deepgram support
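The triage steps above can be encoded so that log summaries point directly at the matching runbook action. The sketch below is illustrative only: `RunbookAction`, `triageStatus`, and `summarizeActions` are hypothetical names, and treating every 5xx status like a 500 is an assumption that goes beyond what the runbook states.

```typescript
// Hypothetical triage helper mirroring the runbook steps; not part of the Deepgram SDK.
export type RunbookAction =
  | 'check-rate-limits'     // 429
  | 'verify-api-key'        // 401
  | 'escalate-to-deepgram'  // 500 (assumption: any 5xx is treated the same way)
  | 'review-logs';          // anything else

export function triageStatus(status: number): RunbookAction {
  if (status === 429) return 'check-rate-limits';
  if (status === 401) return 'verify-api-key';
  if (status >= 500 && status <= 599) return 'escalate-to-deepgram';
  return 'review-logs';
}

// Count how many recent errors map to each action, so the dominant cause is obvious.
export function summarizeActions(statuses: number[]): Map<RunbookAction, number> {
  const counts = new Map<RunbookAction, number>();
  for (const status of statuses) {
    const action = triageStatus(status);
    counts.set(action, (counts.get(action) ?? 0) + 1);
  }
  return counts;
}
```

Feeding the status codes from the last few minutes of error logs into `summarizeActions` gives the on-call engineer a quick view of which runbook step to start with.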
Issue: High Latency
Symptoms: P95 > 10 seconds
Steps:
- Check audio file sizes (large files = longer processing)
- Review concurrent request count
- Check network latency to Deepgram
- Consider using callback URLs for large files
Issue: API Key Expiring
Symptoms: Alert from key monitoring
Steps:
- Generate new API key in Console
- Update secret manager
- Verify new key works
- Schedule deletion of old key (24h grace period)
Go-Live Checklist
markdown
Pre-Launch (D-7)
- Load testing completed
- Security review passed
- Documentation finalized
- Team trained on runbooks
Launch Day (D-0)
- Final smoke test passed
- Monitoring dashboards open
- On-call rotation confirmed
- Rollback plan ready
Post-Launch (D+1)
- No critical alerts
- Error rate within SLA
- Performance metrics acceptable
- Customer feedback collected
Resources
Next Steps
Proceed to deepgram-upgrade-migration for SDK upgrade guidance.