azure-ai-transcription-py

Azure AI Transcription SDK for Python

Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.

Installation

bash

pip install azure-ai-transcription

Environment Variables

bash

TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>

Authentication

Use subscription key authentication (DefaultAzureCredential is not supported for this client):

python

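import os
from azure.ai.transcription import TranscriptionClient
from azure.core.credentials import AzureKeyCredential

# Wrap the subscription key in AzureKeyCredential instead of passing the raw string
client = TranscriptionClient(
    endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["TRANSCRIPTION_KEY"])
)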
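
Reading the two variables directly from os.environ fails with a bare KeyError when one is unset. The following is a minimal sketch of a pre-flight check; the helper name load_transcription_settings and the https:// check are assumptions made for illustration and are not part of the SDK.

```python
import os

# Variable names taken from the Environment Variables section above.
REQUIRED_VARS = ("TRANSCRIPTION_ENDPOINT", "TRANSCRIPTION_KEY")


def load_transcription_settings(env=None):
    """Return (endpoint, key), raising a descriptive error if either is missing.

    Hypothetical helper: it only validates configuration and never calls the service.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Normalize a trailing slash so the endpoint has one canonical form.
    endpoint = env["TRANSCRIPTION_ENDPOINT"].rstrip("/")
    if not endpoint.startswith("https://"):
        raise ValueError("TRANSCRIPTION_ENDPOINT must be an https:// URL")
    return endpoint, env["TRANSCRIPTION_KEY"]
```

The returned values can then be passed to the client constructor, so a configuration problem surfaces before any network call is attempted.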

Transcription (Batch)

python

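# Submit a batch job for audio hosted at a URL; diarization separates speakers
job = client.begin_transcription(
    name="meeting-transcription",
    locale="en-US",
    content_urls=["https://<storage>/audio.wav"],
    diarization_enabled=True
)
# Block until the job finishes, then inspect its final status
result = job.result()
print(result.status)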

Transcription (Real-time)

python

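# Open a streaming session for the given locale
stream = client.begin_stream_transcription(locale="en-US")
# Send the audio, then print each recognition event as it arrives
stream.send_audio_file("audio.wav")
for event in stream:
    print(event.text)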
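
A live audio source can produce chunks faster than they can be uploaded. One common way to apply backpressure is a bounded queue between the producer and the sender, sketched below with the standard library only. The pump_audio helper and its send callback are hypothetical; the source only shows send_audio_file, so a per-chunk send method on the stream is an assumption.

```python
import queue
import threading


def pump_audio(chunks, send, max_pending=8):
    """Pass chunks to send() through a bounded queue and return how many were sent.

    Hypothetical helper: `send` stands in for whatever per-chunk upload call
    the streaming session exposes.
    """
    pending = queue.Queue(maxsize=max_pending)
    errors = []
    sent = 0

    def consumer():
        nonlocal sent
        while True:
            chunk = pending.get()
            if chunk is None:  # sentinel: producer is done
                return
            if errors:
                continue  # keep draining so the producer never blocks forever
            try:
                send(chunk)
                sent += 1
            except Exception as exc:  # remember the first failure
                errors.append(exc)

    worker = threading.Thread(target=consumer)
    worker.start()
    try:
        for chunk in chunks:
            # put() blocks while the queue is full, which slows the producer
            # down to the pace of the sender.
            pending.put(chunk)
    finally:
        pending.put(None)
        worker.join()

    if errors:
        raise errors[0]
    return sent
```

A small max_pending keeps memory use bounded at the cost of stalling the producer sooner; the right value depends on chunk size and network latency.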

Best Practices

Enable diarization when multiple speakers are present
Use batch transcription for long files stored in blob storage
Capture timestamps for subtitle generation
Specify the locale (for example, en-US) to improve recognition accuracy
Handle streaming backpressure for real-time transcription
Close transcription sessions when complete
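
To illustrate the timestamp practice above, the sketch below converts timed segments into SubRip (SRT) subtitle text. The input shape, a list of (start_seconds, end_seconds, text) tuples, is an assumption; the fields actually returned by the service would need to be mapped into that shape first.

```python
def format_srt_time(seconds):
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as SRT subtitle text.

    The tuple layout is a hypothetical intermediate format, not an SDK type.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)
```

For example, to_srt([(0.0, 1.25, "Hello")]) yields a single entry numbered 1 that spans 00:00:00,000 to 00:00:01,250.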