azure-ai-contentunderstanding-py
Azure AI Content Understanding SDK for Python
Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.
Installation
```bash
pip install azure-ai-contentunderstanding
```
Environment Variables
```bash
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
```
Authentication
```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
```
Core Workflow
Content Understanding operations are asynchronous long-running operations:
- Begin Analysis: start the analysis operation with `begin_analyze()` (returns a poller)
- Poll for Results: poll until analysis completes (the SDK handles this with `.result()`)
- Process Results: extract structured results from `AnalyzeResult.contents`
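The three steps above block inside `.result()`. When progress reporting is wanted during a long analysis, the polling step can be made explicit. The helper below is a minimal sketch, not part of the SDK: `wait_with_progress` is a hypothetical name, and it assumes the poller exposes the `done()`, `wait(timeout)`, `status()`, and `result()` methods that azure-core's `LROPoller` provides.

```python
import time


def wait_with_progress(poller, interval=5.0, on_tick=print):
    """Block until `poller` finishes, reporting its status every `interval` seconds.

    `poller` is assumed to behave like azure.core.polling.LROPoller.
    """
    started = time.monotonic()
    while not poller.done():
        # wait() returns after `interval` seconds or as soon as the operation ends
        poller.wait(interval)
        elapsed = time.monotonic() - started
        on_tick(f"status={poller.status()} elapsed={elapsed:.0f}s")
    return poller.result()
```

Usage would look like `result = wait_with_progress(client.begin_analyze(...))`, with `on_tick` swapped for a logger call where printing is not appropriate.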
Prebuilt Analyzers
| Analyzer | Content Type | Purpose |
|---|---|---|
| `prebuilt-documentSearch` | Documents | Extract markdown for RAG applications |
| `prebuilt-imageSearch` | Images | Extract content from images |
| `prebuilt-audioSearch` | Audio | Transcribe audio with timing |
| `prebuilt-videoSearch` | Video | Extract frames, transcripts, summaries |
| `prebuilt-invoice` | Documents | Extract invoice fields |
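When inputs of mixed types arrive in one pipeline, the analyzer can be chosen from the file extension. This is a rough sketch only: `pick_analyzer` and the extension table are illustrative assumptions, not something the SDK provides, and the list of extensions is not a statement of which formats the service accepts.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to the prebuilt analyzer IDs used in this guide.
ANALYZER_BY_EXTENSION = {
    ".pdf": "prebuilt-documentSearch",
    ".docx": "prebuilt-documentSearch",
    ".jpg": "prebuilt-imageSearch",
    ".jpeg": "prebuilt-imageSearch",
    ".png": "prebuilt-imageSearch",
    ".mp3": "prebuilt-audioSearch",
    ".wav": "prebuilt-audioSearch",
    ".mp4": "prebuilt-videoSearch",
    ".mov": "prebuilt-videoSearch",
}


def pick_analyzer(url, default="prebuilt-documentSearch"):
    """Return an analyzer ID based on the extension in the URL path."""
    # urlparse strips the query string, so "a.jpg?x=1" still resolves to ".jpg"
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return ANALYZER_BY_EXTENSION.get(suffix, default)
```

The returned ID can be passed straight to `begin_analyze(analyzer_id=...)`. Unknown extensions fall back to the document analyzer, which is a design choice of this sketch rather than a service rule.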
Analyze Document
```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential()
)

# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)
result = poller.result()

# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)
```
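Because `contents` is a list, a RAG ingestion step may want the markdown of every item rather than only the first. The following is a small sketch under that assumption; `collect_markdown` is a hypothetical helper, and it relies only on the `contents` and `markdown` attributes shown above.

```python
def collect_markdown(result, separator="\n\n---\n\n"):
    """Join the markdown of every content item, skipping items with none."""
    parts = [
        content.markdown
        for content in result.contents
        if getattr(content, "markdown", None)  # tolerate missing or empty markdown
    ]
    return separator.join(parts)
```

Calling `collect_markdown(result)` yields a single string that can be chunked or indexed downstream.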
Access Document Content Details
```python
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent

content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)
```
Analyze Image
```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)
```
Analyze Video
```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)
result = poller.result()

# Access video content (AudioVisualContent)
content = result.contents[0]

# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")

# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")
```
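Raw timing values are hard to read in a transcript. The sketch below formats them as `HH:MM:SS.mmm`. It assumes the timing values are integer millisecond offsets, which this guide does not state, so check the unit against the SDK reference before relying on it. `format_timestamp` and `transcript_lines` are hypothetical helpers.

```python
def format_timestamp(ms):
    """Format a millisecond offset as HH:MM:SS.mmm (assumes `ms` is in milliseconds)."""
    total_seconds, millis = divmod(int(ms), 1000)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def transcript_lines(phrases):
    """Build one readable line per phrase from start_time, end_time, and text."""
    return [
        f"[{format_timestamp(p.start_time)} - {format_timestamp(p.end_time)}] {p.text}"
        for p in phrases
    ]
```

With the video result above, `"\n".join(transcript_lines(content.transcript_phrases))` would produce a plain-text transcript.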
Analyze Audio
```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)
result = poller.result()

# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")
```
Custom Analyzers
Create custom analyzers with field schemas for specialized extraction:
```python
from azure.ai.contentunderstanding.models import AnalyzeInput

# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        }
                    }
                }
            }
        }
    }
)

# Use custom analyzer
poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)
result = poller.result()

# Access extracted fields (results are a list, so read them from the first content item)
content = result.contents[0]
print(content.fields["vendor_name"])
print(content.fields["invoice_total"])
```
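Nested schema dictionaries like the one above are easy to mistype. One option is to generate them from a flatter description. The helper below is a sketch that reproduces exactly the dictionary shape used in this section; `build_field_schema` is a hypothetical name, and it makes no claim about other field options the service may support.

```python
def build_field_schema(scalars, arrays=None):
    """Build a field_schema dict.

    scalars: {field_name: type_name}
    arrays:  {field_name: {property_name: type_name}} for arrays of objects
    """
    fields = {name: {"type": type_name} for name, type_name in scalars.items()}
    for name, properties in (arrays or {}).items():
        fields[name] = {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    prop: {"type": type_name} for prop, type_name in properties.items()
                },
            },
        }
    return {"fields": fields}
```

For the invoice example, `build_field_schema({"vendor_name": "string", "invoice_total": "number"}, {"line_items": {"description": "string", "amount": "number"}})` returns the same `field_schema` value written out by hand above.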
Analyzer Management
```python
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
    print(f"{analyzer.analyzer_id}: {analyzer.description}")

# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")

# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")
```
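Before deleting anything, it can help to separate custom analyzers from prebuilt ones. The sketch below filters on the `prebuilt-` prefix seen in the analyzer IDs in this guide; treating that prefix as the distinguishing mark is an assumption, and `custom_analyzer_ids` is a hypothetical helper.

```python
def custom_analyzer_ids(analyzers, prebuilt_prefix="prebuilt-"):
    """Return the sorted IDs of analyzers that do not look prebuilt."""
    return sorted(
        analyzer.analyzer_id
        for analyzer in analyzers
        if not analyzer.analyzer_id.startswith(prebuilt_prefix)
    )
```

For example, `custom_analyzer_ids(client.list_analyzers())` gives a list that can be reviewed before any `delete_analyzer` call.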
Async Client
```python
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential


async def analyze_document():
    endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
    credential = DefaultAzureCredential()
    async with ContentUnderstandingClient(
        endpoint=endpoint,
        credential=credential
    ) as client:
        poller = await client.begin_analyze(
            analyzer_id="prebuilt-documentSearch",
            inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
        )
        result = await poller.result()
        content = result.contents[0]
        return content.markdown


asyncio.run(analyze_document())
```
Content Types
| Class | For | Provides |
|---|---|---|
| `DocumentContent` | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| `AudioVisualContent` | Audio, video files | Transcript phrases, timing, key frames |
Both derive from `MediaContent`, which provides basic info and a markdown representation.
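Code that handles mixed results usually needs to branch on the content type. The sketch below does so by duck typing on the attributes used elsewhere in this guide (`transcript_phrases` for audio and video, `start_page_number` for documents) rather than importing SDK types. `summarize_content` is a hypothetical helper; checking `content.kind` against `MediaContentKind`, as in the document example, is the more direct route when the SDK is available.

```python
def summarize_content(content):
    """Describe a content item based on which attributes it exposes."""
    phrases = getattr(content, "transcript_phrases", None)
    if phrases is not None:
        # Audio and video results carry transcript phrases
        return f"audio/video content with {len(phrases)} transcript phrases"
    page = getattr(content, "start_page_number", None)
    if page is not None:
        # Document results carry page information
        return f"document content starting at page {page}"
    return "content with markdown only"
```

Applied as `[summarize_content(c) for c in result.contents]`, this gives a quick overview of what an analysis returned.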
Model Imports
```python
from azure.ai.contentunderstanding.models import (
    AnalyzeInput,
    AnalyzeResult,
    MediaContentKind,
    DocumentContent,
    AudioVisualContent,
)
```
Client Types
| Client | Purpose |
|---|---|
| `ContentUnderstandingClient` | Sync client for all operations |
| `ContentUnderstandingClient` (from `azure.ai.contentunderstanding.aio`) | Async client for all operations |
Best Practices
- Use `begin_analyze` with `AnalyzeInput`: this is the correct method signature
- Access results via `result.contents[0]`: results are returned as a list
- Use prebuilt analyzers for common scenarios (document/image/audio/video search)
- Create custom analyzers only for domain-specific field extraction
- Use async client for high-throughput scenarios with `azure.identity.aio` credentials
- Handle long-running operations: video/audio analysis can take minutes
- Use URL sources when possible to avoid upload overhead