azure-ai-contentunderstanding-py

Azure AI Content Understanding SDK for Python

Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.

Installation

bash
pip install azure-ai-contentunderstanding

Environment Variables

bash
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
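
A missing or malformed endpoint tends to surface only as an opaque error at the first request, so it can help to check the variable up front. The following is a minimal sketch, not part of the SDK: `load_endpoint` is a hypothetical helper, and requiring an `https` URL is an assumption based on the example value above.

```python
import os
from urllib.parse import urlparse


def load_endpoint(env=None):
    """Return the endpoint from CONTENTUNDERSTANDING_ENDPOINT, or raise ValueError.

    Hypothetical helper for illustration; `env` defaults to os.environ and can
    be replaced with a plain dict in tests.
    """
    env = os.environ if env is None else env
    value = env.get("CONTENTUNDERSTANDING_ENDPOINT", "").strip()
    if not value:
        raise ValueError("CONTENTUNDERSTANDING_ENDPOINT is not set")
    parsed = urlparse(value)
    # Assumption: the endpoint is always an https:// URL with a host name.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"Expected an https:// endpoint URL, got {value!r}")
    return value
```

The returned string can be passed as the `endpoint` argument when constructing the client.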

Authentication

python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)

Core Workflow

Content Understanding operations are asynchronous long-running operations:
  1. Begin Analysis: start the analysis operation with begin_analyze() (returns a poller)
  2. Poll for Results: poll until analysis completes (the SDK handles this with .result())
  3. Process Results: extract structured results from AnalyzeResult.contents
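
The three steps can be wrapped in one small function. This is a sketch under stated assumptions: `analyze_contents` is a hypothetical helper, and `FakeClient` / `_FakePoller` are stand-ins invented here so the example runs without an Azure resource. With the real client, only `analyze_contents` would be needed.

```python
from types import SimpleNamespace


def analyze_contents(client, analyzer_id, inputs):
    """Begin an analysis, wait for it to finish, and return the contents list."""
    poller = client.begin_analyze(analyzer_id=analyzer_id, inputs=inputs)  # 1. begin
    result = poller.result()  # 2. blocks while the operation is polled
    return list(result.contents or [])  # 3. one entry per analyzed content item


class _FakePoller:
    """Stand-in poller (hypothetical) that is already complete."""

    def __init__(self, result):
        self._result = result

    def result(self):
        return self._result


class FakeClient:
    """Stand-in client (hypothetical) mimicking the begin_analyze call shape."""

    def begin_analyze(self, analyzer_id, inputs):
        contents = [
            SimpleNamespace(markdown=f"# analyzed by {analyzer_id}") for _ in inputs
        ]
        return _FakePoller(SimpleNamespace(contents=contents))
```

Calling `analyze_contents(FakeClient(), "prebuilt-documentSearch", ["doc-a", "doc-b"])` returns two stand-in content objects, each with a `markdown` attribute.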

Prebuilt Analyzers

Analyzer                 Content Type  Purpose
prebuilt-documentSearch  Documents     Extract markdown for RAG applications
prebuilt-imageSearch     Images        Extract content from images
prebuilt-audioSearch     Audio         Transcribe audio with timing
prebuilt-videoSearch     Video         Extract frames, transcripts, summaries
prebuilt-invoice         Documents     Extract invoice fields
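
When inputs of mixed types arrive as URLs, the analyzer ID can be chosen from the file extension. The sketch below is illustrative only: `pick_prebuilt_analyzer` is a hypothetical helper, and the extension groups are an assumption, not an official list of supported formats.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed extension groups for illustration; adjust to the formats you handle.
_ANALYZER_BY_SUFFIX = {
    ".pdf": "prebuilt-documentSearch",
    ".docx": "prebuilt-documentSearch",
    ".jpg": "prebuilt-imageSearch",
    ".jpeg": "prebuilt-imageSearch",
    ".png": "prebuilt-imageSearch",
    ".mp3": "prebuilt-audioSearch",
    ".wav": "prebuilt-audioSearch",
    ".mp4": "prebuilt-videoSearch",
    ".mov": "prebuilt-videoSearch",
}


def pick_prebuilt_analyzer(url, default="prebuilt-documentSearch"):
    """Map a URL's file extension to a prebuilt analyzer ID from the table above."""
    # Only the path is inspected, so query strings do not affect the result.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return _ANALYZER_BY_SUFFIX.get(suffix, default)
```

Unknown or missing extensions fall back to the document analyzer, which is a design choice of this sketch rather than service behavior.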

Analyze Document

python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential()
)

# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)
result = poller.result()

# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)

Access Document Content Details

python
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent

content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)

Analyze Image

python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)

Analyze Video

python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)

result = poller.result()

# Access video content (AudioVisualContent)
content = result.contents[0]

# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")

# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")

Analyze Audio

python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)

result = poller.result()

# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")
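
For RAG indexing it is often more convenient to collect the transcript as a list of lines than to print it. The sketch below is a hypothetical helper that relies only on the `start_time`, `end_time`, and `text` attributes used in the examples above. The phrase objects in the usage note are stand-ins, not SDK types.

```python
from types import SimpleNamespace


def transcript_lines(phrases):
    """Return one '[start - end] text' line per phrase, skipping blank text."""
    lines = []
    for phrase in phrases or []:  # tolerate a missing (None) phrase list
        text = (phrase.text or "").strip()
        if text:
            lines.append(f"[{phrase.start_time} - {phrase.end_time}] {text}")
    return lines


# Stand-in phrases for illustration; real phrases come from content.transcript_phrases.
sample_phrases = [
    SimpleNamespace(start_time=0, end_time=5, text=" Welcome to the call. "),
    SimpleNamespace(start_time=5, end_time=6, text=""),
]
```

`"\n".join(transcript_lines(content.transcript_phrases))` then yields a single text block that can be chunked or stored.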

Custom Analyzers

Create custom analyzers with field schemas for specialized extraction:
python
# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        }
                    }
                }
            }
        }
    }
)

# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)
result = poller.result()

# Access extracted fields (read them from a content item, since results live in result.contents)
content = result.contents[0]
print(content.fields["vendor_name"])
print(content.fields["invoice_total"])
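
Hand-writing nested schema dictionaries is error-prone once there are more than a few fields. A small builder can generate the same structure from flat mappings. This is a sketch: `build_field_schema` is a hypothetical helper, and it only covers the scalar and array-of-object shapes shown in the example above.

```python
def build_field_schema(scalar_fields, array_fields=None):
    """Build a field schema dict in the shape used by the custom analyzer example.

    scalar_fields: {"vendor_name": "string", ...}
    array_fields:  {"line_items": {"description": "string", "amount": "number"}}
    """
    fields = {name: {"type": type_name} for name, type_name in scalar_fields.items()}
    for name, properties in (array_fields or {}).items():
        # Each array field holds objects whose properties are simple typed values.
        fields[name] = {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    prop: {"type": type_name} for prop, type_name in properties.items()
                },
            },
        }
    return {"fields": fields}


invoice_schema = build_field_schema(
    {"vendor_name": "string", "invoice_total": "number"},
    {"line_items": {"description": "string", "amount": "number"}},
)
```

`invoice_schema` matches the `field_schema` value in the example above and could be passed in its place.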

Analyzer Management

python
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
    print(f"{analyzer.analyzer_id}: {analyzer.description}")

# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")

# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")
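
Before deleting analyzers in bulk, it helps to separate custom analyzers from prebuilt ones. The sketch below assumes that every prebuilt analyzer ID starts with `prebuilt-`, as the IDs in this document do. `custom_analyzer_ids` is a hypothetical helper and the listed objects are stand-ins.

```python
from types import SimpleNamespace


def custom_analyzer_ids(analyzers, prefix="prebuilt-"):
    """Return the sorted IDs of analyzers that do not look prebuilt."""
    return sorted(
        analyzer.analyzer_id
        for analyzer in analyzers
        if not analyzer.analyzer_id.startswith(prefix)
    )


# Stand-in objects for illustration; real ones come from client.list_analyzers().
listed = [
    SimpleNamespace(analyzer_id="prebuilt-documentSearch"),
    SimpleNamespace(analyzer_id="my-invoice-analyzer"),
    SimpleNamespace(analyzer_id="my-custom-analyzer"),
]
```

The resulting IDs can then be reviewed and passed one at a time to `client.delete_analyzer`.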

Async Client

python
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential

async def analyze_document():
    endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
    credential = DefaultAzureCredential()
    
    async with ContentUnderstandingClient(
        endpoint=endpoint,
        credential=credential
    ) as client:
        poller = await client.begin_analyze(
            analyzer_id="prebuilt-documentSearch",
            inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
        )
        result = await poller.result()
        content = result.contents[0]
        return content.markdown

asyncio.run(analyze_document())
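
For many inputs, the async client is usually combined with bounded concurrency so that only a limited number of analyses run at once. The following sketch shows that pattern with plain `asyncio`. `analyze_many` and its `analyze_one` callback are hypothetical names, and the default limit of 4 is an arbitrary assumption, not a service quota.

```python
import asyncio


async def analyze_many(urls, analyze_one, max_concurrency=4):
    """Run analyze_one(url) for every URL, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        async with semaphore:  # wait here while the limit is reached
            return await analyze_one(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def fake_analyze(url):
    """Stand-in for a coroutine that calls begin_analyze and awaits the result."""
    await asyncio.sleep(0)
    return f"markdown for {url}"
```

In real use, `analyze_one` would be a coroutine that shares one client, calls `begin_analyze`, awaits the poller result, and returns `content.markdown`.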

Content Types

Class               For                       Provides
DocumentContent     PDF, images, Office docs  Pages, tables, figures, paragraphs
AudioVisualContent  Audio, video files        Transcript phrases, timing, key frames

Both derive from MediaContent, which provides basic info and a markdown representation.
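
Code that handles mixed results needs to tell the two content types apart. The sketch below does so by checking for attributes used earlier in this document (`transcript_phrases`, `start_page_number`, `markdown`) rather than importing the SDK classes. `describe_content` is a hypothetical helper, and with the real SDK an `isinstance` or `kind` check would serve the same purpose.

```python
from types import SimpleNamespace


def describe_content(content):
    """Return a one-line summary based on which attributes the content exposes."""
    phrases = getattr(content, "transcript_phrases", None)
    if phrases is not None:
        return f"audio/visual content with {len(phrases)} transcript phrase(s)"
    if getattr(content, "start_page_number", None) is not None:
        return f"document content starting at page {content.start_page_number}"
    # Fall back to what the base type offers: the markdown representation.
    markdown = getattr(content, "markdown", "") or ""
    return f"media content with {len(markdown)} markdown character(s)"


# Stand-in objects for illustration only.
video_like = SimpleNamespace(transcript_phrases=["hello", "world"], markdown="...")
document_like = SimpleNamespace(start_page_number=1, markdown="# Title")
```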

Model Imports

python
from azure.ai.contentunderstanding.models import (
    AnalyzeInput,
    AnalyzeResult,
    MediaContentKind,
    DocumentContent,
    AudioVisualContent,
)

Client Types

Client                            Purpose
ContentUnderstandingClient        Sync client for all operations
ContentUnderstandingClient (aio)  Async client for all operations

Best Practices

  1. Use begin_analyze with AnalyzeInput: this is the correct method signature
  2. Access results via result.contents[0]: results are returned as a list
  3. Use prebuilt analyzers for common scenarios (document/image/audio/video search)
  4. Create custom analyzers only for domain-specific field extraction
  5. Use async client for high-throughput scenarios with azure.identity.aio credentials
  6. Handle long-running operations: video/audio analysis can take minutes
  7. Use URL sources when possible to avoid upload overhead
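
Practice 6 can be made concrete by bounding how long the caller waits. The sketch below assumes the poller exposes `wait(timeout)`, `done()`, and `result()`, as Azure SDK long-running-operation pollers generally do. `result_within`, `SlowAnalysisError`, and `FakePoller` are hypothetical names introduced for illustration.

```python
class SlowAnalysisError(TimeoutError):
    """Raised when an analysis has not finished within the allowed time."""


def result_within(poller, timeout_seconds):
    """Wait up to timeout_seconds, then return the result or raise."""
    poller.wait(timeout_seconds)  # returns when the operation ends or time is up
    if not poller.done():
        raise SlowAnalysisError(f"analysis still running after {timeout_seconds}s")
    return poller.result()


class FakePoller:
    """Stand-in poller (hypothetical) so the sketch runs without Azure."""

    def __init__(self, finished, value=None):
        self._finished = finished
        self._value = value

    def wait(self, timeout=None):
        return None  # a real poller would block here for up to `timeout` seconds

    def done(self):
        return self._finished

    def result(self):
        return self._value
```

Raising instead of blocking indefinitely lets the caller decide whether to retry later, report progress, or give up on a slow video or audio job.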