azure-ai-contentunderstanding-py

Azure AI Content Understanding SDK for Python

Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.

Installation

bash

pip install azure-ai-contentunderstanding

Environment Variables

bash

CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/

Authentication

python

import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)

Core Workflow

Content Understanding analysis runs as an asynchronous long-running operation:

1. Begin Analysis: start the analysis operation with `begin_analyze()`, which returns a poller.
2. Poll for Results: poll until the analysis completes; the SDK handles this when you call `.result()` on the poller.
3. Process Results: extract structured results from `AnalyzeResult.contents`.

Prebuilt Analyzers

| Analyzer | Content Type | Purpose |
| --- | --- | --- |
| `prebuilt-documentSearch` | Documents | Extract markdown for RAG applications |
| `prebuilt-imageSearch` | Images | Extract content from images |
| `prebuilt-audioSearch` | Audio | Transcribe audio with timing |
| `prebuilt-videoSearch` | Video | Extract frames, transcripts, summaries |
| `prebuilt-invoice` | Documents | Extract invoice fields |
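
One way to apply the table above is to choose an analyzer ID from a file's extension. The sketch below is illustrative only: `pick_analyzer` and the extension lists are assumptions made for this example, not part of the SDK, and the formats each analyzer accepts should be checked against the service documentation.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension lists; adjust them to the formats your resource supports.
_ANALYZER_BY_EXTENSION = {
    ".pdf": "prebuilt-documentSearch",
    ".docx": "prebuilt-documentSearch",
    ".jpg": "prebuilt-imageSearch",
    ".jpeg": "prebuilt-imageSearch",
    ".png": "prebuilt-imageSearch",
    ".mp3": "prebuilt-audioSearch",
    ".wav": "prebuilt-audioSearch",
    ".mp4": "prebuilt-videoSearch",
    ".mov": "prebuilt-videoSearch",
}


def pick_analyzer(url: str, default: str = "prebuilt-documentSearch") -> str:
    """Return a prebuilt analyzer ID based on the file extension in the URL path."""
    # Parse the URL first so query strings (for example SAS tokens) are ignored.
    path = urlparse(url).path
    suffix = PurePosixPath(path).suffix.lower()
    return _ANALYZER_BY_EXTENSION.get(suffix, default)
```

The returned ID can be passed as `analyzer_id` to `begin_analyze`. Falling back to the document analyzer for unknown extensions is a design choice of this sketch; raising an error may suit stricter pipelines better.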

Analyze Document

python

import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential()
)

# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)

result = poller.result()

# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)

Access Document Content Details

python

from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent

content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)
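
Because `contents` is a list, a result may hold more than one item. The following is a minimal sketch for joining the markdown of every item rather than only the first. `collect_markdown` is a hypothetical helper, and it relies only on the `contents` and `markdown` attributes used above. The stand-in objects exist so the example runs without a service call.

```python
from types import SimpleNamespace


def collect_markdown(result, separator: str = "\n\n") -> str:
    """Join the markdown of every content item, skipping items that have none."""
    parts = []
    for content in result.contents or []:
        markdown = getattr(content, "markdown", None)
        if markdown:
            parts.append(markdown)
    return separator.join(parts)


# Stand-in for an AnalyzeResult; a real one comes from poller.result().
fake_result = SimpleNamespace(
    contents=[
        SimpleNamespace(markdown="# Page 1"),
        SimpleNamespace(markdown=None),
        SimpleNamespace(markdown="# Page 2"),
    ]
)
combined = collect_markdown(fake_result)
```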

Analyze Image

python

from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)

Analyze Video

python

from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)

result = poller.result()

# Access video content (AudioVisualContent)
content = result.contents[0]

# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")

# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")
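
Code shared between audio and video results may meet content that lacks transcript phrases or key frames. A tolerant sketch is shown here; `transcript_lines` and `key_frame_count` are hypothetical helpers that use only the attribute names from the example above, and the stand-in object replaces a real service response.

```python
from types import SimpleNamespace


def transcript_lines(content) -> list[str]:
    """Format each transcript phrase as '[start - end] text'."""
    # A missing or None attribute is treated as an empty transcript.
    phrases = getattr(content, "transcript_phrases", None) or []
    return [f"[{p.start_time} - {p.end_time}] {p.text}" for p in phrases]


def key_frame_count(content) -> int:
    """Count key frames, returning 0 when the content has none."""
    return len(getattr(content, "key_frames", None) or [])


# Stand-in for audio content: it has phrases but no key_frames attribute.
fake_audio = SimpleNamespace(
    transcript_phrases=[
        SimpleNamespace(start_time=0, end_time=1500, text="Hello"),
        SimpleNamespace(start_time=1500, end_time=3000, text="world"),
    ]
)
lines = transcript_lines(fake_audio)
frames = key_frame_count(fake_audio)
```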

Analyze Audio

python

from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)

result = poller.result()

# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")

Custom Analyzers

Create custom analyzers with field schemas for specialized extraction:

python

# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        }
                    }
                }
            }
        }
    }
)

# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)

result = poller.result()

# Access extracted fields
print(result.fields["vendor_name"])
print(result.fields["invoice_total"])
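
Nested schema dictionaries are easy to get wrong when written by hand. The sketch below expands a compact spec into the same nested shape as the `field_schema` in the example above. `build_field_schema` is a hypothetical helper, and the shape it emits is copied from that example rather than checked against the service schema.

```python
def build_field_schema(fields: dict) -> dict:
    """Expand a compact spec into the nested schema shape used in this guide.

    A string value names a scalar type. A dict value describes the
    properties of each object in an array field.
    """

    def expand(spec):
        if isinstance(spec, str):
            return {"type": spec}
        # A dict spec is treated as an array of objects with these properties.
        return {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {name: expand(sub) for name, sub in spec.items()},
            },
        }

    return {"fields": {name: expand(spec) for name, spec in fields.items()}}


schema = build_field_schema(
    {
        "vendor_name": "string",
        "invoice_total": "number",
        "line_items": {"description": "string", "amount": "number"},
    }
)
```

The resulting `schema` has the same structure as the hand-written `field_schema` value, so it could be used in its place when creating the analyzer.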

Analyzer Management

python

# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
    print(f"{analyzer.analyzer_id}: {analyzer.description}")

# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")

# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")

Async Client

python

import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential

async def analyze_document():
    endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]

    # Enter both context managers so the credential and the client are closed on exit
    async with DefaultAzureCredential() as credential, ContentUnderstandingClient(
        endpoint=endpoint,
        credential=credential
    ) as client:
        poller = await client.begin_analyze(
            analyzer_id="prebuilt-documentSearch",
            inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
        )
        result = await poller.result()
        content = result.contents[0]
        return content.markdown

asyncio.run(analyze_document())
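
For high-throughput use, several analyses can run concurrently. The sketch below caps the number of in-flight calls with `asyncio.Semaphore`. `analyze_many` is a hypothetical helper, `analyze_one` stands for any coroutine function that takes a URL (for example a variant of `analyze_document` above), and the default limit of 4 is an arbitrary assumption, not a service quota.

```python
import asyncio


async def analyze_many(urls, analyze_one, max_concurrency: int = 4):
    """Run analyze_one(url) for every URL, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        async with semaphore:
            return await analyze_one(url)

    # gather preserves input order. With return_exceptions=True, failures come
    # back as exception objects in the list instead of being raised.
    return await asyncio.gather(*(guarded(u) for u in urls), return_exceptions=True)


async def _fake_analyze(url):
    """Stand-in for a real analysis call so the example runs offline."""
    await asyncio.sleep(0)
    if url.endswith("bad.pdf"):
        raise ValueError(f"could not analyze {url}")
    return f"markdown for {url}"


results = asyncio.run(
    analyze_many(
        ["https://example.com/a.pdf", "https://example.com/bad.pdf"],
        _fake_analyze,
    )
)
```

Callers should check each entry with `isinstance(entry, Exception)` before using it, since one failed URL does not stop the others.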

Content Types

| Class | For | Provides |
| --- | --- | --- |
| `DocumentContent` | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| `AudioVisualContent` | Audio, video files | Transcript phrases, timing, key frames |

Both derive from `MediaContent`, which provides basic info and a markdown representation.

Model Imports

python

from azure.ai.contentunderstanding.models import (
    AnalyzeInput,
    AnalyzeResult,
    MediaContentKind,
    DocumentContent,
    AudioVisualContent,
)

Client Types

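| Client | Purpose |
| --- | --- |
| `ContentUnderstandingClient` (from `azure.ai.contentunderstanding`) | Sync client for all operations |
| `ContentUnderstandingClient` (from `azure.ai.contentunderstanding.aio`) | Async client for all operations |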

Best Practices

1. Use `begin_analyze` with `AnalyzeInput`: pass a list of `AnalyzeInput` objects as `inputs`.
2. Access results via `result.contents[0]`: results are returned as a list.
3. Use prebuilt analyzers for common scenarios (document, image, audio, and video search).
4. Create custom analyzers only for domain-specific field extraction.
5. Use the async client for high-throughput scenarios, with credentials from `azure.identity.aio`.
6. Handle long-running operations: video and audio analysis can take minutes.
7. Use URL sources when possible to avoid upload overhead.
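
Since video and audio analysis can take minutes, it can help to wait with a deadline instead of blocking indefinitely. The sketch below assumes the poller returned by `begin_analyze` exposes `done()` and `result()`, as `azure.core.polling.LROPoller` does. `wait_with_deadline` and the fake poller are hypothetical and exist only for illustration.

```python
import time


def wait_with_deadline(poller, timeout_s: float, interval_s: float = 5.0):
    """Return poller.result() once it is done, or raise TimeoutError after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while not poller.done():
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(f"analysis did not finish within {timeout_s} seconds")
        # Never sleep past the deadline.
        time.sleep(min(interval_s, remaining))
    return poller.result()


class _FakePoller:
    """Stand-in poller that reports completion on the third done() check."""

    def __init__(self):
        self._checks = 0

    def done(self):
        self._checks += 1
        return self._checks >= 3

    def result(self):
        return "finished"


outcome = wait_with_deadline(_FakePoller(), timeout_s=1.0, interval_s=0.01)
```

Raising `TimeoutError` here only stops the local wait; whether the service-side operation keeps running is outside the scope of this sketch.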
