# Azure Container Apps GPU Support - 2025 Features

Complete knowledge base for Azure Container Apps with GPU support, serverless capabilities, and Dapr integration (2025 GA features).
## Overview

Azure Container Apps is a serverless container platform with native GPU support, Dapr integration, and scale-to-zero capabilities for cost-efficient AI/ML workloads.
## Key 2025 Features (Build Announcements)

### 1. Serverless GPU (GA)

- Automatic scaling: Scale GPU workloads based on demand
- Scale-to-zero: Pay only when the GPU is actively used
- Per-second billing: Granular cost control
- Optimized cold start: Fast initialization for AI models
- Reduced operational overhead: No infrastructure management
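To make per-second billing concrete, here is a small cost-estimator sketch. The hourly rate used below is a placeholder, not an official Azure price — check current Azure pricing for the GPU profile you use.

```python
from decimal import Decimal

def estimate_gpu_cost(active_seconds: int, hourly_rate_usd: Decimal) -> Decimal:
    """Estimate serverless GPU cost under per-second billing.

    With scale-to-zero, only the seconds a replica is actually
    running are billed; idle time costs nothing.
    """
    per_second = hourly_rate_usd / Decimal(3600)
    return (per_second * active_seconds).quantize(Decimal("0.0001"))

# Hypothetical rate of $3.672/hour for a GPU profile:
# 15 minutes of active GPU time
print(estimate_gpu_cost(active_seconds=900, hourly_rate_usd=Decimal("3.672")))
```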
### 2. Dedicated GPU (GA)

- Consistent performance: Dedicated GPU resources
- Simplified AI deployment: Easy model hosting
- Long-running workloads: Ideal for training and continuous inference
- Multiple GPU types: NVIDIA A100, T4, and more
### 3. Dynamic Sessions with GPU (Early Access)

- Sandboxed execution: Run untrusted AI-generated code
- Hyper-V isolation: Enhanced security
- GPU-powered Python interpreter: Handle compute-intensive AI workloads
- Scale at runtime: Dynamic resource allocation
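Untrusted code is submitted to a session over HTTP. The payload shape below is illustrative only — the field names and execution endpoint are assumptions, so consult the Dynamic Sessions API reference for the exact schema.

```python
import json

def build_execution_request(code: str, timeout_seconds: int = 60) -> dict:
    """Build a request body for a sandboxed code-interpreter session.

    NOTE: field names here are hypothetical; verify against the
    Dynamic Sessions API reference before use.
    """
    return {
        "properties": {
            "codeInputType": "inline",
            "executionType": "synchronous",
            "timeoutInSeconds": timeout_seconds,
            "code": code,
        }
    }

body = build_execution_request("import torch; print(torch.cuda.is_available())")
print(json.dumps(body, indent=2))
```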
### 4. Foundry Models Integration

- Deploy AI models directly: During container app creation
- Ready-to-use models: Pre-configured inference endpoints
- Azure AI Foundry: Seamless integration
### 5. Workflow with Durable Task Scheduler (Preview)

- Long-running workflows: Reliable orchestration
- State management: Automatic persistence
- Event-driven: Trigger workflows from events
### 6. Native Azure Functions Support

- Functions runtime: Run Azure Functions in Container Apps
- Consistent development: Same code, serverless execution
- Event triggers: All Functions triggers supported
### 7. Dapr Integration (GA)

- Service discovery: Built-in DNS-based discovery
- State management: Distributed state stores
- Pub/sub messaging: Reliable messaging patterns
- Service invocation: Resilient service-to-service calls
- Observability: Integrated tracing and metrics
## Creating Container Apps with GPU

### Basic Container App with Serverless GPU

```bash
# Create Container Apps environment
az containerapp env create \
  --name myenv \
  --resource-group MyRG \
  --location eastus \
  --logs-workspace-id <workspace-id> \
  --logs-workspace-key <workspace-key>

# Create Container App with GPU
az containerapp create \
  --name myapp-gpu \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/ai-model:latest \
  --cpu 4 \
  --memory 8Gi \
  --gpu-type nvidia-a100 \
  --gpu-count 1 \
  --min-replicas 0 \
  --max-replicas 10 \
  --ingress external \
  --target-port 8080
```
### Production-Ready Container App with GPU
```bash
az containerapp create \
  --name myapp-gpu-prod \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/ai-model:latest \
  --registry-server myregistry.azurecr.io \
  --registry-identity system \
  --cpu 4 \
  --memory 8Gi \
  --gpu-type nvidia-a100 \
  --gpu-count 1 \
  --min-replicas 0 \
  --max-replicas 20 \
  --scale-rule-name http-scaling \
  --scale-rule-type http \
  --scale-rule-http-concurrency 10 \
  --ingress external \
  --target-port 8080 \
  --transport http2 \
  --exposed-port 8080 \
  --env-vars "AZURE_CLIENT_ID=secretref:client-id" \
  --enable-dapr \
  --dapr-app-id myapp \
  --dapr-app-port 8080 \
  --dapr-app-protocol http \
  --system-assigned
```

## Container Apps Environment Configuration
### Environment with Zone Redundancy

```bash
az containerapp env create \
  --name myenv-prod \
  --resource-group MyRG \
  --location eastus \
  --logs-workspace-id <workspace-id> \
  --logs-workspace-key <workspace-key> \
  --zone-redundant true \
  --enable-workload-profiles true
```

### Workload Profiles (Dedicated GPU)
```bash
# Create environment with workload profiles
az containerapp env create \
  --name myenv-gpu \
  --resource-group MyRG \
  --location eastus \
  --enable-workload-profiles true

# Add GPU workload profile
az containerapp env workload-profile add \
  --name myenv-gpu \
  --resource-group MyRG \
  --workload-profile-name gpu-profile \
  --workload-profile-type GPU-A100 \
  --min-nodes 0 \
  --max-nodes 10

# Create container app with GPU profile
az containerapp create \
  --name myapp-dedicated-gpu \
  --resource-group MyRG \
  --environment myenv-gpu \
  --workload-profile-name gpu-profile \
  --image myregistry.azurecr.io/training-job:latest \
  --cpu 8 \
  --memory 16Gi \
  --min-replicas 1 \
  --max-replicas 5
```

## GPU Scaling Rules
### Custom Prometheus Scaling
```bash
az containerapp create \
  --name myapp-gpu-prometheus \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/ai-model:latest \
  --cpu 4 \
  --memory 8Gi \
  --gpu-type nvidia-a100 \
  --gpu-count 1 \
  --min-replicas 0 \
  --max-replicas 10 \
  --scale-rule-name gpu-utilization \
  --scale-rule-type custom \
  --scale-rule-custom-type prometheus \
  --scale-rule-metadata \
    serverAddress=http://prometheus.monitoring.svc.cluster.local:9090 \
    metricName=gpu_utilization \
    threshold=80 \
    query="avg(nvidia_gpu_utilization{app='myapp'})"
```

### Queue-Based Scaling (Azure Service Bus)
```bash
az containerapp create \
  --name myapp-queue-processor \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/batch-processor:latest \
  --cpu 4 \
  --memory 8Gi \
  --gpu-type nvidia-t4 \
  --gpu-count 1 \
  --min-replicas 0 \
  --max-replicas 50 \
  --scale-rule-name queue-scaling \
  --scale-rule-type azure-servicebus \
  --scale-rule-metadata \
    queueName=ai-jobs \
    namespace=myservicebus \
    messageCount=5 \
  --scale-rule-auth connection=servicebus-connection
```
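The Service Bus rule above targets roughly one replica per `messageCount` messages in the queue, clamped to the replica bounds. A sketch of that arithmetic (a simplification of KEDA's actual scaling behavior):

```python
import math

def desired_replicas(queue_length: int, messages_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Approximate KEDA target calculation: one replica per
    `messages_per_replica` queued messages, clamped to bounds."""
    if queue_length <= 0:
        return min_replicas  # scale-to-zero when min_replicas is 0
    wanted = math.ceil(queue_length / messages_per_replica)
    return max(min_replicas, min(max_replicas, wanted))

# Matches the rule above: messageCount=5, 0..50 replicas.
print(desired_replicas(0, 5, 0, 50))     # 0  (idle, scaled to zero)
print(desired_replicas(12, 5, 0, 50))    # 3
print(desired_replicas(1000, 5, 0, 50))  # 50 (capped at max-replicas)
```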
## Dapr Integration
### Enable Dapr on Container App
```bash
az containerapp create \
  --name myapp-dapr \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/myapp:latest \
  --enable-dapr \
  --dapr-app-id myapp \
  --dapr-app-port 8080 \
  --dapr-app-protocol http \
  --dapr-http-max-request-size 4 \
  --dapr-http-read-buffer-size 4 \
  --dapr-log-level info \
  --dapr-enable-api-logging true
```

### Dapr State Store (Azure Cosmos DB)
```yaml
# Dapr component for the state store (component.yaml)
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: statestore
spec:
  type: state.azure.cosmosdb
  version: v1
  metadata:
    - name: url
      value: "https://mycosmosdb.documents.azure.com:443/"
    - name: masterKey
      secretRef: cosmosdb-key
    - name: database
      value: "mydb"
    - name: collection
      value: "state"
```

```bash
# Create the component
az containerapp env dapr-component set \
  --name myenv \
  --resource-group MyRG \
  --dapr-component-name statestore \
  --yaml component.yaml
```

### Dapr Pub/Sub (Azure Service Bus)
```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: pubsub
spec:
  type: pubsub.azure.servicebus.topics
  version: v1
  metadata:
    - name: connectionString
      secretRef: servicebus-connection
    - name: consumerID
      value: "myapp"
```

### Service-to-Service Invocation
```python
# Python example using the Dapr SDK
from dapr.clients import DaprClient

with DaprClient() as client:
    # Invoke another service
    response = client.invoke_method(
        app_id='other-service',
        method_name='process',
        data='{"input": "data"}'
    )

    # Save state
    client.save_state(
        store_name='statestore',
        key='mykey',
        value='myvalue'
    )

    # Publish a message
    client.publish_event(
        pubsub_name='pubsub',
        topic_name='orders',
        data='{"orderId": "123"}'
    )
```

## AI Model Deployment Patterns
### OpenAI-Compatible Endpoint
```dockerfile
# Dockerfile for vLLM model serving
FROM vllm/vllm-openai:latest

ENV MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct"
ENV GPU_MEMORY_UTILIZATION=0.9
ENV MAX_MODEL_LEN=4096

CMD ["--model", "${MODEL_NAME}", \
     "--gpu-memory-utilization", "${GPU_MEMORY_UTILIZATION}", \
     "--max-model-len", "${MAX_MODEL_LEN}", \
     "--port", "8080"]
```

```bash
# Deploy vLLM model
az containerapp create \
  --name llama-inference \
  --resource-group MyRG \
  --environment myenv \
  --image vllm/vllm-openai:latest \
  --cpu 8 \
  --memory 32Gi \
  --gpu-type nvidia-a100 \
  --gpu-count 1 \
  --min-replicas 1 \
  --max-replicas 5 \
  --target-port 8080 \
  --ingress external \
  --env-vars \
    MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" \
    GPU_MEMORY_UTILIZATION="0.9" \
    HF_TOKEN=secretref:huggingface-token
```
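Once deployed, the app serves the standard OpenAI-compatible chat completions API. A minimal stdlib client sketch — the ingress FQDN in the commented call is a placeholder for your app's actual hostname:

```python
import json
from urllib import request

def chat_payload(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Request body for the OpenAI-compatible /v1/chat/completions API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query_endpoint(base_url: str, payload: dict) -> dict:
    """POST the payload to the deployed app (network call)."""
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

payload = chat_payload("meta-llama/Llama-3.1-8B-Instruct", "Hello!")
# query_endpoint("https://llama-inference.<env-domain>.azurecontainerapps.io", payload)
print(json.dumps(payload))
```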
### Stable Diffusion Image Generation
```bash
az containerapp create \
  --name stable-diffusion \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/stable-diffusion:latest \
  --cpu 4 \
  --memory 16Gi \
  --gpu-type nvidia-a100 \
  --gpu-count 1 \
  --min-replicas 0 \
  --max-replicas 10 \
  --target-port 7860 \
  --ingress external \
  --scale-rule-name http-scaling \
  --scale-rule-type http \
  --scale-rule-http-concurrency 1
```

### Batch Processing Job
```bash
az containerapp job create \
  --name batch-training-job \
  --resource-group MyRG \
  --environment myenv \
  --trigger-type Manual \
  --image myregistry.azurecr.io/training:latest \
  --cpu 8 \
  --memory 32Gi \
  --gpu-type nvidia-a100 \
  --gpu-count 2 \
  --parallelism 1 \
  --replica-timeout 7200 \
  --replica-retry-limit 3 \
  --env-vars \
    DATASET_URL="https://mystorage.blob.core.windows.net/datasets/train.csv" \
    MODEL_OUTPUT="https://mystorage.blob.core.windows.net/models/" \
    EPOCHS="100"

# Execute job
az containerapp job start \
  --name batch-training-job \
  --resource-group MyRG
```

## Monitoring and Observability
### Application Insights Integration
```bash
az containerapp create \
  --name myapp-monitored \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/myapp:latest \
  --env-vars \
    APPLICATIONINSIGHTS_CONNECTION_STRING=secretref:appinsights-connection
```

### Query Logs
```bash
# Stream logs
az containerapp logs show \
  --name myapp-gpu \
  --resource-group MyRG \
  --follow

# Query with Log Analytics
az monitor log-analytics query \
  --workspace <workspace-id> \
  --analytics-query "ContainerAppConsoleLogs_CL | where ContainerAppName_s == 'myapp-gpu' | take 100"
```

### Metrics and Alerts
```bash
# Create metric alert for GPU usage
az monitor metrics alert create \
  --name high-gpu-usage \
  --resource-group MyRG \
  --scopes $(az containerapp show -g MyRG -n myapp-gpu --query id -o tsv) \
  --condition "avg Requests > 100" \
  --window-size 5m \
  --evaluation-frequency 1m \
  --action <action-group-id>
```

## Security Best Practices
### Managed Identity
```bash
# Create with system-assigned identity
az containerapp create \
  --name myapp-identity \
  --resource-group MyRG \
  --environment myenv \
  --system-assigned \
  --image myregistry.azurecr.io/myapp:latest

# Get identity principal ID
IDENTITY_ID=$(az containerapp show -g MyRG -n myapp-identity --query identity.principalId -o tsv)

# Assign role to access Key Vault
az role assignment create \
  --assignee $IDENTITY_ID \
  --role "Key Vault Secrets User" \
  --scope /subscriptions/<sub-id>/resourceGroups/MyRG/providers/Microsoft.KeyVault/vaults/mykeyvault

# Use user-assigned identity
az identity create --name myapp-identity --resource-group MyRG
IDENTITY_RESOURCE_ID=$(az identity show -g MyRG -n myapp-identity --query id -o tsv)

az containerapp create \
  --name myapp-user-identity \
  --resource-group MyRG \
  --environment myenv \
  --user-assigned $IDENTITY_RESOURCE_ID \
  --image myregistry.azurecr.io/myapp:latest
```
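With the role assignment in place, application code can read Key Vault secrets through the managed identity. A sketch using the Azure SDK for Python (requires the `azure-identity` and `azure-keyvault-secrets` packages inside the container):

```python
def vault_url(vault_name: str) -> str:
    """Key Vault endpoint for a given vault name."""
    return f"https://{vault_name}.vault.azure.net"

def read_secret(vault_name: str, secret_name: str) -> str:
    """Read a secret via the app's managed identity.

    DefaultAzureCredential picks up the system-assigned identity
    automatically when running inside the container app.
    """
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(vault_url=vault_url(vault_name),
                          credential=DefaultAzureCredential())
    return client.get_secret(secret_name).value

print(vault_url("mykeyvault"))
```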
### Secret Management
```bash
# Add secrets
az containerapp secret set \
  --name myapp-gpu \
  --resource-group MyRG \
  --secrets \
    huggingface-token="<token>" \
    api-key="<key>"

# Reference secrets in environment variables
az containerapp update \
  --name myapp-gpu \
  --resource-group MyRG \
  --set-env-vars \
    HF_TOKEN=secretref:huggingface-token \
    API_KEY=secretref:api-key
```

## Cost Optimization
### Scale-to-Zero Configuration
```bash
az containerapp create \
  --name myapp-scale-zero \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/myapp:latest \
  --min-replicas 0 \
  --max-replicas 10 \
  --scale-rule-name http-scaling \
  --scale-rule-type http \
  --scale-rule-http-concurrency 10
```

Cost savings: Pay only while requests are being processed; GPU costs accrue per second when active.
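A back-of-the-envelope comparison of an always-on replica versus scale-to-zero billing shows why this matters for intermittent workloads. The hourly rate is a placeholder, not an official price:

```python
def monthly_gpu_cost(hourly_rate: float, active_hours_per_day: float,
                     scale_to_zero: bool, days: int = 30) -> float:
    """Compare always-on billing (24h/day) with scale-to-zero billing
    (only active hours are billed)."""
    billable_hours = active_hours_per_day * days if scale_to_zero else 24 * days
    return round(hourly_rate * billable_hours, 2)

# Hypothetical $3.672/hour GPU that is busy 2 hours per day:
print(monthly_gpu_cost(3.672, 2, scale_to_zero=True))   # 220.32
print(monthly_gpu_cost(3.672, 2, scale_to_zero=False))  # 2643.84
```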
### Right-Sizing Resources
```bash
# Start with minimal resources
--cpu 2 --memory 4Gi --gpu-count 1

# Monitor and adjust based on actual usage
az monitor metrics list \
  --resource $(az containerapp show -g MyRG -n myapp-gpu --query id -o tsv) \
  --metric "CpuPercentage,MemoryPercentage"
```

### Use Spot/Preemptible GPUs (Future Feature)
When available, configure spot instances for non-critical workloads to save up to 80% on GPU costs.
## Troubleshooting
### Check Revision Status
```bash
az containerapp revision list \
  --name myapp-gpu \
  --resource-group MyRG \
  --output table
```

### View Revision Details
```bash
az containerapp revision show \
  --name <revision-name> \
  --app myapp-gpu \
  --resource-group MyRG
```

### Restart Container App
```bash
az containerapp update \
  --name myapp-gpu \
  --resource-group MyRG \
  --force-restart
```

### GPU Not Available
If GPU is not provisioning:

- Check region availability: Not all regions support GPU
- Verify quota: Request a quota increase if needed
- Check workload profile: Ensure the GPU workload profile is created
## Best Practices
✓ Use scale-to-zero for intermittent workloads
✓ Implement health probes (liveness and readiness)
✓ Use managed identities for authentication
✓ Store secrets in Azure Key Vault
✓ Enable Dapr for microservices patterns
✓ Configure appropriate scaling rules
✓ Monitor GPU utilization and adjust resources
✓ Use Container Apps jobs for batch processing
✓ Implement retry logic for transient failures
✓ Use Application Insights for observability
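For the health-probe recommendation above, probes are configured in the app's YAML template rather than via CLI flags. An illustrative fragment following the Container Apps container schema (paths, ports, and timings are examples to adapt); apply it with `az containerapp update --yaml containerapp.yaml`:

```yaml
# containerapp.yaml (fragment)
properties:
  template:
    containers:
      - name: myapp-gpu
        image: myregistry.azurecr.io/ai-model:latest
        probes:
          - type: Liveness
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          - type: Readiness
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
```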
## References

Azure Container Apps with GPU support provides a compelling serverless platform for AI/ML workloads!