adk-observability-guide
ADK Observability Guide
Scaffolded project? Cloud Trace and prompt-response logging are pre-configured by Terraform. See `references/cloud-trace-and-logging.md` for infrastructure details, env vars, and verification commands.
No scaffold? Follow the ADK docs links below for manual setup. For production infrastructure, scaffold with `/adk-scaffold`.
Reference Files
| File | Contents |
|---|---|
| `references/cloud-trace-and-logging.md` | Scaffolded project details: Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| `references/bigquery-agent-analytics.md` | BQ Agent Analytics plugin: enabling, key features, GCS offloading, tool provenance |
Observability Tiers
Choose the right level of observability based on your needs:
| Tier | What It Does | Scope | Default State | Best For |
|---|---|---|---|---|
| Cloud Trace | Distributed tracing — execution flow, latency, errors via OpenTelemetry spans | All templates, all environments | Always enabled | Debugging latency, understanding agent execution flow |
| Prompt-Response Logging | GenAI interactions exported to GCS, BigQuery, and Cloud Logging | ADK agents only | Disabled locally, enabled when deployed | Auditing LLM interactions, compliance |
| BigQuery Agent Analytics | Structured agent events (LLM calls, tool use, outcomes) to BigQuery | ADK agents with plugin enabled | Opt-in (`--bq-analytics` at scaffold time) | Conversational analytics, custom dashboards, LLM-as-judge evals |
| Third-Party Integrations | External observability platforms (AgentOps, Phoenix, MLflow, etc.) | Any ADK agent | Opt-in, per-provider setup | Team collaboration, specialized visualization, prompt management |
Ask the user which tier(s) they need — they can be combined. Cloud Trace is always on; the others are additive.
Cloud Trace
ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.
Span Hierarchy
invocation
└── agent_run (one per agent in the chain)
    ├── call_llm (model request/response)
    └── execute_tool (tool execution)
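To make the hierarchy concrete, here is a minimal sketch that rebuilds the tree from a flat list of span records. The record fields (`span_id`, `parent_id`, `name`, `duration_ms`) and the sample durations are hypothetical, chosen for illustration only; they are not the exact schema that ADK or Cloud Trace exports.

```python
from collections import defaultdict

# Hypothetical flat span records; field names and durations are illustrative only.
SPANS = [
    {"span_id": "1", "parent_id": None, "name": "invocation", "duration_ms": 1200},
    {"span_id": "2", "parent_id": "1", "name": "agent_run", "duration_ms": 1150},
    {"span_id": "3", "parent_id": "2", "name": "call_llm", "duration_ms": 800},
    {"span_id": "4", "parent_id": "2", "name": "execute_tool", "duration_ms": 300},
]


def render_tree(spans):
    """Return indented lines, one per span, with children nested under parents."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children[parent_id]:
            lines.append(f"{'  ' * depth}{span['name']} ({span['duration_ms']} ms)")
            walk(span["span_id"], depth + 1)

    walk(None, 0)  # root spans are the ones without a parent
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(SPANS)))
```

Reading a trace this way helps when debugging latency: a slow `invocation` can be attributed to either the `call_llm` or the `execute_tool` child under the relevant `agent_run`.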
Setup by Deployment Type
| Deployment | Setup |
|---|---|
| Agent Engine | Automatic — traces are exported to Cloud Trace by default |
| Cloud Run (scaffolded) | Automatic: already set up in the scaffolded FastAPI app |
| Cloud Run (manual) | Configure OpenTelemetry exporter in your app |
| Local dev | Works with |
View traces: Cloud Console → Trace → Trace explorer
For detailed setup instructions (Agent Engine CLI/SDK, Cloud Run, custom deployments), fetch https://google.github.io/adk-docs/integrations/cloud-trace/index.md.
Prompt-Response Logging
Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL), BigQuery (external tables), and Cloud Logging (dedicated bucket). Privacy-preserving by default — only metadata is logged unless explicitly configured otherwise.
Key env var: `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT`. Set it to `NO_CONTENT` (metadata only, default in deployed envs), `true` (full content), or `false` (disabled). Logging is disabled locally unless `LOGS_BUCKET_NAME` is set.
For scaffolded project details (Terraform resources, env vars, privacy modes, enabling/disabling, verification commands), see `references/cloud-trace-and-logging.md`.
For ADK logging docs (log levels, configuration, debugging), fetch https://google.github.io/adk-docs/observability/logging/index.md.
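The env var rules above can be sketched as a small helper that reports which logging mode a given environment would produce. This is an illustration, not the scaffold's actual implementation. The function name and the returned labels are hypothetical, and three behaviors are assumptions: a missing `LOGS_BUCKET_NAME` is treated as disabled in every environment, an unset capture variable falls back to `NO_CONTENT`, and `true`/`false` are matched case-insensitively.

```python
import os

CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"
BUCKET_VAR = "LOGS_BUCKET_NAME"


def logging_mode(env=None):
    """Classify prompt-response logging as 'disabled', 'metadata_only', or 'full_content'.

    The labels are hypothetical names used only by this sketch.
    """
    env = os.environ if env is None else env
    if not env.get(BUCKET_VAR):
        return "disabled"  # assumption: no bucket configured means logging stays off
    value = env.get(CAPTURE_VAR, "NO_CONTENT").strip()  # assumption: unset means NO_CONTENT
    if value.lower() == "false":
        return "disabled"
    if value.lower() == "true":
        return "full_content"
    if value == "NO_CONTENT":
        return "metadata_only"
    raise ValueError(f"Unrecognized {CAPTURE_VAR} value: {value!r}")


if __name__ == "__main__":
    print(logging_mode())
```

Running the module prints the mode for the current shell, which is a quick check before looking for missing prompt-response data.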
BigQuery Agent Analytics Plugin
Optional plugin that logs structured agent events to BigQuery. Enable with `--bq-analytics` at scaffold time. See `references/bigquery-agent-analytics.md` for details.
Third-Party Integrations
ADK supports several third-party observability platforms. Each uses OpenTelemetry or custom instrumentation to capture agent behavior.
| Platform | Key Differentiator | Setup Complexity | Self-Hosted Option |
|---|---|---|---|
| AgentOps | Session replays, 2-line setup, replaces native telemetry | Minimal | No (SaaS) |
| Arize AX | Commercial platform, production monitoring, evaluation dashboards | Low | No (SaaS) |
| Phoenix | Open-source, custom evaluators, experiment testing | Low | Yes |
| MLflow | OTel traces to MLflow Tracking Server, span tree visualization | Medium (needs SQL backend) | Yes |
| Monocle | 1-call setup, VS Code Gantt chart visualizer | Minimal | Yes (local files) |
| Weave | W&B platform, team collaboration, timeline views | Low | No (SaaS) |
| Freeplay | Prompt management + evals + observability in one platform | Low | No (SaaS) |
Ask the user which platform they prefer — present the trade-offs and let them choose. For setup details, fetch the relevant ADK docs page from the Deep Dive table below.
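When presenting the trade-offs, it can help to narrow the list first. The sketch below encodes the setup-complexity and self-hosting columns of the table and filters on them. The function name `shortlist` and its parameters are hypothetical, and ranking "Minimal" below "Low" below "Medium" is an assumption about how the labels order.

```python
# Rows copied from the table above: (platform, setup complexity, self-hosted option).
PLATFORMS = [
    ("AgentOps", "Minimal", False),
    ("Arize AX", "Low", False),
    ("Phoenix", "Low", True),
    ("MLflow", "Medium", True),
    ("Monocle", "Minimal", True),
    ("Weave", "Low", False),
    ("Freeplay", "Low", False),
]

# Assumed ordering of the "Setup Complexity" labels, lowest effort first.
COMPLEXITY_RANK = {"Minimal": 0, "Low": 1, "Medium": 2}


def shortlist(require_self_hosted=False, max_complexity="Medium"):
    """Return platform names that satisfy the constraints, in table order."""
    limit = COMPLEXITY_RANK[max_complexity]
    return [
        name
        for name, complexity, self_hosted in PLATFORMS
        if COMPLEXITY_RANK[complexity] <= limit
        and (self_hosted or not require_self_hosted)
    ]


if __name__ == "__main__":
    print(shortlist(require_self_hosted=True))
```

The filter only narrows the options; the key differentiators in the table still need to be discussed with the user before a choice is made.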
Troubleshooting
| Issue | Solution |
|---|---|
| No traces in Cloud Trace | Verify |
| Prompt-response data not appearing | Check |
| Privacy mode misconfigured | Check |
| BigQuery Analytics not logging | Verify plugin is configured in |
| Third-party integration not capturing spans | Check provider-specific env vars (API keys, endpoints); some providers (AgentOps) replace native telemetry |
| Traces missing tool spans | Tool execution spans appear under |
| High telemetry costs | Switch to |
Deep Dive: ADK Docs (WebFetch URLs)
For detailed documentation beyond what this skill covers, fetch these pages:
| Topic | URL |
|---|---|
| Observability overview | |
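| Agent activity logging | https://google.github.io/adk-docs/observability/logging/index.md |
| Cloud Trace integration | https://google.github.io/adk-docs/integrations/cloud-trace/index.md |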
| BigQuery Agent Analytics | |
| AgentOps | |
| Arize AX | |
| Phoenix (Arize) | |
| MLflow tracing | |
| Monocle | |
| W&B Weave | |
| Freeplay | |