adk-observability-guide

ADK Observability Guide

Scaffolded project? Cloud Trace and prompt-response logging are pre-configured by Terraform. See references/cloud-trace-and-logging.md for infrastructure details, env vars, and verification commands.
No scaffold? Follow the ADK docs links below for manual setup. For production infrastructure, scaffold with /adk-scaffold.

Reference Files

| File | Contents |
|---|---|
| references/cloud-trace-and-logging.md | Scaffolded project details: Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| references/third-party.md | Third-party integration setup patterns, trade-offs, and ADK docs links for each provider |

Observability Tiers

Choose the right level of observability based on your needs:

| Tier | What It Does | Scope | Default State | Best For |
|---|---|---|---|---|
| Cloud Trace | Distributed tracing: execution flow, latency, errors via OpenTelemetry spans | All templates, all environments | Always enabled | Debugging latency, understanding agent execution flow |
| Prompt-Response Logging | GenAI interactions exported to GCS, BigQuery, and Cloud Logging | ADK agents only | Disabled locally, enabled when deployed | Auditing LLM interactions, compliance |
| BigQuery Agent Analytics | Structured agent events (LLM calls, tool use, outcomes) to BigQuery | ADK agents with plugin enabled | Opt-in (--bq-analytics at scaffold time) | Conversational analytics, custom dashboards, LLM-as-judge evals |
| Third-Party Integrations | External observability platforms (AgentOps, Phoenix, MLflow, etc.) | Any ADK agent | Opt-in, per-provider setup | Team collaboration, specialized visualization, prompt management |

Ask the user which tier(s) they need; tiers can be combined. Cloud Trace is always on; the others are additive.

Cloud Trace

ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.

Span Hierarchy

invocation
  └── agent_run (one per agent in the chain)
        ├── call_llm (model request/response)
        └── execute_tool (tool execution)
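The hierarchy above can be pictured as a small tree. The sketch below is only an illustration using the standard library: the Span class, its fields, and the durations are hypothetical and are not ADK or OpenTelemetry types. It shows how the nesting lets you pick out tool spans or total model time for one invocation.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical in-memory span record (not an ADK or OpenTelemetry type)."""
    name: str
    duration_ms: float
    children: list["Span"] = field(default_factory=list)


def find_spans(root: Span, name: str) -> list[Span]:
    """Depth-first search for every span with the given name."""
    matches = [root] if root.name == name else []
    for child in root.children:
        matches.extend(find_spans(child, name))
    return matches


# One invocation with a single agent that calls the model twice and a tool once.
# Durations are made-up illustration values.
trace = Span("invocation", 900.0, [
    Span("agent_run", 880.0, [
        Span("call_llm", 400.0),
        Span("execute_tool", 120.0),
        Span("call_llm", 350.0),
    ]),
])

tool_spans = find_spans(trace, "execute_tool")
llm_time_ms = sum(s.duration_ms for s in find_spans(trace, "call_llm"))
```

In a real trace the same questions are answered by filtering on span name in the Trace explorer rather than by walking a tree in code.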

Setup by Deployment Type

| Deployment | Setup |
|---|---|
| Agent Engine | Automatic: traces are exported to Cloud Trace by default |
| Cloud Run (scaffolded) | Automatic: otel_to_cloud=True in the FastAPI app |
| Cloud Run (manual) | Configure OpenTelemetry exporter in your app |
| Local dev | Works with make playground; traces visible in Cloud Console |

View traces: Cloud Console → Trace → Trace explorer
For detailed setup instructions (Agent Engine CLI/SDK, Cloud Run, custom deployments), fetch the ADK docs:
  • WebFetch: https://google.github.io/adk-docs/integrations/cloud-trace/index.md

Prompt-Response Logging

Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL), BigQuery (external tables), and Cloud Logging (dedicated bucket).

Privacy Modes

Prompt-response logging is privacy-preserving by default: only metadata is logged. Controlled by OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT:

| Value | Behavior |
|---|---|
| false | Logging disabled |
| NO_CONTENT | Enabled, metadata only: tokens, model name, timing (default in deployed environments) |
| true | Enabled with full prompt/response content (not recommended for production) |

For Agent Engine: the platform requires true during deployment, but the app overrides to NO_CONTENT at runtime.
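
As a rough illustration of the three values, the sketch below maps the environment variable onto an enum. CaptureMode and resolve_capture_mode are hypothetical names, not part of ADK or OpenTelemetry, and treating an unset variable as disabled is an assumption made for this example only.

```python
import os
from enum import Enum

ENV_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


class CaptureMode(Enum):
    """Hypothetical enum mirroring the three documented values."""
    DISABLED = "false"
    METADATA_ONLY = "NO_CONTENT"
    FULL_CONTENT = "true"


def resolve_capture_mode(env=None) -> CaptureMode:
    """Read the capture mode from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    if raw is None:
        # Assumption for this sketch: an unset variable means logging is off.
        return CaptureMode.DISABLED
    try:
        return CaptureMode(raw.strip())
    except ValueError:
        raise ValueError(f"Unsupported {ENV_VAR} value: {raw!r}") from None


def logs_prompt_content(mode: CaptureMode) -> bool:
    """Only the 'true' mode records full prompt/response content."""
    return mode is CaptureMode.FULL_CONTENT


deployed_mode = resolve_capture_mode({ENV_VAR: "NO_CONTENT"})
```

Failing loudly on an unrecognized value is a design choice for the sketch; it makes a typo such as "no_content" visible instead of silently changing what gets logged.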

Behavior by Environment

| Environment | Prompt-Response Logging | Why |
|---|---|---|
| Local dev (make playground) | Disabled | No LOGS_BUCKET_NAME set |
| Dev (Terraform deployed) | Enabled (NO_CONTENT) | Terraform sets env vars |
| Staging / Production | Enabled (NO_CONTENT) | Terraform sets env vars |

To enable locally, set LOGS_BUCKET_NAME and OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=NO_CONTENT before running make playground.
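
One way to apply the local settings without touching your shell profile is to build the environment in a small script. This is a minimal sketch: playground_env is a hypothetical helper, and "my-logs-bucket" is a placeholder (check references/cloud-trace-and-logging.md for the value format your project expects).

```python
import os


def playground_env(bucket_name: str, base_env=None) -> dict:
    """Return a copy of the environment with local prompt-response logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["LOGS_BUCKET_NAME"] = bucket_name
    # Metadata-only mode, matching the deployed default described above.
    env["OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"] = "NO_CONTENT"
    return env


# Placeholder bucket name and a tiny base environment for illustration.
env = playground_env("my-logs-bucket", base_env={"PATH": "/usr/bin"})
```

To launch the playground with these settings, pass the result to a subprocess, for example subprocess.run(["make", "playground"], env=playground_env("my-logs-bucket"), check=True).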
To disable in a deployed environment, set OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=false in deployment/terraform/service.tf and re-apply.
For scaffolded project infrastructure details (Terraform resources, env vars, verification), see references/cloud-trace-and-logging.md.
For ADK logging docs (log levels, configuration, debugging):
  • WebFetch: https://google.github.io/adk-docs/observability/logging/index.md

BigQuery Agent Analytics Plugin

An optional plugin that logs structured agent events directly to BigQuery via the Storage Write API. Enables:
  • Conversational analytics — session flows, user interaction patterns
  • LLM-as-judge evals — structured data for evaluation pipelines
  • Custom dashboards — Looker Studio integration
  • Tool provenance tracking — LOCAL, MCP, SUB_AGENT, A2A, TRANSFER_AGENT

Enabling

| Method | How |
|---|---|
| At scaffold time | uvx agent-starter-pack create . --bq-analytics |
| Post-scaffold | Add the plugin manually to app/agent.py (see ADK docs) |

Infrastructure (BigQuery dataset, GCS offloading) is provisioned automatically by Terraform when enabled at scaffold time.

Key Features

  • Auto-schema upgrade (new fields added without migration)
  • GCS offloading for multimodal content (images, audio)
  • Distributed tracing via OpenTelemetry span context
  • SQL-queryable event log for all agent interactions
For full schema, SQL query examples, and Looker Studio setup:
  • WebFetch: https://google.github.io/adk-docs/integrations/bigquery-agent-analytics/index.md

Third-Party Integrations

ADK supports six third-party observability platforms. Each uses OpenTelemetry or custom instrumentation to capture agent behavior.

| Platform | Key Differentiator | Setup Complexity | Self-Hosted Option |
|---|---|---|---|
| AgentOps | Session replays, 2-line setup, replaces native telemetry | Minimal | No (SaaS) |
| Phoenix | Open-source, custom evaluators, experiment testing | Low | Yes |
| MLflow | OTel traces to MLflow Tracking Server, span tree visualization | Medium (needs SQL backend) | Yes |
| Monocle | 1-call setup, VS Code Gantt chart visualizer | Minimal | Yes (local files) |
| Weave | W&B platform, team collaboration, timeline views | Low | No (SaaS) |
| Freeplay | Prompt management + evals + observability in one platform | Low | No (SaaS) |

Ask the user which platform they prefer: present the trade-offs and let them choose. For setup details on each, see references/third-party.md.

Troubleshooting

| Issue | Solution |
|---|---|
| No traces in Cloud Trace | Verify otel_to_cloud=True in FastAPI app; check service account has cloudtrace.agent role |
| Prompt-response data not appearing | Check LOGS_BUCKET_NAME is set; verify SA has storage.objectCreator on the bucket; check app logs for telemetry setup warnings |
| Privacy mode misconfigured | Check OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT value: use NO_CONTENT for metadata-only, false to disable |
| BigQuery Analytics not logging | Verify plugin is configured in app/agent.py; check BQ_ANALYTICS_DATASET_ID env var is set |
| Third-party integration not capturing spans | Check provider-specific env vars (API keys, endpoints); some providers (AgentOps) replace native telemetry |
| Traces missing tool spans | Tool execution spans appear under execute_tool; check trace explorer filters |
| High telemetry costs | Switch to NO_CONTENT mode; reduce BigQuery retention; disable unused tiers |
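
The environment-variable checks in the table can be scripted as a quick preflight. The sketch below is hypothetical (diagnose is not an ADK function) and covers only the variables named above; IAM roles, plugin wiring in app/agent.py, and provider-specific settings still need to be checked by hand.

```python
CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"
VALID_MODES = ("false", "NO_CONTENT", "true")


def diagnose(env, bq_analytics_enabled=False):
    """Return human-readable findings for the env-var checks in the table."""
    findings = []
    if not env.get("LOGS_BUCKET_NAME"):
        findings.append("LOGS_BUCKET_NAME is not set")
    mode = env.get(CAPTURE_VAR)
    if mode not in VALID_MODES:
        findings.append(f"{CAPTURE_VAR} is {mode!r}; expected one of {VALID_MODES}")
    elif mode == "true":
        findings.append(f"{CAPTURE_VAR}=true captures full content; prefer NO_CONTENT in production")
    if bq_analytics_enabled and not env.get("BQ_ANALYTICS_DATASET_ID"):
        findings.append("BQ_ANALYTICS_DATASET_ID is not set")
    return findings


# Placeholder values for illustration only.
healthy = diagnose({"LOGS_BUCKET_NAME": "my-logs-bucket", CAPTURE_VAR: "NO_CONTENT"})
broken = diagnose({CAPTURE_VAR: "yes"}, bq_analytics_enabled=True)
```

Calling diagnose(os.environ) after import os inside the running service would report on the live configuration; an empty list means none of these checks fired.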

Deep Dive: ADK Docs (WebFetch URLs)

For detailed documentation beyond what this skill covers, fetch these pages:

| Topic | URL |
|---|---|
| Observability overview | https://google.github.io/adk-docs/observability/index.md |
| Agent activity logging | https://google.github.io/adk-docs/observability/logging/index.md |
| Cloud Trace integration | https://google.github.io/adk-docs/integrations/cloud-trace/index.md |
| BigQuery Agent Analytics | https://google.github.io/adk-docs/integrations/bigquery-agent-analytics/index.md |
| AgentOps | https://google.github.io/adk-docs/integrations/agentops/index.md |
| Phoenix (Arize) | https://google.github.io/adk-docs/integrations/phoenix/index.md |
| MLflow tracing | https://google.github.io/adk-docs/integrations/mlflow/index.md |
| Monocle | https://google.github.io/adk-docs/integrations/monocle/index.md |
| W&B Weave | https://google.github.io/adk-docs/integrations/weave/index.md |
| Freeplay | https://google.github.io/adk-docs/integrations/freeplay/index.md |