adk-observability-guide

ADK Observability Guide

Scaffolded project? Cloud Trace and prompt-response logging are pre-configured by Terraform. See `references/cloud-trace-and-logging.md` for infrastructure details, env vars, and verification commands.
No scaffold? Follow the ADK docs links below for manual setup. For production infrastructure, scaffold with `/adk-scaffold`.

Reference Files

File	Contents
`references/cloud-trace-and-logging.md`	Scaffolded project details — Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally
`references/third-party.md`	Third-party integration setup patterns, trade-offs, and ADK docs links for each provider

Observability Tiers

Choose the right level of observability based on your needs:

Tier	What It Does	Scope	Default State	Best For
Cloud Trace	Distributed tracing — execution flow, latency, errors via OpenTelemetry spans	All templates, all environments	Always enabled	Debugging latency, understanding agent execution flow
Prompt-Response Logging	GenAI interactions exported to GCS, BigQuery, and Cloud Logging	ADK agents only	Disabled locally, enabled when deployed	Auditing LLM interactions, compliance
BigQuery Agent Analytics	Structured agent events (LLM calls, tool use, outcomes) to BigQuery	ADK agents with plugin enabled	Opt-in (`--bq-analytics` at scaffold time)	Conversational analytics, custom dashboards, LLM-as-judge evals
Third-Party Integrations	External observability platforms (AgentOps, Phoenix, MLflow, etc.)	Any ADK agent	Opt-in, per-provider setup	Team collaboration, specialized visualization, prompt management

Ask the user which tier(s) they need — they can be combined. Cloud Trace is always on; the others are additive.

Cloud Trace

ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.

Span Hierarchy

invocation
  └── agent_run (one per agent in the chain)
        ├── call_llm (model request/response)
        └── execute_tool (tool execution)
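
To make the nesting concrete, here is a minimal sketch that models the hierarchy above with plain Python objects. The `Span` class, `render`, and `count` are hypothetical illustration helpers, not ADK or OpenTelemetry SDK types; only the four span names come from the hierarchy shown above. The example trace assumes a single agent that calls the model, runs one tool, and calls the model again.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """Illustrative stand-in for a trace span; not the OpenTelemetry SDK type."""

    name: str
    children: List["Span"] = field(default_factory=list)


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, parents before children."""
    lines = ["  " * depth + span.name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


def count(span: Span, name: str) -> int:
    """Count spans with the given name anywhere in the tree."""
    return int(span.name == name) + sum(count(c, name) for c in span.children)


# Assumed example: one invocation, one agent, two model calls around one tool call.
trace = Span("invocation", [
    Span("agent_run", [Span("call_llm"), Span("execute_tool"), Span("call_llm")]),
])

print("\n".join(render(trace)))
print("tool spans:", count(trace, "execute_tool"))
```

Counting `execute_tool` nodes this way mirrors what you would look for in Trace explorer when checking whether tool spans were recorded.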

Setup by Deployment Type

Deployment	Setup
Agent Engine	Automatic — traces are exported to Cloud Trace by default
Cloud Run (scaffolded)	Automatic — `otel_to_cloud=True` in the FastAPI app
Cloud Run (manual)	Configure OpenTelemetry exporter in your app
Local dev	Works with `make playground`; traces visible in Cloud Console

View traces: Cloud Console → Trace → Trace explorer

For detailed setup instructions (Agent Engine CLI/SDK, Cloud Run, custom deployments), fetch the ADK docs:

WebFetch: https://google.github.io/adk-docs/integrations/cloud-trace/index.md

Prompt-Response Logging

Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL), BigQuery (external tables), and Cloud Logging (dedicated bucket).

Privacy Modes

Prompt-response logging is privacy-preserving by default: only metadata is logged. Controlled by the `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` environment variable:

Value	Behavior
`false`	Logging disabled
`NO_CONTENT`	Enabled, metadata only — tokens, model name, timing (default in deployed environments)
`true`	Enabled with full prompt/response content (not recommended for production)

For Agent Engine: the platform requires `true` during deployment, but the app overrides it to `NO_CONTENT` at runtime.
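
The three values in the table above can be read as a small state machine. The sketch below is one way to interpret the variable in your own tooling; `CaptureMode`, `capture_mode`, and `logs_message_content` are hypothetical names, and treating an unset or empty variable as disabled is an assumption rather than documented ADK behavior.

```python
from enum import Enum
from typing import Optional

ENV_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


class CaptureMode(Enum):
    """The three values listed in the privacy modes table."""

    DISABLED = "false"
    METADATA_ONLY = "NO_CONTENT"
    FULL_CONTENT = "true"


def capture_mode(raw: Optional[str]) -> CaptureMode:
    """Map the raw variable value to one of the three documented modes.

    Assumption: an unset or empty variable is treated as disabled.
    """
    if raw is None or raw.strip() == "":
        return CaptureMode.DISABLED
    try:
        return CaptureMode(raw.strip())
    except ValueError:
        raise ValueError(f"Unrecognized {ENV_VAR} value: {raw!r}") from None


def logs_message_content(mode: CaptureMode) -> bool:
    """Only the full-content mode records prompt and response text."""
    return mode is CaptureMode.FULL_CONTENT
```

A typical call would be `capture_mode(os.environ.get(ENV_VAR))`, for example in a startup check that refuses to run a production service when `logs_message_content` returns `True`.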

Behavior by Environment

Environment	Prompt-Response Logging	Why
Local dev (`make playground`)	Disabled	No `LOGS_BUCKET_NAME` set
Dev (Terraform deployed)	Enabled (`NO_CONTENT`)	Terraform sets env vars
Staging / Production	Enabled (`NO_CONTENT`)	Terraform sets env vars

To enable locally, set `LOGS_BUCKET_NAME` and `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=NO_CONTENT` before running `make playground`.
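
As a rough sketch of the local setup, the helper below builds an environment with both variables set and hands it to `make playground`. The function names and the bucket name in the usage note are hypothetical, and the exact format `LOGS_BUCKET_NAME` expects (bare name or `gs://` URI) is not specified here, so check `references/cloud-trace-and-logging.md` before relying on it.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional

CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def local_logging_env(
    bucket_name: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with metadata-only logging enabled."""
    if not bucket_name:
        raise ValueError("bucket_name must be non-empty")
    env = dict(os.environ if base_env is None else base_env)
    env["LOGS_BUCKET_NAME"] = bucket_name
    env[CAPTURE_VAR] = "NO_CONTENT"
    return env


def run_playground(env: Mapping[str, str]) -> int:
    """Run `make playground` with the given environment; return its exit code."""
    return subprocess.run(["make", "playground"], env=dict(env)).returncode
```

Usage would look like `run_playground(local_logging_env("example-logs-bucket"))`, run from the project root. Building a copy of the environment keeps the change scoped to the playground process instead of mutating your shell session.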

To disable in a deployed environment, set `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=false` in `deployment/terraform/service.tf` and re-apply.

For scaffolded project infrastructure details (Terraform resources, env vars, verification), see `references/cloud-trace-and-logging.md`.

For ADK logging docs (log levels, configuration, debugging):

WebFetch: https://google.github.io/adk-docs/observability/logging/index.md

BigQuery Agent Analytics Plugin

An optional plugin that logs structured agent events directly to BigQuery via the Storage Write API. Enables:

Conversational analytics — session flows, user interaction patterns
LLM-as-judge evals — structured data for evaluation pipelines
Custom dashboards — Looker Studio integration
Tool provenance tracking — LOCAL, MCP, SUB_AGENT, A2A, TRANSFER_AGENT

Enabling

Method	How
At scaffold time	`uvx agent-starter-pack create . --bq-analytics`
Post-scaffold	Add the plugin manually to `app/agent.py` (see ADK docs)

Infrastructure (BigQuery dataset, GCS offloading) is provisioned automatically by Terraform when enabled at scaffold time.

Key Features

Auto-schema upgrade (new fields added without migration)
GCS offloading for multimodal content (images, audio)
Distributed tracing via OpenTelemetry span context
SQL-queryable event log for all agent interactions

For full schema, SQL query examples, and Looker Studio setup:

WebFetch: https://google.github.io/adk-docs/integrations/bigquery-agent-analytics/index.md

Third-Party Integrations

ADK supports six third-party observability platforms. Each uses OpenTelemetry or custom instrumentation to capture agent behavior.

Platform	Key Differentiator	Setup Complexity	Self-Hosted Option
AgentOps	Session replays, 2-line setup, replaces native telemetry	Minimal	No (SaaS)
Phoenix	Open-source, custom evaluators, experiment testing	Low	Yes
MLflow	OTel traces to MLflow Tracking Server, span tree visualization	Medium (needs SQL backend)	Yes
Monocle	1-call setup, VS Code Gantt chart visualizer	Minimal	Yes (local files)
Weave	W&B platform, team collaboration, timeline views	Low	No (SaaS)
Freeplay	Prompt management + evals + observability in one platform	Low	No (SaaS)

Ask the user which platform they prefer: present the trade-offs and let them choose. For setup details on each, see `references/third-party.md`.

Troubleshooting

Issue	Solution
No traces in Cloud Trace	Verify `otel_to_cloud=True` in FastAPI app; check service account has `cloudtrace.agent` role
Prompt-response data not appearing	Check `LOGS_BUCKET_NAME` is set; verify SA has `storage.objectCreator` on the bucket; check app logs for telemetry setup warnings
Privacy mode misconfigured	Check `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` value — use `NO_CONTENT` for metadata-only, `false` to disable
BigQuery Analytics not logging	Verify plugin is configured in `app/agent.py`; check `BQ_ANALYTICS_DATASET_ID` env var is set
Third-party integration not capturing spans	Check provider-specific env vars (API keys, endpoints); some providers (AgentOps) replace native telemetry
Traces missing tool spans	Tool execution spans appear under `execute_tool` — check trace explorer filters
High telemetry costs	Switch to `NO_CONTENT` mode; reduce BigQuery retention; disable unused tiers
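
Several rows in the table above come down to checking environment variables. The following is a minimal sketch of a preflight check covering only those rows; `diagnose` and its messages are hypothetical, and it does not verify IAM roles, the `otel_to_cloud` setting, or provider-specific variables.

```python
from typing import List, Mapping

CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"
VALID_CAPTURE_VALUES = ("false", "NO_CONTENT", "true")


def diagnose(env: Mapping[str, str], bq_analytics: bool = False) -> List[str]:
    """Return human-readable findings for the env-var rows of the table."""
    findings = []

    if not env.get("LOGS_BUCKET_NAME"):
        findings.append(
            "LOGS_BUCKET_NAME is not set; prompt-response data will not appear."
        )

    mode = env.get(CAPTURE_VAR)
    if mode is None:
        findings.append(f"{CAPTURE_VAR} is not set.")
    elif mode not in VALID_CAPTURE_VALUES:
        findings.append(
            f"{CAPTURE_VAR}={mode!r} is not one of {VALID_CAPTURE_VALUES}."
        )
    elif mode == "true":
        findings.append(
            f"{CAPTURE_VAR}=true captures full prompt/response content; "
            "not recommended for production."
        )

    # Only relevant when the BigQuery Agent Analytics plugin is in use.
    if bq_analytics and not env.get("BQ_ANALYTICS_DATASET_ID"):
        findings.append("BQ_ANALYTICS_DATASET_ID is not set.")

    return findings
```

Calling `diagnose(os.environ, bq_analytics=True)` inside the running service (or against the variables declared in Terraform) gives a quick first pass before digging into service account permissions.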

Deep Dive: ADK Docs (WebFetch URLs)

For detailed documentation beyond what this skill covers, fetch these pages:

Topic	URL
Observability overview	`https://google.github.io/adk-docs/observability/index.md`
Agent activity logging	`https://google.github.io/adk-docs/observability/logging/index.md`
Cloud Trace integration	`https://google.github.io/adk-docs/integrations/cloud-trace/index.md`
BigQuery Agent Analytics	`https://google.github.io/adk-docs/integrations/bigquery-agent-analytics/index.md`
AgentOps	`https://google.github.io/adk-docs/integrations/agentops/index.md`
Phoenix (Arize)	`https://google.github.io/adk-docs/integrations/phoenix/index.md`
MLflow tracing	`https://google.github.io/adk-docs/integrations/mlflow/index.md`
Monocle	`https://google.github.io/adk-docs/integrations/monocle/index.md`
W&B Weave	`https://google.github.io/adk-docs/integrations/weave/index.md`
Freeplay	`https://google.github.io/adk-docs/integrations/freeplay/index.md`
