google-agents-cli-observability

ADK Observability Guide

Cloud Trace works out of the box; no infrastructure is needed. Prompt-response logging and BigQuery Agent Analytics require Terraform-provisioned infrastructure (service account, GCS bucket, BigQuery dataset). Run `agents-cli infra single-project --project PROJECT_ID` to provision these resources. See `references/cloud-trace-and-logging.md` for details, env vars, and verification commands. If your project isn't scaffolded yet, see `/google-agents-cli-scaffold` first.

Order of operations for `agent_runtime` deployments

For `deployment_target = agent_runtime`, run `agents-cli infra single-project` before the first `agents-cli deploy`. The Terraform module owns the entire Reasoning Engine resource (display_name, service account, deployment spec, env vars), so applying it after an SDK-based deploy creates a state mismatch: Terraform has no record of the SDK-deployed instance and cannot layer env vars onto it without taking ownership of the whole resource.

If you have already run `agents-cli deploy`, you have two options:

1. Switch to Terraform-managed. Delete the SDK-deployed Reasoning Engine, then run `agents-cli infra single-project` followed by `agents-cli deploy`. Sessions and any in-flight state on the previous instance are lost.
2. Keep the SDK-deployed instance. Skip `infra single-project` and set the observability env vars on the running instance directly via the `vertexai` client `update` API. You will also need to grant the instance's service account the IAM permissions required to emit telemetry: writing to the logs GCS bucket, BigQuery dataset access, log writer, etc. See `deployment/terraform/single-project/iam.tf` and `telemetry.tf` in your scaffolded project for the full set of bindings the Terraform module would otherwise provision. Terraform-managed env vars are not available in this mode.
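
For the second option (keeping the SDK-deployed instance), the following is a minimal sketch of how the env vars might be assembled and pushed. It rests on assumptions that should be checked against your installed SDK version: that the client-based Vertex AI SDK exposes `vertexai.Client(...).agent_engines.update(name=..., config={"env_vars": ...})`, and that the three variable names mentioned elsewhere in this guide are the ones you need (the authoritative list is in `references/cloud-trace-and-logging.md`). The helper names `build_observability_env_vars` and `apply_env_vars` are hypothetical.

```python
ALLOWED_CAPTURE_VALUES = ("NO_CONTENT", "true", "false")


def build_observability_env_vars(logs_bucket, capture_mode="NO_CONTENT",
                                 bq_dataset_id=None):
    """Assemble the observability env vars named in this guide.

    Hypothetical helper: the variable names come from this guide, but the
    complete set your project needs may be larger.
    """
    if capture_mode not in ALLOWED_CAPTURE_VALUES:
        raise ValueError(f"capture_mode must be one of {ALLOWED_CAPTURE_VALUES}")
    env_vars = {
        "LOGS_BUCKET_NAME": logs_bucket,
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT": capture_mode,
    }
    if bq_dataset_id:
        env_vars["BQ_ANALYTICS_DATASET_ID"] = bq_dataset_id
    return env_vars


def apply_env_vars(project, location, resource_name, env_vars):
    """Push env vars to a running instance.

    Assumption: the client-based SDK offers agent_engines.update with a
    config mapping that accepts "env_vars". Verify before relying on it.
    """
    import vertexai  # imported lazily so the builder works without the SDK

    client = vertexai.Client(project=project, location=location)
    return client.agent_engines.update(
        name=resource_name,
        config={"env_vars": env_vars},
    )
```

Keeping the dictionary builder separate from the network call makes it possible to review the exact values before touching the running instance. The IAM bindings still have to be granted separately.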

Reference Files

| File | Contents |
| --- | --- |
| `references/cloud-trace-and-logging.md` | Scaffolded project details: Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| `references/bigquery-agent-analytics.md` | BQ Agent Analytics plugin: enabling, key features, GCS offloading, tool provenance |

Observability Tiers

Choose the right level of observability based on your needs:

| Tier | What It Does | Scope | Default State | Best For |
| --- | --- | --- | --- | --- |
| Cloud Trace | Distributed tracing: execution flow, latency, errors via OpenTelemetry spans | All templates, all environments | Always enabled | Debugging latency, understanding agent execution flow |
| Prompt-Response Logging | GenAI interactions exported to GCS, BigQuery, and Cloud Logging | ADK agents only | Disabled locally, enabled when deployed | Auditing LLM interactions, compliance |
| BigQuery Agent Analytics | Structured agent events (LLM calls, tool use, outcomes) to BigQuery | ADK agents with plugin enabled | Opt-in (`--bq-analytics` at scaffold time) | Conversational analytics, custom dashboards, LLM-as-judge evals |
| Third-Party Integrations | External observability platforms (AgentOps, Phoenix, MLflow, etc.) | Any ADK agent | Opt-in, per-provider setup | Team collaboration, specialized visualization, prompt management |

Ask the user which tier(s) they need — they can be combined. Cloud Trace is always on; the others are additive.

Cloud Trace

ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.

Span Hierarchy

invocation
  └── agent_run (one per agent in the chain)
        ├── call_llm (model request/response)
        └── execute_tool (tool execution)
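
To make the nesting above concrete, here is a small stand-in model of the hierarchy in plain Python. It is not ADK's tracer and does not use OpenTelemetry; `SpanRecorder`, `trace_invocation`, the agent name `root_agent`, and the bracketed label format are all hypothetical, chosen only to show which span is the parent of which.

```python
from contextlib import contextmanager


class SpanRecorder:
    """Toy recorder that captures span names with their nesting depth."""

    def __init__(self):
        self.lines = []
        self._depth = 0

    @contextmanager
    def span(self, name):
        # Record the span at the current depth, then nest children one level deeper.
        self.lines.append("  " * self._depth + name)
        self._depth += 1
        try:
            yield
        finally:
            self._depth -= 1


def trace_invocation(recorder, agents):
    """Emit one invocation span with an agent_run span per agent in the chain."""
    with recorder.span("invocation"):
        for agent in agents:
            with recorder.span(f"agent_run [{agent}]"):
                with recorder.span("call_llm"):
                    pass  # model request/response would happen here
                with recorder.span("execute_tool"):
                    pass  # tool execution would happen here


recorder = SpanRecorder()
trace_invocation(recorder, ["root_agent"])
print("\n".join(recorder.lines))
```

In a real trace, `call_llm` and `execute_tool` may repeat several times under one `agent_run`, depending on how many model turns and tool calls the agent makes.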

Setup by Deployment Type

| Deployment | Setup |
| --- | --- |
| Agent Runtime | Automatic; traces are exported to Cloud Trace by default |
| Cloud Run (scaffolded) | Automatic; `otel_to_cloud=True` in the FastAPI app |
| GKE (scaffolded) | Automatic; `otel_to_cloud=True` in the FastAPI app |
| Cloud Run / GKE (manual) | Configure OpenTelemetry exporter in your app |
| Local dev | Works with `agents-cli playground`; traces visible in Cloud Console |

View traces: Cloud Console → Trace → Trace explorer

For detailed setup instructions (Agent Runtime CLI/SDK, Cloud Run, custom deployments), fetch `https://adk.dev/integrations/cloud-trace/index.md`.

Prompt-Response Logging

Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL) and BigQuery (via direct log sinks and external tables). Privacy-preserving by default — only metadata is logged unless explicitly configured otherwise.

Key env var: `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT`. Set it to `NO_CONTENT` (metadata only, default in deployed envs), `true` (full content), or `false` (disabled). Logging is disabled locally unless `LOGS_BUCKET_NAME` is set.
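
The interaction between the two variables can be sketched as a small resolver. This is an illustrative reading of the rules above, not ADK's actual implementation; `CaptureMode` and `resolve_capture_mode` are hypothetical names, and two behaviors are assumptions: an unset capture variable falls back to `NO_CONTENT`, and an unrecognized value is treated as metadata-only.

```python
import os
from enum import Enum


class CaptureMode(Enum):
    METADATA_ONLY = "NO_CONTENT"  # model name, tokens, timing only
    FULL_CONTENT = "true"         # prompts and responses included
    DISABLED = "false"            # no prompt-response logging


def resolve_capture_mode(env=None):
    """Return the effective capture mode for an environment mapping."""
    env = os.environ if env is None else env

    # Without a logs bucket there is nowhere to export to, so logging is off.
    if not env.get("LOGS_BUCKET_NAME"):
        return CaptureMode.DISABLED

    # Assumption: an unset variable behaves like the deployed default.
    raw = env.get("OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT", "NO_CONTENT")
    value = raw.strip()
    if value.lower() == "true":
        return CaptureMode.FULL_CONTENT
    if value.lower() == "false":
        return CaptureMode.DISABLED
    # Assumption: anything else is treated as the privacy-preserving mode.
    return CaptureMode.METADATA_ONLY


print(resolve_capture_mode({"LOGS_BUCKET_NAME": "my-logs-bucket"}))
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the function easy to exercise in tests.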

For scaffolded project details (Terraform resources, env vars, privacy modes, enabling/disabling, verification commands), see `references/cloud-trace-and-logging.md`.

For ADK logging docs (log levels, configuration, debugging), fetch `https://adk.dev/observability/logging/index.md`.

BigQuery Agent Analytics Plugin

Optional plugin that logs structured agent events to BigQuery. Enable with `--bq-analytics` at scaffold time. See `references/bigquery-agent-analytics.md` for details.

Third-Party Integrations

ADK supports several third-party observability platforms. Each uses OpenTelemetry or custom instrumentation to capture agent behavior.

| Platform | Key Differentiator | Setup Complexity | Self-Hosted Option |
| --- | --- | --- | --- |
| AgentOps | Session replays, 2-line setup, replaces native telemetry | Minimal | No (SaaS) |
| Arize AX | Commercial platform, production monitoring, evaluation dashboards | Low | No (SaaS) |
| Phoenix | Open-source, custom evaluators, experiment testing | Low | Yes |
| MLflow | OTel traces to MLflow Tracking Server, span tree visualization | Medium (needs SQL backend) | Yes |
| Monocle | 1-call setup, VS Code Gantt chart visualizer | Minimal | Yes (local files) |
| Weave | W&B platform, team collaboration, timeline views | Low | No (SaaS) |
| Freeplay | Prompt management + evals + observability in one platform | Low | No (SaaS) |

Ask the user which platform they prefer — present the trade-offs and let them choose. For setup details, fetch the relevant ADK docs page from the Deep Dive table below.

Troubleshooting

| Issue | Solution |
| --- | --- |
| No traces in Cloud Trace | Verify `otel_to_cloud=True` in FastAPI app; check service account has `cloudtrace.agent` role |
| Prompt-response data not appearing | Check `LOGS_BUCKET_NAME` is set; verify SA has `storage.objectCreator` on the bucket; check app logs for telemetry setup warnings |
| Privacy mode misconfigured | Check `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` value: use `NO_CONTENT` for metadata-only, `false` to disable |
| BigQuery Analytics not logging | Verify plugin is configured in `app/agent.py`; check `BQ_ANALYTICS_DATASET_ID` env var is set |
| Third-party integration not capturing spans | Check provider-specific env vars (API keys, endpoints); some providers (AgentOps) replace native telemetry |
| Traces missing tool spans | Tool execution spans appear under `execute_tool`; check trace explorer filters |
| High telemetry costs | Switch to `NO_CONTENT` mode; reduce BigQuery retention; disable unused tiers |
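
Several rows above come down to a missing or mistyped environment variable. A preflight check along these lines could surface them before digging through logs. It is only a sketch: `check_observability_env` and its `expect_bq_analytics` flag are hypothetical, and it covers just the three variables named in this guide, not IAM roles or provider-specific settings.

```python
import os

VALID_CAPTURE_VALUES = {"NO_CONTENT", "true", "false"}
CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def check_observability_env(env=None, expect_bq_analytics=False):
    """Return a list of human-readable findings; an empty list means no issues found."""
    env = os.environ if env is None else env
    findings = []

    if not env.get("LOGS_BUCKET_NAME"):
        findings.append("LOGS_BUCKET_NAME is not set; prompt-response logging stays disabled")

    capture = env.get(CAPTURE_VAR)
    if capture is not None and capture not in VALID_CAPTURE_VALUES:
        findings.append(
            f"{CAPTURE_VAR}={capture!r} is not one of {sorted(VALID_CAPTURE_VALUES)}"
        )

    # Only relevant when the project was scaffolded with --bq-analytics.
    if expect_bq_analytics and not env.get("BQ_ANALYTICS_DATASET_ID"):
        findings.append("BQ_ANALYTICS_DATASET_ID is not set; the BigQuery plugin has no target dataset")

    return findings


for finding in check_observability_env(expect_bq_analytics=True):
    print("WARN:", finding)
```

An empty result does not prove telemetry is flowing; role bindings such as `cloudtrace.agent` and `storage.objectCreator` still need to be verified separately.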

Deep Dive: ADK Docs (WebFetch URLs)

For detailed documentation beyond what this skill covers, fetch these pages:

| Topic | URL |
| --- | --- |
| Observability overview | `https://adk.dev/observability/index.md` |
| Agent activity logging | `https://adk.dev/observability/logging/index.md` |
| Cloud Trace integration | `https://adk.dev/integrations/cloud-trace/index.md` |
| BigQuery Agent Analytics | `https://adk.dev/integrations/bigquery-agent-analytics/index.md` |
| AgentOps | `https://adk.dev/integrations/agentops/index.md` |
| Arize AX | `https://adk.dev/integrations/arize-ax/index.md` |
| Phoenix (Arize) | `https://adk.dev/integrations/phoenix/index.md` |
| MLflow tracing | `https://adk.dev/integrations/mlflow-tracing/index.md` |
| Monocle | `https://adk.dev/integrations/monocle/index.md` |
| W&B Weave | `https://adk.dev/integrations/weave/index.md` |
| Freeplay | `https://adk.dev/integrations/freeplay/index.md` |
