google-agents-cli-observability
This skill should be used when the user wants to "set up tracing", "monitor my ADK agent", "configure logging", "add observability", "debug production traffic", or needs guidance on monitoring deployed ADK (Agent Development Kit) agents. Covers Cloud Trace, prompt-response logging, BigQuery Agent Analytics, third-party integrations (AgentOps, Phoenix, MLflow, etc.), and troubleshooting. Part of the Google ADK (Agent Development Kit) skills suite. Do NOT use for deployment setup (use google-agents-cli-deploy) or API code patterns (use google-agents-cli-adk-code).
Source: google/agents-cli
NPX Install: `npx skill4agent add google/agents-cli google-agents-cli-observability`
ADK Observability Guide
Cloud Trace works out of the box, with no infrastructure needed. Prompt-response logging and BigQuery Agent Analytics require Terraform-provisioned infrastructure (service account, GCS bucket, BigQuery dataset). Run `agents-cli infra single-project --project PROJECT_ID` to provision these resources. See `references/cloud-trace-and-logging.md` for details, env vars, and verification commands. If your project isn't scaffolded yet, see `/google-agents-cli-scaffold` first.
Order of operations for `agent_runtime` deployments
For `deployment_target = agent_runtime`, run `agents-cli infra single-project` before the first `agents-cli deploy`. The Terraform module owns the entire Reasoning Engine resource (display_name, service account, deployment spec, env vars), so applying it after an SDK-based deploy creates a state mismatch: Terraform has no record of the SDK-deployed instance and cannot layer env vars onto it without taking ownership of the whole resource.
If you have already run `agents-cli deploy`, you have two options:
- Switch to Terraform-managed. Delete the SDK-deployed Reasoning Engine, then run `agents-cli infra single-project` followed by `agents-cli deploy`. Sessions and any in-flight state on the previous instance are lost.
- Keep the SDK-deployed instance. Skip `infra single-project` and set the observability env vars on the running instance directly via the `vertexai` client `update` API. You will also need to grant the instance's service account the IAM permissions required to emit telemetry, such as writing to the logs GCS bucket, BigQuery dataset access, and log writer. See `deployment/terraform/single-project/iam.tf` and `telemetry.tf` in your scaffolded project for the full set of bindings the Terraform module would otherwise provision. Terraform-managed env vars are not available in this mode.
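The ordering rules above can be summarized as a small decision helper. This is only a sketch: `plan_steps` and the "manual:" step labels are hypothetical names introduced here for illustration, and the only real commands are the two `agents-cli` invocations quoted in this guide.

```python
from typing import List

# The two commands named in this guide; PROJECT_ID is a placeholder.
INFRA_CMD = "agents-cli infra single-project --project PROJECT_ID"
DEPLOY_CMD = "agents-cli deploy"


def plan_steps(already_deployed: bool, keep_sdk_instance: bool = False) -> List[str]:
    """Return the ordered steps for an agent_runtime project (hypothetical helper)."""
    if not already_deployed:
        # Fresh project: Terraform creates the Reasoning Engine before the first deploy.
        return [INFRA_CMD, DEPLOY_CMD]
    if keep_sdk_instance:
        # Terraform is skipped entirely; env vars and IAM are configured by hand.
        return [
            "manual: set observability env vars via the vertexai client update API",
            "manual: grant the IAM bindings listed in iam.tf and telemetry.tf",
        ]
    # Switching to Terraform-managed: the old instance and its sessions are lost.
    return ["manual: delete the SDK-deployed Reasoning Engine", INFRA_CMD, DEPLOY_CMD]


if __name__ == "__main__":
    for step in plan_steps(already_deployed=True):
        print(step)
```

Steps prefixed with "manual:" are descriptions rather than commands, because this guide does not give exact invocations for them.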
Reference Files
| File | Contents |
|---|---|
| `references/cloud-trace-and-logging.md` | Scaffolded project details: Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| `references/bigquery-agent-analytics.md` | BQ Agent Analytics plugin: enabling, key features, GCS offloading, tool provenance |
Observability Tiers
Choose the right level of observability based on your needs:
| Tier | What It Does | Scope | Default State | Best For |
|---|---|---|---|---|
| Cloud Trace | Distributed tracing — execution flow, latency, errors via OpenTelemetry spans | All templates, all environments | Always enabled | Debugging latency, understanding agent execution flow |
| Prompt-Response Logging | GenAI interactions exported to GCS, BigQuery, and Cloud Logging | ADK agents only | Disabled locally, enabled when deployed | Auditing LLM interactions, compliance |
| BigQuery Agent Analytics | Structured agent events (LLM calls, tool use, outcomes) to BigQuery | ADK agents with plugin enabled | Opt-in (`--bq-analytics` at scaffold time) | Conversational analytics, custom dashboards, LLM-as-judge evals |
| Third-Party Integrations | External observability platforms (AgentOps, Phoenix, MLflow, etc.) | Any ADK agent | Opt-in, per-provider setup | Team collaboration, specialized visualization, prompt management |
Ask the user which tier(s) they need — they can be combined. Cloud Trace is always on; the others are additive.
Cloud Trace
ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.
Span Hierarchy
invocation
└── agent_run (one per agent in the chain)
    ├── call_llm (model request/response)
    └── execute_tool (tool execution)

Setup by Deployment Type
| Deployment | Setup |
|---|---|
| Agent Runtime | Automatic — traces are exported to Cloud Trace by default |
| Cloud Run (scaffolded) | Automatic — |
| GKE (scaffolded) | Automatic — |
| Cloud Run / GKE (manual) | Configure OpenTelemetry exporter in your app |
| Local dev | Works with |
View traces: Cloud Console → Trace → Trace explorer
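When reading a trace for latency, it helps to separate each span's own time from the time spent in its children. The sketch below models the span hierarchy as plain data and computes that breakdown. It is an illustration only: the `Span` class, the helper functions, and the durations are made up, none of this is the ADK or OpenTelemetry API, and the arithmetic assumes child spans run one after another without overlapping.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """Toy stand-in for a trace span (not an OpenTelemetry type)."""
    name: str
    duration_ms: float
    children: List["Span"] = field(default_factory=list)


def self_time_ms(span: Span) -> float:
    """Time spent in this span outside of its direct children."""
    return span.duration_ms - sum(child.duration_ms for child in span.children)


def self_times(root: Span) -> Dict[str, float]:
    """Sum self time per span name across the whole trace."""
    totals: Dict[str, float] = {}
    stack = [root]
    while stack:
        span = stack.pop()
        totals[span.name] = totals.get(span.name, 0.0) + self_time_ms(span)
        stack.extend(span.children)
    return totals


# Made-up durations for one invocation with a single agent.
trace = Span("invocation", 1200.0, [
    Span("agent_run", 1150.0, [
        Span("call_llm", 900.0),
        Span("execute_tool", 200.0),
    ]),
])

if __name__ == "__main__":
    # List span names from slowest to fastest by self time.
    for name, ms in sorted(self_times(trace).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {ms:.0f} ms")
```

In this invented trace the model call dominates, which is the kind of conclusion the waterfall view in Trace explorer lets you draw visually.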
For detailed setup instructions (Agent Runtime CLI/SDK, Cloud Run, custom deployments), fetch `https://adk.dev/integrations/cloud-trace/index.md`.
Prompt-Response Logging
Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL) and BigQuery (via direct log sinks and external tables). Privacy-preserving by default — only metadata is logged unless explicitly configured otherwise.
Key env var: `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT`. Set it to `NO_CONTENT` (metadata only, default in deployed envs), `true` (full content), or `false` (disabled). Logging is disabled locally unless `LOGS_BUCKET_NAME` is set.
For scaffolded project details (Terraform resources, env vars, privacy modes, enabling/disabling, verification commands), see `references/cloud-trace-and-logging.md`.
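The privacy modes described above can be modeled as a small resolver, which is handy for sanity-checking an environment before deploying. This is a sketch of the documented semantics, not ADK's actual implementation: `resolve_logging_mode` and its return labels are hypothetical, and three behaviors are assumptions: an unset capture variable falls back to `NO_CONTENT`, values are matched case-insensitively, and deployed environments always have `LOGS_BUCKET_NAME` set.

```python
import os
from typing import Mapping

CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"
BUCKET_VAR = "LOGS_BUCKET_NAME"


def resolve_logging_mode(env: Mapping[str, str]) -> str:
    """Map env vars to 'disabled', 'metadata_only', or 'full_content' (hypothetical helper)."""
    if not env.get(BUCKET_VAR):
        # No bucket configured: treated as logging off, matching the local default.
        return "disabled"
    # Assumption: an unset variable behaves like the deployed default, NO_CONTENT.
    raw = env.get(CAPTURE_VAR, "NO_CONTENT").strip().lower()
    if raw == "no_content":
        return "metadata_only"
    if raw == "true":
        return "full_content"
    if raw == "false":
        return "disabled"
    # Assumption: anything else is a misconfiguration worth surfacing loudly.
    raise ValueError(f"unrecognized {CAPTURE_VAR} value: {raw!r}")


if __name__ == "__main__":
    print(resolve_logging_mode(os.environ))
```

Raising on unknown values is a design choice for this sketch; a real deployment might prefer to fall back to metadata-only logging instead.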
For ADK logging docs (log levels, configuration, debugging), fetch `https://adk.dev/observability/logging/index.md`.
BigQuery Agent Analytics Plugin
Optional plugin that logs structured agent events to BigQuery. Enable with `--bq-analytics` at scaffold time. See `references/bigquery-agent-analytics.md` for details.
Third-Party Integrations
ADK supports several third-party observability platforms. Each uses OpenTelemetry or custom instrumentation to capture agent behavior.
| Platform | Key Differentiator | Setup Complexity | Self-Hosted Option |
|---|---|---|---|
| AgentOps | Session replays, 2-line setup, replaces native telemetry | Minimal | No (SaaS) |
| Arize AX | Commercial platform, production monitoring, evaluation dashboards | Low | No (SaaS) |
| Phoenix | Open-source, custom evaluators, experiment testing | Low | Yes |
| MLflow | OTel traces to MLflow Tracking Server, span tree visualization | Medium (needs SQL backend) | Yes |
| Monocle | 1-call setup, VS Code Gantt chart visualizer | Minimal | Yes (local files) |
| Weave | W&B platform, team collaboration, timeline views | Low | No (SaaS) |
| Freeplay | Prompt management + evals + observability in one platform | Low | No (SaaS) |
Ask the user which platform they prefer — present the trade-offs and let them choose. For setup details, fetch the relevant ADK docs page from the Deep Dive table below.
Troubleshooting
| Issue | Solution |
|---|---|
| No traces in Cloud Trace | Verify |
| Prompt-response data not appearing | Check |
| Privacy mode misconfigured | Check |
| BigQuery Analytics not logging | Verify plugin is configured in |
| Third-party integration not capturing spans | Check provider-specific env vars (API keys, endpoints); some providers (AgentOps) replace native telemetry |
| Traces missing tool spans | Tool execution spans appear under |
| High telemetry costs | Switch to |
Deep Dive: ADK Docs (WebFetch URLs)
For detailed documentation beyond what this skill covers, fetch these pages:
| Topic | URL |
|---|---|
| Observability overview | |
| Agent activity logging | `https://adk.dev/observability/logging/index.md` |
| Cloud Trace integration | `https://adk.dev/integrations/cloud-trace/index.md` |
| BigQuery Agent Analytics | |
| AgentOps | |
| Arize AX | |
| Phoenix (Arize) | |
| MLflow tracing | |
| Monocle | |
| W&B Weave | |
| Freeplay | |
Related Skills
- — Deployment targets, CI/CD pipelines, and production workflows
/google-agents-cli-deploy - — Development workflow, coding guidelines, and operational rules
/google-agents-cli-workflow - — ADK Python API quick reference for writing agent code
/google-agents-cli-adk-code