adk-deploy-guide


MUST READ before deploying any ADK agent. ADK deployment guide — Agent Engine, Cloud Run, GKE, CI/CD pipelines, secrets, observability, and production workflows. Use when deploying agents to Google Cloud or troubleshooting deployments. Do NOT use for API code patterns (use adk-cheatsheet), evaluation (use adk-eval-guide), or project scaffolding (use adk-scaffold).


NPX Install

npx skill4agent add eliasecchig/adk-docs adk-deploy-guide


ADK Deployment Guide

Scaffolded project? Use the `make` commands throughout this guide: they wrap Terraform, Docker, and deployment into a tested pipeline.
No scaffold? See Quick Deploy below, or the ADK deployment docs. For production infrastructure, scaffold with `/adk-scaffold`.

Reference Files

For deeper details, consult these reference files in `references/`:
  • `cloud-run.md`: Scaling defaults, Dockerfile, session types, networking
  • `agent-engine.md`: deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
  • `terraform-patterns.md`: Custom infrastructure, IAM, state management, importing resources
  • `event-driven.md`: Pub/Sub, Eventarc, BigQuery Remote Function triggers via custom `fast_api_app.py` endpoints
Observability: See the adk-observability-guide skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.

Deployment Target Decision Matrix

Choose the right deployment target based on your requirements:
| Criteria | Agent Engine | Cloud Run | GKE |
|---|---|---|---|
| Languages | Python | Python | Python (+ others via custom containers) |
| Scaling | Managed auto-scaling (configurable min/max, concurrency) | Fully configurable (min/max instances, concurrency, CPU allocation) | Full Kubernetes scaling (HPA, VPA, node auto-provisioning) |
| Networking | VPC-SC and PSC supported | Full VPC support, direct VPC egress, IAP, ingress rules | Full Kubernetes networking |
| Session state | Native `VertexAiSessionService` (persistent, managed) | In-memory (dev), Cloud SQL, or Agent Engine session backend | Custom (any Kubernetes-compatible store) |
| Batch/event processing | Not supported | `/invoke` endpoint for Pub/Sub, Eventarc, BigQuery | Custom (Kubernetes Jobs, Pub/Sub) |
| Cost model | vCPU-hours + memory-hours (not billed when idle) | Per-instance-second + min instance costs | Node pool costs (always-on or auto-provisioned) |
| Setup complexity | Lower (managed, purpose-built for agents) | Medium (Dockerfile, Terraform, networking) | Higher (Kubernetes expertise required) |
| Best for | Managed infrastructure, minimal ops | Custom infra, event-driven workloads | Full control, open models, GPU workloads |
Ask the user which deployment target fits their needs. Each is a valid production choice with different trade-offs.

Quick Deploy (ADK CLI)

For projects without Agent Starter Pack scaffolding. No Makefile, Terraform, or Dockerfile required.
bash
# Cloud Run
adk deploy cloud_run --project=PROJECT --region=REGION path/to/agent/

# Agent Engine
adk deploy agent_engine --project=PROJECT --region=REGION path/to/agent/

# GKE (requires existing cluster)
adk deploy gke --project=PROJECT --cluster_name=CLUSTER --region=REGION path/to/agent/
All commands support `--with_ui` to deploy the ADK dev UI. Cloud Run also accepts extra `gcloud` flags after `--` (e.g., `-- --no-allow-unauthenticated`).
See `adk deploy --help` or the ADK deployment docs for the full flag reference.
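The three quick-deploy commands above share one shape, which can be sketched as a small Python helper that assembles the argument list without running it. The function name `build_adk_deploy_command` and its parameters are hypothetical, and the placement of `--with_ui` before the agent path and of the `--` passthrough after it is an assumption; confirm against `adk deploy --help` before relying on it.

```python
import shlex


def build_adk_deploy_command(target, agent_path, project, region,
                             cluster_name=None, with_ui=False, gcloud_args=()):
    """Assemble an `adk deploy` argument list from the flags shown above."""
    if target not in ("cloud_run", "agent_engine", "gke"):
        raise ValueError(f"unknown deployment target: {target}")
    cmd = ["adk", "deploy", target, f"--project={project}", f"--region={region}"]
    if target == "gke":
        # GKE deployments require an existing cluster.
        if not cluster_name:
            raise ValueError("GKE deployments need an existing cluster name")
        cmd.append(f"--cluster_name={cluster_name}")
    if with_ui:
        cmd.append("--with_ui")
    cmd.append(agent_path)
    if gcloud_args:
        # Only Cloud Run forwards extra flags to gcloud after `--`.
        if target != "cloud_run":
            raise ValueError("extra gcloud flags are only accepted for Cloud Run")
        cmd += ["--", *gcloud_args]
    return cmd


cmd = build_adk_deploy_command(
    "cloud_run", "path/to/agent/", project="PROJECT", region="REGION",
    gcloud_args=["--no-allow-unauthenticated"],
)
print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the deployment; printing it first makes it easy to review the exact command before anything is deployed.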
For CI/CD, observability, or production infrastructure, scaffold with `/adk-scaffold` and use the sections below.

Dev Environment Setup & Deploy (Scaffolded Projects)

Setting Up Dev Infrastructure (Optional)

`make setup-dev-env` runs `terraform apply` in `deployment/terraform/dev/`. This provisions supporting infrastructure:
  • Service accounts (`app_sa` for the agent, used for runtime permissions)
  • Artifact Registry repository (for container images)
  • IAM bindings (granting the app SA necessary roles)
  • Telemetry resources (Cloud Logging bucket, BigQuery dataset)
  • Any custom resources defined in `deployment/terraform/dev/`
This step is optional: `make deploy` works without it (Cloud Run creates the service on the fly via `gcloud run deploy --source .`). However, running it gives you proper service accounts, observability, and IAM setup.
bash
make setup-dev-env

Deploying

  1. Notify the human: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
  2. Wait for explicit approval
  3. Once approved, run `make deploy`
IMPORTANT: Never run `make deploy` without explicit human approval.
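The approval rule above can be sketched as a gate that refuses to run `make deploy` unless the human reply is an explicit yes. This is a minimal illustration, not part of the scaffold: the function name `deploy_if_approved` and the set of accepted replies are assumptions, and the runner is injectable so the example below does not execute a real deployment.

```python
import subprocess


def deploy_if_approved(answer, run=subprocess.run):
    """Run `make deploy` only when the human reply is an explicit yes."""
    approved = answer.strip().lower() in {"y", "yes"}
    if not approved:
        # Anything else, including an empty reply, blocks the deploy.
        return False
    run(["make", "deploy"], check=True)
    return True


# Demonstration with a fake runner that records commands instead of executing them.
calls = []
deploy_if_approved("not yet", run=lambda cmd, check: calls.append(cmd))
deploy_if_approved("yes", run=lambda cmd, check: calls.append(cmd))
print(calls)
```

In an interactive session the reply would come from the human, for example `deploy_if_approved(input("Ready to deploy to dev? [y/N] "))`.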

Production Deployment — CI/CD Pipeline

Best for: Production applications, teams requiring staging → production promotion.
Prerequisites:
  1. Project must NOT be in a gitignored folder
  2. User must provide staging and production GCP project IDs
  3. GitHub repository name and owner
Steps:
  1. If the project is still a prototype, first add Terraform/CI-CD files using the Agent Starter Pack CLI (see `/adk-scaffold` for full options):
    bash
    uvx agent-starter-pack enhance . --cicd-runner github_actions -y -s
  2. Ensure you're logged in to GitHub CLI:
    bash
    gh auth login  # (skip if already authenticated)
  3. Run setup-cicd:
    bash
    uvx agent-starter-pack setup-cicd \
      --staging-project YOUR_STAGING_PROJECT \
      --prod-project YOUR_PROD_PROJECT \
      --repository-name YOUR_REPO_NAME \
      --repository-owner YOUR_GITHUB_USERNAME \
      --auto-approve \
      --create-repository
  4. Push code to trigger deployments

Key `setup-cicd` Flags

| Flag | Description |
|---|---|
| `--staging-project` | GCP project ID for staging environment |
| `--prod-project` | GCP project ID for production environment |
| `--repository-name` / `--repository-owner` | GitHub repository name and owner |
| `--auto-approve` | Skip Terraform plan confirmation prompts |
| `--create-repository` | Create the GitHub repo if it doesn't exist |
| `--cicd-project` | Separate GCP project for CI/CD infrastructure. Defaults to prod project |
| `--local-state` | Store Terraform state locally instead of in GCS (see `references/terraform-patterns.md`) |

Run `uvx agent-starter-pack setup-cicd --help` for the full flag reference (Cloud Build options, dev project, region, etc.).

Choosing a CI/CD Runner

| Runner | Pros | Cons |
|---|---|---|
| `github_actions` (default) | No PAT needed, uses `gh auth`, WIF-based, fully automated | Requires GitHub CLI authentication |
| `google_cloud_build` | Native GCP integration | Requires interactive browser authorization (or PAT + app installation ID for programmatic mode) |

How Authentication Works (WIF)

Both runners use Workload Identity Federation (WIF): GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants `cicd_runner_sa` impersonation. No long-lived service account keys are needed. Terraform in `setup-cicd` creates the pool, provider, and SA bindings automatically. If auth fails, re-run `terraform apply` in the CI/CD Terraform directory.

CI/CD Pipeline Stages

The pipeline has three stages:
  1. CI (PR checks): Triggered on pull request. Runs unit and integration tests.
  2. Staging CD: Triggered on merge to `main`. Builds container, deploys to staging, runs load tests.
  3. Production CD: Triggered after successful staging deploy. Requires manual approval before deploying to production.
IMPORTANT: `setup-cicd` creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:
bash
git add . && git commit -m "Initial agent implementation"
git push origin main
To approve production deployment:
bash
# GitHub Actions: Approve via repository Actions tab (environment protection rules)

# Cloud Build: Find pending build and approve
gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"
gcloud builds approve BUILD_ID --project=PROD_PROJECT

Cloud Run Specifics

For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see `references/cloud-run.md`. For ADK docs on Cloud Run deployment, fetch `https://google.github.io/adk-docs/deploy/cloud-run/index.md` via WebFetch.

Agent Engine Specifics

Agent Engine is a managed Vertex AI service for deploying Python ADK agents. It uses source-based deployment (no Dockerfile) via `deploy.py` and the `AdkApp` class.
No `gcloud` CLI exists for Agent Engine. Deploy via `deploy.py` or `adk deploy agent_engine`. Query via the Python `vertexai.Client` SDK.
Deployments can take 5-10 minutes. If `make deploy` times out, check if the engine was created and manually populate `deployment_metadata.json` with the engine resource ID (see reference for details).
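If the metadata file has to be populated by hand after a timeout, a sketch along these lines may help. It assumes the only field that must be set is `remote_agent_engine_id` (the key read by the testing script later in this guide) and that the value is the full resource name; the helper name `write_deployment_metadata` and the placeholder resource name are hypothetical, and `references/agent-engine.md` remains the authority on the file's full contents.

```python
import json
import tempfile
from pathlib import Path


def write_deployment_metadata(engine_id, path="deployment_metadata.json"):
    """Record the engine resource ID, keeping any fields already in the file."""
    target = Path(path)
    metadata = json.loads(target.read_text()) if target.exists() else {}
    metadata["remote_agent_engine_id"] = engine_id
    target.write_text(json.dumps(metadata, indent=2) + "\n")
    return metadata


# Placeholder resource name; substitute the ID of the engine that was created.
engine_id = "projects/PROJECT_NUMBER/locations/REGION/reasoningEngines/ENGINE_ID"

# Written to a temporary directory here so the example has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "deployment_metadata.json"
    write_deployment_metadata(engine_id, demo_path)
    print(demo_path.read_text())
```

In a real project the default path would be used, so the file lands next to the Makefile where the rest of the tooling expects it.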
For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see `references/agent-engine.md`. For ADK docs on Agent Engine deployment, fetch `https://google.github.io/adk-docs/deploy/agent-engine/index.md` via WebFetch.

Service Account Architecture

Scaffolded projects use two service accounts:
  • `app_sa` (per environment): Runtime identity for the deployed agent. Roles defined in `deployment/terraform/iam.tf`.
  • `cicd_runner_sa` (CI/CD project): CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in both staging and prod projects.
Check `deployment/terraform/iam.tf` for exact role bindings. Cross-project permissions (Cloud Run service agents, artifact registry access) are also configured there.
Common 403 errors:
  • "Permission denied on Cloud Run" → `cicd_runner_sa` missing deployment role in the target project
  • "Cannot act as service account" → Missing `iam.serviceAccountUser` binding on `app_sa`
  • "Secret access denied" → `app_sa` missing `secretmanager.secretAccessor`
  • "Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project

Secret Manager (for API Credentials)

Instead of passing sensitive keys as environment variables, use GCP Secret Manager.
bash
# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-

# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-
Grant access: For Cloud Run, grant `secretmanager.secretAccessor` to `app_sa`. For Agent Engine, grant it to the platform-managed SA (`service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com`).
Pass secrets at deploy time (Agent Engine):
bash
make deploy SECRETS="API_KEY=my-api-key,DB_PASS=db-password:2"
Format: `ENV_VAR=SECRET_ID` or `ENV_VAR=SECRET_ID:VERSION` (defaults to latest). Access in code via `os.environ.get("API_KEY")`.
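To make the `SECRETS` format concrete, the following sketch parses such a string into environment variable names, secret IDs, and versions. It illustrates the format only and is not the parser used by `make deploy`; the function name `parse_secrets` is hypothetical.

```python
def parse_secrets(spec):
    """Parse "ENV_VAR=SECRET_ID[:VERSION],..." into {env_var: (secret_id, version)}."""
    secrets = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        env_var, sep, ref = item.partition("=")
        if not sep or not env_var or not ref:
            raise ValueError(f"expected ENV_VAR=SECRET_ID[:VERSION], got {item!r}")
        secret_id, _, version = ref.partition(":")
        # A missing version falls back to "latest", matching the rule above.
        secrets[env_var] = (secret_id, version or "latest")
    return secrets


print(parse_secrets("API_KEY=my-api-key,DB_PASS=db-password:2"))
# → {'API_KEY': ('my-api-key', 'latest'), 'DB_PASS': ('db-password', '2')}
```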

Observability

See the adk-observability-guide skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).

Testing Your Deployed Agent

Agent Engine Deployment

Option 1: Testing Notebook
bash
jupyter notebook notebooks/adk_app_testing.ipynb
Option 2: Python Script
python
import asyncio
import json

import vertexai

with open("deployment_metadata.json") as f:
    engine_id = json.load(f)["remote_agent_engine_id"]

client = vertexai.Client(location="us-central1")
agent = client.agent_engines.get(name=engine_id)


async def main():
    # `async for` is only valid inside a coroutine, so wrap the query.
    async for event in agent.async_stream_query(message="Hello!", user_id="test"):
        print(event)


asyncio.run(main())
Option 3: Playground
bash
make playground

Cloud Run Deployment

Auth required by default. Cloud Run deploys with `--no-allow-unauthenticated`, so all requests need an `Authorization: Bearer` header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with `--allow-unauthenticated`.
bash
# Test health endpoint
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/health

# Test SSE streaming endpoint (ADK HTTP mode)
curl -X POST "https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/run_sse" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -d '{"message": "Hello!", "user_id": "test", "session_id": "test-session"}'
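The same authenticated call can be sketched in Python with only the standard library. The helper below builds the request without sending it; the function name `build_run_sse_request` is hypothetical, the payload simply mirrors the curl example above, and `IDENTITY_TOKEN` is a placeholder.

```python
import json
import urllib.request


def build_run_sse_request(base_url, token, message, user_id, session_id):
    """Build (but do not send) the POST that the curl example above issues."""
    body = json.dumps(
        {"message": message, "user_id": user_id, "session_id": session_id}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/run_sse",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Without this header a service deployed with
            # --no-allow-unauthenticated answers 403.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_run_sse_request(
    "https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app",
    token="IDENTITY_TOKEN",  # placeholder, see the note below
    message="Hello!", user_id="test", session_id="test-session",
)
print(request.get_method(), request.full_url)
```

A real token can be obtained by running `gcloud auth print-identity-token` (for instance through `subprocess.run`), and the request can then be sent with `urllib.request.urlopen(request)`, reading the streamed response line by line.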

Load Tests

bash
make load-test
See `tests/load_test/README.md` for configuration, default settings, and CI/CD integration details.

Deploying with a UI (IAP)

To expose your agent with a web UI protected by Google identity authentication:
bash
# Deploy with IAP (built-in framework UI)
make deploy IAP=true

# Deploy with custom frontend on a different port
make deploy IAP=true PORT=5173
IAP (Identity-Aware Proxy) secures the Cloud Run service — only authorized Google accounts can access it. After deploying, grant user access via the Cloud Console IAP settings.
For Agent Engine with a custom frontend, use a decoupled deployment — deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Engine backend API.

Rollback & Recovery

The primary rollback mechanism is git-based: fix the issue, commit, and push to `main`. The CI/CD pipeline will automatically build and deploy the new version through staging → production.
For immediate Cloud Run rollback without a new commit, use revision traffic shifting:
bash
gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=REVISION_NAME=100 --region=REGION
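For scripted rollbacks, the traffic-shifting command above could be assembled by a helper like this one. It is a minimal sketch: the function name `build_rollback_command` and the `percent` parameter are illustrative assumptions, and the helper only returns the argument list rather than calling `gcloud`.

```python
import shlex


def build_rollback_command(service, revision, region, percent=100):
    """Return the gcloud argv that routes `percent` of traffic to `revision`."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return [
        "gcloud", "run", "services", "update-traffic", service,
        f"--to-revisions={revision}={percent}",
        f"--region={region}",
    ]


print(shlex.join(build_rollback_command("SERVICE_NAME", "REVISION_NAME", "REGION")))
```

The revision name would come from the `gcloud run revisions list` output shown above; choosing which revision counts as the last known good one is left to the operator.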
Agent Engine doesn't support revision-based rollback: fix and redeploy via `make deploy`.

Custom Infrastructure (Terraform)

For custom infrastructure patterns (Pub/Sub, BigQuery, Eventarc, Cloud SQL, IAM), consult `references/terraform-patterns.md` for:
  • Where to put custom Terraform files (dev vs CI/CD)
  • Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
  • IAM bindings for custom resources
  • Terraform state management (remote vs local, importing resources)
  • Common infrastructure patterns

Troubleshooting

| Issue | Solution |
|---|---|
| Terraform state locked | `terraform force-unlock -force LOCK_ID` in `deployment/terraform/` |
| GitHub Actions auth failed | Re-run `terraform apply` in CI/CD terraform dir; verify WIF pool/provider |
| Cloud Build authorization pending | Use `github_actions` runner instead |
| Resource already exists | `terraform import` (see `references/terraform-patterns.md`) |
| Agent Engine deploy timeout / hangs | Deployments take 5-10 min; check if engine was created (see Agent Engine Specifics) |
| Secret not available | Verify `secretAccessor` granted to `app_sa` (not the default compute SA) |
| 403 on deploy | Check `deployment/terraform/iam.tf`: `cicd_runner_sa` needs deployment + SA impersonation roles in the target project |
| 403 when testing Cloud Run | Default is `--no-allow-unauthenticated`; include `Authorization: Bearer $(gcloud auth print-identity-token)` header |
| Cold starts too slow | Set `min_instance_count > 0` in Cloud Run Terraform config |
| Cloud Run 503 errors | Check resource limits (memory/CPU), increase `max_instance_count`, or check container crash logs |