adk-deploy-guide
MUST READ before deploying any ADK agent. ADK deployment guide — Agent Engine, Cloud Run, GKE, CI/CD pipelines, secrets, observability, and production workflows. Use when deploying agents to Google Cloud or troubleshooting deployments. Do NOT use for API code patterns (use adk-cheatsheet), evaluation (use adk-eval-guide), or project scaffolding (use adk-scaffold).
Source: eliasecchig/adk-docs
NPX Install
npx skill4agent add eliasecchig/adk-docs adk-deploy-guide
SKILL.md Content
ADK Deployment Guide
Scaffolded project? Use the `make` commands throughout this guide; they wrap Terraform, Docker, and deployment into a tested pipeline.
No scaffold? See Quick Deploy below, or the ADK deployment docs. For production infrastructure, scaffold with `/adk-scaffold`.
Reference Files
For deeper details, consult these reference files in `references/`:
- `cloud-run.md`: Scaling defaults, Dockerfile, session types, networking
- `agent-engine.md`: deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
- `terraform-patterns.md`: Custom infrastructure, IAM, state management, importing resources
- `event-driven.md`: Pub/Sub, Eventarc, BigQuery Remote Function triggers via custom `fast_api_app.py` endpoints
Observability: See the adk-observability-guide skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.
Deployment Target Decision Matrix
Choose the right deployment target based on your requirements:
| Criteria | Agent Engine | Cloud Run | GKE |
|---|---|---|---|
| Languages | Python | Python | Python (+ others via custom containers) |
| Scaling | Managed auto-scaling (configurable min/max, concurrency) | Fully configurable (min/max instances, concurrency, CPU allocation) | Full Kubernetes scaling (HPA, VPA, node auto-provisioning) |
| Networking | VPC-SC and PSC supported | Full VPC support, direct VPC egress, IAP, ingress rules | Full Kubernetes networking |
| Session state | Native | In-memory (dev), Cloud SQL, or Agent Engine session backend | Custom (any Kubernetes-compatible store) |
| Batch/event processing | Not supported | | Custom (Kubernetes Jobs, Pub/Sub) |
| Cost model | vCPU-hours + memory-hours (not billed when idle) | Per-instance-second + min instance costs | Node pool costs (always-on or auto-provisioned) |
| Setup complexity | Lower (managed, purpose-built for agents) | Medium (Dockerfile, Terraform, networking) | Higher (Kubernetes expertise required) |
| Best for | Managed infrastructure, minimal ops | Custom infra, event-driven workloads | Full control, open models, GPU workloads |
Ask the user which deployment target fits their needs. Each is a valid production choice with different trade-offs.
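As a rough aid for that conversation, the "Best for" row of the matrix can be encoded as a lookup that shortlists candidate targets. This is only a sketch: the `BEST_FOR` mapping and the `shortlist` helper are hypothetical names, they cover a single row of the table, and the final choice still belongs to the user.

```python
# Hypothetical helper: encodes only the "Best for" row of the decision matrix.
BEST_FOR = {
    "Agent Engine": {"managed infrastructure", "minimal ops"},
    "Cloud Run": {"custom infra", "event-driven workloads"},
    "GKE": {"full control", "open models", "gpu workloads"},
}


def shortlist(needs):
    """Return targets whose 'Best for' entry overlaps the stated needs, in matrix order."""
    wanted = {need.lower() for need in needs}
    return [target for target, tags in BEST_FOR.items() if wanted & tags]


if __name__ == "__main__":
    # Candidates to discuss with the user, not a final decision.
    print(shortlist(["GPU workloads", "minimal ops"]))
```

An empty result means none of the "Best for" keywords matched, in which case the remaining rows of the matrix (networking, session state, cost model) should drive the discussion.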
Quick Deploy (ADK CLI)
For projects without Agent Starter Pack scaffolding. No Makefile, Terraform, or Dockerfile required.
bash
# Cloud Run
adk deploy cloud_run --project=PROJECT --region=REGION path/to/agent/
# Agent Engine
adk deploy agent_engine --project=PROJECT --region=REGION path/to/agent/
# GKE (requires existing cluster)
adk deploy gke --project=PROJECT --cluster_name=CLUSTER --region=REGION path/to/agent/
All commands support `--with_ui` to deploy the ADK dev UI. Cloud Run also accepts extra `gcloud` flags after `--` (e.g., `-- --no-allow-unauthenticated`).
See `adk deploy --help` or the ADK deployment docs for full flag reference.
For CI/CD, observability, or production infrastructure, scaffold with `/adk-scaffold` and use the sections below.
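If the quick-deploy commands need to be driven from a script, one option is to assemble the argument list in Python and hand it to `subprocess.run`. The sketch below only uses the flags shown in this section (`--project`, `--region`, `--cluster_name`, `--with_ui`); the function name `adk_deploy_command` is hypothetical, and the `--help` output should be checked for the authoritative flag list.

```python
import shlex

VALID_TARGETS = ("cloud_run", "agent_engine", "gke")


def adk_deploy_command(target, project, region, agent_path,
                       cluster_name=None, with_ui=False):
    """Build the argv list for `adk deploy <target>` using flags from this guide."""
    if target not in VALID_TARGETS:
        raise ValueError(f"unknown target: {target!r}")
    cmd = ["adk", "deploy", target, f"--project={project}"]
    if target == "gke":
        # GKE deploys need an existing cluster.
        if not cluster_name:
            raise ValueError("gke target requires cluster_name")
        cmd.append(f"--cluster_name={cluster_name}")
    cmd.append(f"--region={region}")
    if with_ui:
        cmd.append("--with_ui")
    cmd.append(agent_path)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to subprocess.run to execute.
    print(shlex.join(adk_deploy_command("cloud_run", "PROJECT", "REGION", "path/to/agent/")))
```

Building a list rather than a shell string avoids quoting problems when project IDs or paths contain unusual characters.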
Dev Environment Setup & Deploy (Scaffolded Projects)
Setting Up Dev Infrastructure (Optional)
`make setup-dev-env` runs `terraform apply` in `deployment/terraform/dev/`, which provisions:
- Service accounts (`app_sa` for the agent, used for runtime permissions)
- Artifact Registry repository (for container images)
- IAM bindings (granting the app SA necessary roles)
- Telemetry resources (Cloud Logging bucket, BigQuery dataset)
- Any custom resources defined in `deployment/terraform/dev/`
This step is optional: `make deploy` works without it (Cloud Run creates the service on the fly via `gcloud run deploy --source .`). However, running it gives you proper service accounts, observability, and IAM setup.
bash
make setup-dev-env
Deploying
- Notify the human: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
- Wait for explicit approval
- Once approved: `make deploy`
IMPORTANT: Never run `make deploy` without explicit human approval.
Production Deployment: CI/CD Pipeline
Best for: Production applications, teams requiring staging → production promotion.
Prerequisites:
- Project must NOT be in a gitignored folder
- User must provide staging and production GCP project IDs
- GitHub repository name and owner
Steps:
- If prototype, first add Terraform/CI-CD files using the Agent Starter Pack CLI (see `/adk-scaffold` for full options):
bash
uvx agent-starter-pack enhance . --cicd-runner github_actions -y -s
- Ensure you're logged in to GitHub CLI:
bash
gh auth login  # (skip if already authenticated)
- Run setup-cicd:
bash
uvx agent-starter-pack setup-cicd \
  --staging-project YOUR_STAGING_PROJECT \
  --prod-project YOUR_PROD_PROJECT \
  --repository-name YOUR_REPO_NAME \
  --repository-owner YOUR_GITHUB_USERNAME \
  --auto-approve \
  --create-repository
- Push code to trigger deployments
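Because the prerequisites above are easy to miss, a thin wrapper can check that every required value is present before the `setup-cicd` command is assembled. This is a sketch under assumptions: the function name `setup_cicd_command` is hypothetical, and it only uses the flags that appear in the command shown in the steps.

```python
def setup_cicd_command(staging_project, prod_project, repository_name,
                       repository_owner, auto_approve=True, create_repository=True):
    """Validate the prerequisites, then build the setup-cicd argv list."""
    required = {
        "staging_project": staging_project,
        "prod_project": prod_project,
        "repository_name": repository_name,
        "repository_owner": repository_owner,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        # The user must supply these; do not guess project IDs or repo names.
        raise ValueError("missing prerequisites: " + ", ".join(missing))

    cmd = [
        "uvx", "agent-starter-pack", "setup-cicd",
        "--staging-project", staging_project,
        "--prod-project", prod_project,
        "--repository-name", repository_name,
        "--repository-owner", repository_owner,
    ]
    if auto_approve:
        cmd.append("--auto-approve")
    if create_repository:
        cmd.append("--create-repository")
    return cmd
```

Failing early on a missing value is cheaper than discovering it partway through a Terraform run.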
Key `setup-cicd` Flags
| Flag | Description |
|---|---|
| `--staging-project` | GCP project ID for staging environment |
| `--prod-project` | GCP project ID for production environment |
| `--repository-name` / `--repository-owner` | GitHub repository name and owner |
| `--auto-approve` | Skip Terraform plan confirmation prompts |
| `--create-repository` | Create the GitHub repo if it doesn't exist |
| | Separate GCP project for CI/CD infrastructure. Defaults to prod project |
| | Store Terraform state locally instead of in GCS (see |
Run `uvx agent-starter-pack setup-cicd --help` for the full flag reference (Cloud Build options, dev project, region, etc.).
Choosing a CI/CD Runner
| Runner | Pros | Cons |
|---|---|---|
| github_actions (Default) | No PAT needed, uses | Requires GitHub CLI authentication |
| google_cloud_build | Native GCP integration | Requires interactive browser authorization (or PAT + app installation ID for programmatic mode) |
How Authentication Works (WIF)
Both runners use Workload Identity Federation (WIF): GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants `cicd_runner_sa` impersonation. No long-lived service account keys needed. Terraform in `setup-cicd` creates the pool, provider, and SA bindings automatically. If auth fails, re-run `terraform apply` in the CI/CD Terraform directory.
CI/CD Pipeline Stages
The pipeline has three stages:
- CI (PR checks): Triggered on pull request. Runs unit and integration tests.
- Staging CD: Triggered on merge to `main`. Builds container, deploys to staging, runs load tests.
- Production CD: Triggered after successful staging deploy. Requires manual approval before deploying to production.
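The three stages can be modelled as a small lookup keyed by trigger, which makes the manual-approval gate on production explicit. This is an illustrative sketch only: the event labels (`pull_request`, `merge_to_main`, `staging_deploy_succeeded`) and the `stage_for` helper are hypothetical and are not names used by GitHub Actions or Cloud Build.

```python
# Hypothetical model of the pipeline stages; event labels are illustrative.
STAGES = [
    {"name": "CI (PR checks)", "trigger": "pull_request", "manual_approval": False},
    {"name": "Staging CD", "trigger": "merge_to_main", "manual_approval": False},
    {"name": "Production CD", "trigger": "staging_deploy_succeeded", "manual_approval": True},
]


def stage_for(event, approved=False):
    """Return the stage an event starts, or None if it is unknown or awaiting approval."""
    for stage in STAGES:
        if stage["trigger"] == event:
            if stage["manual_approval"] and not approved:
                return None  # gated: production waits for a human
            return stage["name"]
    return None
```

In this model a successful staging deploy alone never reaches production; the `approved` flag stands in for the human approval step.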
IMPORTANT: `setup-cicd` creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:
bash
git add . && git commit -m "Initial agent implementation"
git push origin main
To approve production deployment:
bash
# GitHub Actions: Approve via repository Actions tab (environment protection rules)
# Cloud Build: Find pending build and approve
gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"
gcloud builds approve BUILD_ID --project=PROD_PROJECT
Cloud Run Specifics
For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see `references/cloud-run.md`. For ADK docs on Cloud Run deployment, fetch `https://google.github.io/adk-docs/deploy/cloud-run/index.md` via WebFetch.
Agent Engine Specifics
Agent Engine is a managed Vertex AI service for deploying Python ADK agents. Uses source-based deployment (no Dockerfile) via `deploy.py` and the `AdkApp` class.
No `gcloud` CLI exists for Agent Engine. Deploy via `deploy.py` or `adk deploy agent_engine`. Query via the Python `vertexai.Client` SDK.
Deployments can take 5-10 minutes. If `make deploy` times out, check if the engine was created and manually populate `deployment_metadata.json` with the engine resource ID (see reference for details).
For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see `references/agent-engine.md`. For ADK docs on Agent Engine deployment, fetch `https://google.github.io/adk-docs/deploy/agent-engine/index.md` via WebFetch.
Service Account Architecture
Scaffolded projects use two service accounts:
- `app_sa` (per environment): Runtime identity for the deployed agent. Roles defined in `deployment/terraform/iam.tf`.
- `cicd_runner_sa` (CI/CD project): CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in both staging and prod projects.
Check `deployment/terraform/iam.tf` for exact role bindings. Cross-project permissions (Cloud Run service agents, artifact registry access) are also configured there.
Common 403 errors:
- "Permission denied on Cloud Run" → `cicd_runner_sa` missing deployment role in the target project
- "Cannot act as service account" → Missing `iam.serviceAccountUser` binding on `app_sa`
- "Secret access denied" → `app_sa` missing `secretmanager.secretAccessor`
app_sasecretmanager.secretAccessor - "Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project
Secret Manager (for API Credentials)
Instead of passing sensitive keys as environment variables, use GCP Secret Manager.
bash
# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-
# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-
Grant access: For Cloud Run, grant `secretmanager.secretAccessor` to `app_sa`. For Agent Engine, grant it to the platform-managed SA (`service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com`).
Pass secrets at deploy time (Agent Engine):
bash
make deploy SECRETS="API_KEY=my-api-key,DB_PASS=db-password:2"
Format: `ENV_VAR=SECRET_ID` or `ENV_VAR=SECRET_ID:VERSION` (defaults to latest). Access in code via `os.environ.get("API_KEY")`.
Observability
See the adk-observability-guide skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).
Testing Your Deployed Agent
Agent Engine Deployment
Option 1: Testing Notebook
bash
jupyter notebook notebooks/adk_app_testing.ipynb
Option 2: Python Script
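python
import asyncio
import json
import vertexai

# Read the engine resource ID recorded at deploy time
with open("deployment_metadata.json") as f:
    engine_id = json.load(f)["remote_agent_engine_id"]

client = vertexai.Client(location="us-central1")
agent = client.agent_engines.get(name=engine_id)

async def main():
    # `async for` must run inside a coroutine when executed as a script
    async for event in agent.async_stream_query(message="Hello!", user_id="test"):
        print(event)

asyncio.run(main())
Option 3: Playground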
bash
make playground
Cloud Run Deployment
Auth required by default. Cloud Run deploys with `--no-allow-unauthenticated`, so all requests need an `Authorization: Bearer` header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with `--allow-unauthenticated`.
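For test scripts, the same authenticated call can be prepared in Python with the standard library. The sketch below assumes the `/run_sse` path and the JSON payload shape used by the curl example in this section; `fetch_identity_token` and `build_run_sse_request` are hypothetical helper names, and the token helper simply shells out to `gcloud auth print-identity-token`.

```python
import json
import subprocess
import urllib.request


def fetch_identity_token():
    """Get an identity token from the local gcloud CLI (requires gcloud to be installed)."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def build_run_sse_request(base_url, identity_token, message, user_id, session_id):
    """Build (but do not send) an authenticated POST to the service's /run_sse endpoint."""
    # Payload shape mirrors the curl example in this guide; adjust if your app differs.
    body = json.dumps(
        {"message": message, "user_id": user_id, "session_id": session_id}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run_sse",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {identity_token}",
        },
    )
```

To send it, pass the request to `urllib.request.urlopen` with a token from `fetch_identity_token()`; a 403 response usually means the `Authorization` header is missing or the caller lacks invoker access.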
bash
# Test health endpoint
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/health
# Test SSE streaming endpoint (ADK HTTP mode)
curl -X POST "https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/run_sse" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -d '{"message": "Hello!", "user_id": "test", "session_id": "test-session"}'
Load Tests
bash
make load-test
See `tests/load_test/README.md` for configuration, default settings, and CI/CD integration details.
Deploying with a UI (IAP)
To expose your agent with a web UI protected by Google identity authentication:
bash
# Deploy with IAP (built-in framework UI)
make deploy IAP=true
# Deploy with custom frontend on a different port
make deploy IAP=true PORT=5173
IAP (Identity-Aware Proxy) secures the Cloud Run service: only authorized Google accounts can access it. After deploying, grant user access via the Cloud Console IAP settings.
For Agent Engine with a custom frontend, use a decoupled deployment — deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Engine backend API.
Rollback & Recovery
The primary rollback mechanism is git-based: fix the issue, commit, and push to `main`. The CI/CD pipeline will automatically build and deploy the new version through staging → production.
For immediate Cloud Run rollback without a new commit, use revision traffic shifting:
bash
gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=REVISION_NAME=100 --region=REGION
Agent Engine doesn't support revision-based rollback; fix and redeploy via `make deploy`.
Custom Infrastructure (Terraform)
For custom infrastructure patterns (Pub/Sub, BigQuery, Eventarc, Cloud SQL, IAM), consult `references/terraform-patterns.md` for:
- Where to put custom Terraform files (dev vs CI/CD)
- Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
- IAM bindings for custom resources
- Terraform state management (remote vs local, importing resources)
- Common infrastructure patterns
Troubleshooting
| Issue | Solution |
|---|---|
| Terraform state locked | |
| GitHub Actions auth failed | Re-run |
| Cloud Build authorization pending | Use |
| Resource already exists | |
| Agent Engine deploy timeout / hangs | Deployments take 5-10 min; check if engine was created (see Agent Engine Specifics) |
| Secret not available | Verify |
| 403 on deploy | Check |
| 403 when testing Cloud Run | Default is |
| Cold starts too slow | Set |
| Cloud Run 503 errors | Check resource limits (memory/CPU), increase |