ADK Deployment Guide
Scaffolded project? Use the `make` commands throughout this guide — they wrap Terraform, Docker, and deployment into a tested pipeline.
No scaffold? See Quick Deploy below, or the ADK deployment docs.
For production infrastructure, scaffold with the Agent Starter Pack.
Reference Files
For deeper details, consult these reference files in `references/`:
- Cloud Run — scaling defaults, Dockerfile, session types, networking
- `references/agent-engine.md` — deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
- `references/terraform-patterns.md` — custom infrastructure, IAM, state management, importing resources
- Event triggers — Pub/Sub, Eventarc, BigQuery Remote Function triggers via custom endpoints
Observability: See the adk-observability-guide skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.
Deployment Target Decision Matrix
Choose the right deployment target based on your requirements:
| Criteria | Agent Engine | Cloud Run | GKE |
|---|---|---|---|
| Languages | Python | Python | Python (+ others via custom containers) |
| Scaling | Managed auto-scaling (configurable min/max, concurrency) | Fully configurable (min/max instances, concurrency, CPU allocation) | Full Kubernetes scaling (HPA, VPA, node auto-provisioning) |
| Networking | VPC-SC and PSC supported | Full VPC support, direct VPC egress, IAP, ingress rules | Full Kubernetes networking |
| Session state | Native (persistent, managed) | In-memory (dev), Cloud SQL, or Agent Engine session backend | Custom (any Kubernetes-compatible store) |
| Batch/event processing | Not supported | Custom endpoint for Pub/Sub, Eventarc, BigQuery | Custom (Kubernetes Jobs, Pub/Sub) |
| Cost model | vCPU-hours + memory-hours (not billed when idle) | Per-instance-second + min instance costs | Node pool costs (always-on or auto-provisioned) |
| Setup complexity | Lower (managed, purpose-built for agents) | Medium (Dockerfile, Terraform, networking) | Higher (Kubernetes expertise required) |
| Best for | Managed infrastructure, minimal ops | Custom infra, event-driven workloads | Full control, open models, GPU workloads |
Ask the user which deployment target fits their needs. Each is a valid production choice with different trade-offs.
Quick Deploy (ADK CLI)
For projects without Agent Starter Pack scaffolding. No Makefile, Terraform, or Dockerfile required.
```bash
# Cloud Run
adk deploy cloud_run --project=PROJECT --region=REGION path/to/agent/

# Agent Engine
adk deploy agent_engine --project=PROJECT --region=REGION path/to/agent/

# GKE (requires existing cluster)
adk deploy gke --project=PROJECT --cluster_name=CLUSTER --region=REGION path/to/agent/
```
All commands support an optional flag to deploy the ADK dev UI. Cloud Run also accepts extra `gcloud run deploy` flags after `--` (e.g., `-- --no-allow-unauthenticated`).
See `adk deploy --help` or the ADK deployment docs for the full flag reference.
For CI/CD, observability, or production infrastructure, scaffold with the Agent Starter Pack and use the sections below.
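For scripting around the CLI, the invocations above can be composed mechanically. An illustrative sketch — `adk_deploy_cmd` is a hypothetical helper, not part of the ADK; it only assembles the argv shown in the examples above:

```python
def adk_deploy_cmd(target: str, project: str, region: str, agent_path: str,
                   extra_flags=None) -> list:
    """Assemble an `adk deploy` argv; extra flags (Cloud Run only) go after `--`."""
    cmd = ["adk", "deploy", target, f"--project={project}", f"--region={region}", agent_path]
    if extra_flags:
        cmd += ["--", *extra_flags]  # passed through to gcloud run deploy
    return cmd

print(adk_deploy_cmd("cloud_run", "my-proj", "us-central1", "path/to/agent/",
                     ["--no-allow-unauthenticated"]))
```

Feed the result to `subprocess.run` (or just eyeball it) rather than string-concatenating flags by hand.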
Dev Environment Setup & Deploy (Scaffolded Projects)
Setting Up Dev Infrastructure (Optional)
`make setup-dev-env` runs `terraform apply` in `deployment/terraform/dev/`. This provisions supporting infrastructure:
- Service accounts (an app SA for the agent, used for runtime permissions)
- Artifact Registry repository (for container images)
- IAM bindings (granting the app SA necessary roles)
- Telemetry resources (Cloud Logging bucket, BigQuery dataset)
- Any custom resources defined in `deployment/terraform/dev/`
This step is optional — `make deploy` works without it (Cloud Run creates the service on the fly via `gcloud run deploy --source .`). However, running it gives you proper service accounts, observability, and IAM setup.

Note: `make deploy` doesn't automatically use the Terraform-created service account. Pass it explicitly or update the Makefile.
Deploying
- Notify the human: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
- Wait for explicit approval
- Once approved, run `make deploy`

IMPORTANT: Never run `make deploy` without explicit human approval.
Production Deployment — CI/CD Pipeline
Best for: Production applications, teams requiring staging → production promotion.
Prerequisites:
- Project must NOT be in a gitignored folder
- User must provide staging and production GCP project IDs
- GitHub repository name and owner
Steps:
- If a prototype, first add Terraform/CI-CD files using the Agent Starter Pack CLI (see the CLI help for full options):

```bash
uvx agent-starter-pack enhance . --cicd-runner github_actions -y -s
```
- Ensure you're logged in to the GitHub CLI:

```bash
gh auth login  # (skip if already authenticated)
```
- Run setup-cicd:

```bash
uvx agent-starter-pack setup-cicd \
  --staging-project YOUR_STAGING_PROJECT \
  --prod-project YOUR_PROD_PROJECT \
  --repository-name YOUR_REPO_NAME \
  --repository-owner YOUR_GITHUB_USERNAME \
  --auto-approve \
  --create-repository
```
- Push code to trigger deployments
Key Flags
| Flag | Description |
|---|---|
| `--staging-project` | GCP project ID for the staging environment |
| `--prod-project` | GCP project ID for the production environment |
| `--repository-name` / `--repository-owner` | GitHub repository name and owner |
| `--auto-approve` | Skip Terraform plan confirmation prompts |
| `--create-repository` | Create the GitHub repo if it doesn't exist |
| `--cicd-project` | Separate GCP project for CI/CD infrastructure; defaults to the prod project |
| `--local-state` | Store Terraform state locally instead of in GCS (see `references/terraform-patterns.md`) |
Run `uvx agent-starter-pack setup-cicd --help` for the full flag reference (Cloud Build options, dev project, region, etc.).
Choosing a CI/CD Runner
| Runner | Pros | Cons |
|---|---|---|
| `github_actions` (default) | No PAT needed, WIF-based, fully automated | Requires GitHub CLI authentication |
| `google_cloud_build` | Native GCP integration | Requires interactive browser authorization (or PAT + app installation ID for programmatic mode) |
How Authentication Works (WIF)
Both runners use Workload Identity Federation (WIF) — GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants service account impersonation. No long-lived service account keys needed. Terraform in the CI/CD directory creates the pool, provider, and SA bindings automatically. If auth fails, re-run `terraform apply` in the CI/CD Terraform directory.
CI/CD Pipeline Stages
The pipeline has three stages:
- CI (PR checks) — Triggered on pull request. Runs unit and integration tests.
- Staging CD — Triggered on merge to `main`. Builds the container, deploys to staging, runs load tests.
  Path filter: Staging CD only triggers when files under the watched paths change. The first push after `setup-cicd` won't trigger staging CD unless you modify something under those paths. If nothing happens after pushing, this is why.
- Production CD — Triggered after a successful staging deploy. May require manual approval before deploying to production.
  Approving: Go to GitHub Actions → the production workflow run → click "Review deployments" → approve the pending production environment. This is GitHub's environment protection rules, not a custom mechanism.
IMPORTANT: `setup-cicd` creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:

```bash
git add . && git commit -m "Initial agent implementation"
git push origin main
```
To approve production deployment:
```bash
# GitHub Actions: Approve via repository Actions tab (environment protection rules)
# Cloud Build: Find the pending build and approve it
gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"
gcloud builds approve BUILD_ID --project=PROD_PROJECT
```
Cloud Run Specifics
For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see the Cloud Run reference file in `references/`. For ADK docs on Cloud Run deployment, fetch https://google.github.io/adk-docs/deploy/cloud-run/index.md.
Agent Engine Specifics
Agent Engine is a managed Vertex AI service for deploying Python ADK agents. It uses source-based deployment (no Dockerfile) via `deploy.py` and the `AdkApp` class.
No gcloud CLI exists for Agent Engine. Deploy via `deploy.py` or the Terraform resource. Query via the Python `vertexai` SDK.
Deployments can take 5-10 minutes. If a deploy times out, check whether the engine was created and manually populate `deployment_metadata.json` with the engine resource ID (see the reference for details).
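If a timeout leaves the metadata file unwritten, the manual recovery above amounts to recreating `deployment_metadata.json` by hand — a minimal sketch (the engine resource ID below is a made-up example; use the ID your deploy actually created):

```python
import json

# Hypothetical engine resource ID — replace with the one shown in the Cloud Console
engine_id = "projects/my-project/locations/us-central1/reasoningEngines/1234567890"

# Write the metadata file that deploy.py would normally produce
with open("deployment_metadata.json", "w") as f:
    json.dump({"remote_agent_engine_id": engine_id}, f, indent=2)

# The testing snippets later in this guide read the file back like this
with open("deployment_metadata.json") as f:
    print(json.load(f)["remote_agent_engine_id"])
```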
For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see `references/agent-engine.md`. For ADK docs on Agent Engine deployment, fetch https://google.github.io/adk-docs/deploy/agent-engine/index.md.
Service Account Architecture
Scaffolded projects use two service accounts:
- App SA (per environment) — runtime identity for the deployed agent. Roles are defined in `deployment/terraform/iam.tf`.
- CI/CD runner SA (CI/CD project) — pipeline identity for GitHub Actions / Cloud Build. Lives in the CI/CD project (defaults to the prod project) and needs permissions in both the staging and prod projects.

Check `deployment/terraform/iam.tf` for exact role bindings. Cross-project permissions (Cloud Run service agents, Artifact Registry access) are also configured there.
Common 403 errors:
- "Permission denied on Cloud Run" → missing deployment role in the target project
- "Cannot act as service account" → missing `roles/iam.serviceAccountUser` binding on the runtime SA
- "Secret access denied" → missing `roles/secretmanager.secretAccessor`
- "Artifact Registry read denied" → Cloud Run service agent missing read access in the CI/CD project
Secret Manager (for API Credentials)
Instead of passing sensitive keys as environment variables, use GCP Secret Manager.
```bash
# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-

# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-
```
Grant access: For Cloud Run, grant `roles/secretmanager.secretAccessor` to the app service account. For Agent Engine, grant it to the platform-managed SA (`service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com`).
Pass secrets at deploy time (Agent Engine):

```bash
make deploy SECRETS="API_KEY=my-api-key,DB_PASS=db-password:2"
```

Format: `ENV_VAR=SECRET_ID` or `ENV_VAR=SECRET_ID:VERSION` (the version defaults to `latest`). Access in code via `os.environ.get("API_KEY")`.
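A quick sketch of how the `SECRETS` string decomposes (illustrative only — the real parsing happens in the deploy tooling, not in this helper):

```python
def parse_secrets(spec: str) -> dict:
    """Parse 'ENV_VAR=SECRET_ID[:VERSION],...' into {env_var: (secret_id, version)}."""
    parsed = {}
    for item in spec.split(","):
        env_var, _, secret = item.partition("=")
        secret_id, _, version = secret.partition(":")
        parsed[env_var] = (secret_id, version or "latest")  # version defaults to "latest"
    return parsed

print(parse_secrets("API_KEY=my-api-key,DB_PASS=db-password:2"))
# → {'API_KEY': ('my-api-key', 'latest'), 'DB_PASS': ('db-password', '2')}
```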
Observability
See the adk-observability-guide skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).
Testing Your Deployed Agent
Agent Engine Deployment
Option 1: Testing Notebook
```bash
jupyter notebook notebooks/adk_app_testing.ipynb
```
Option 2: Python Script
```python
import asyncio
import json

import vertexai

with open("deployment_metadata.json") as f:
    engine_id = json.load(f)["remote_agent_engine_id"]

client = vertexai.Client(location="us-central1")
agent = client.agent_engines.get(name=engine_id)

async def main():
    async for event in agent.async_stream_query(message="Hello!", user_id="test"):
        print(event)

asyncio.run(main())
```
Option 3: Playground
Cloud Run Deployment
Auth required by default. Cloud Run deploys with `--no-allow-unauthenticated`, so all requests need an `Authorization` header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with `--allow-unauthenticated`.
```bash
SERVICE_URL="https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app"
AUTH="Authorization: Bearer $(gcloud auth print-identity-token)"

# Test health endpoint
curl -H "$AUTH" "$SERVICE_URL/"

# Step 1: Create a session (required before sending messages)
curl -X POST "$SERVICE_URL/apps/app/users/test-user/sessions" \
  -H "Content-Type: application/json" \
  -H "$AUTH" \
  -d '{}'
# → returns JSON with "id" — use this as SESSION_ID below

# Step 2: Send a message via SSE streaming
curl -X POST "$SERVICE_URL/run_sse" \
  -H "Content-Type: application/json" \
  -H "$AUTH" \
  -d '{
    "app_name": "app",
    "user_id": "test-user",
    "session_id": "SESSION_ID",
    "new_message": {"role": "user", "parts": [{"text": "Hello!"}]}
  }'
```
Common mistake: Sending `{"message": "Hello!", "user_id": "...", "session_id": "..."}` returns a 422 validation error. The ADK HTTP server expects the `app_name`/`new_message` schema shown above, and the session must already exist.
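Building the body programmatically avoids the flat-`message` mistake. A small sketch (field names match the curl example above; the helper itself is hypothetical):

```python
import json

def run_sse_body(text: str, session_id: str, user_id: str = "test-user",
                 app_name: str = "app") -> str:
    """Build the JSON body the ADK HTTP server expects at /run_sse."""
    return json.dumps({
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,  # must reference an already-created session
        # role + parts structure, not a flat "message" string
        "new_message": {"role": "user", "parts": [{"text": text}]},
    })

print(run_sse_body("Hello!", session_id="SESSION_ID"))
```

Pass the result as the `-d` payload of the curl call (or the body of an HTTP client request).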
Load Tests
See `tests/load_test/README.md` for configuration, default settings, and CI/CD integration details.
Deploying with a UI (IAP)
To expose your agent with a web UI protected by Google identity authentication:
```bash
# Deploy with IAP (built-in framework UI)
make deploy IAP=true

# Deploy with custom frontend on a different port
make deploy IAP=true PORT=5173
```
IAP (Identity-Aware Proxy) secures the Cloud Run service — only authorized Google accounts can access it. After deploying, grant user access via the Cloud Console IAP settings.
For Agent Engine with a custom frontend, use a decoupled deployment — deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Engine backend API.
Rollback & Recovery
The primary rollback mechanism is git-based: fix the issue, commit, and push to `main`. The CI/CD pipeline will automatically build and deploy the new version through staging → production.
For immediate Cloud Run rollback without a new commit, use revision traffic shifting:

```bash
gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=REVISION_NAME=100 --region=REGION
```
Agent Engine doesn't support revision-based rollback — fix and redeploy via `make deploy`.
Custom Infrastructure (Terraform)
For custom infrastructure patterns (Pub/Sub, BigQuery, Eventarc, Cloud SQL, IAM), consult `references/terraform-patterns.md` for:
- Where to put custom Terraform files (dev vs CI/CD)
- Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
- IAM bindings for custom resources
- Terraform state management (remote vs local, importing resources)
- Common infrastructure patterns
Troubleshooting
| Issue | Solution |
|---|---|
| Terraform state locked | `terraform force-unlock -force LOCK_ID` in `deployment/terraform/` |
| GitHub Actions auth failed | Re-run `terraform apply` in the CI/CD Terraform dir; verify the WIF pool/provider |
| Cloud Build authorization pending | Use the `github_actions` runner instead |
| Resource already exists | `terraform import` the resource (see `references/terraform-patterns.md`) |
| Agent Engine deploy timeout / hangs | Deployments take 5-10 min; check if the engine was created (see Agent Engine Specifics) |
| Secret not available | Verify `roles/secretmanager.secretAccessor` is granted to the app SA (not the default compute SA) |
| 403 on deploy | Check `deployment/terraform/iam.tf` — the CI/CD SA needs deployment + SA impersonation roles in the target project |
| 403 when testing Cloud Run | Default is `--no-allow-unauthenticated`; include an `Authorization: Bearer $(gcloud auth print-identity-token)` header |
| Cold starts too slow | Set a minimum instance count in the Cloud Run Terraform config |
| Cloud Run 503 errors | Check resource limits (memory/CPU), raise them if needed, or check container crash logs |
| 403 right after granting IAM role | IAM propagation is not instant — wait a couple of minutes before retrying; don't keep re-granting the same role |
| Resource seems missing but Terraform created it | Run `terraform state list` to see what Terraform actually manages; some resources (e.g., BQ linked datasets) won't appear in CLI output |
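The IAM-propagation row above suggests waiting and retrying rather than re-granting. A generic backoff sketch (`check` is a stand-in for whichever call was returning 403):

```python
import time

def wait_for_iam(check, attempts: int = 5, base_delay: float = 1.0) -> bool:
    """Retry a permission check with exponential backoff while IAM propagates."""
    for attempt in range(attempts):
        if check():
            return True
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return False

# Demo with a stand-in check that starts passing on the third call
calls = {"n": 0}
def fake_check():
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_for_iam(fake_check, base_delay=0.01))
# → True
```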