adk-deploy-guide
ADK Deployment Guide
Scaffolded project? Use the `make` commands throughout this guide; they wrap Terraform, Docker, and deployment into a tested pipeline.
No scaffold? See Quick Deploy below, or the ADK deployment docs. For production infrastructure, scaffold with `/adk-scaffold`.
Reference Files
For deeper details, consult these reference files in `references/`:
- `cloud-run.md`: Scaling defaults, Dockerfile, session types, networking
- `agent-engine.md`: deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
- `terraform-patterns.md`: Custom infrastructure, IAM, state management, importing resources
- `event-driven.md`: Pub/Sub, Eventarc, BigQuery Remote Function triggers via custom `fast_api_app.py` endpoints
Observability: See the adk-observability-guide skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.
Deployment Target Decision Matrix
Choose the right deployment target based on your requirements:
| Criteria | Agent Engine | Cloud Run | GKE |
|---|---|---|---|
| Languages | Python | Python | Python (+ others via custom containers) |
| Scaling | Managed auto-scaling (configurable min/max, concurrency) | Fully configurable (min/max instances, concurrency, CPU allocation) | Full Kubernetes scaling (HPA, VPA, node auto-provisioning) |
| Networking | VPC-SC and PSC supported | Full VPC support, direct VPC egress, IAP, ingress rules | Full Kubernetes networking |
| Session state | Native | In-memory (dev), Cloud SQL, or Agent Engine session backend | Custom (any Kubernetes-compatible store) |
| Batch/event processing | Not supported | Supported via custom endpoints (see `event-driven.md`) | Custom (Kubernetes Jobs, Pub/Sub) |
| Cost model | vCPU-hours + memory-hours (not billed when idle) | Per-instance-second + min instance costs | Node pool costs (always-on or auto-provisioned) |
| Setup complexity | Lower (managed, purpose-built for agents) | Medium (Dockerfile, Terraform, networking) | Higher (Kubernetes expertise required) |
| Best for | Managed infrastructure, minimal ops | Custom infra, event-driven workloads | Full control, open models, GPU workloads |
Ask the user which deployment target fits their needs. Each is a valid production choice with different trade-offs.
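As a rough illustration, the matrix above can be read as a handful of rules. The sketch below is a hypothetical helper: the function name, parameter names, and rule ordering are assumptions for illustration and are not part of ADK. It only suggests a starting point; the user still makes the final call.

```python
def suggest_target(needs_gpu: bool = False,
                   needs_event_processing: bool = False,
                   wants_minimal_ops: bool = True) -> str:
    """Suggest a deployment target from the decision matrix (illustrative only)."""
    # GPU workloads and open models appear in GKE's "Best for" row.
    if needs_gpu:
        return "gke"
    # Agent Engine does not support batch/event processing, and the matrix
    # lists event-driven workloads under Cloud Run.
    if needs_event_processing:
        return "cloud_run"
    # Managed infrastructure with minimal ops is Agent Engine's "Best for" row.
    if wants_minimal_ops:
        return "agent_engine"
    # Otherwise fall back to Cloud Run for custom infrastructure.
    return "cloud_run"
```

Treat the result as a conversation starter when asking the user, not as a decision.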
Quick Deploy (ADK CLI)
For projects without Agent Starter Pack scaffolding. No Makefile, Terraform, or Dockerfile required.
```bash
# Cloud Run
adk deploy cloud_run --project=PROJECT --region=REGION path/to/agent/
# Agent Engine
adk deploy agent_engine --project=PROJECT --region=REGION path/to/agent/
# GKE (requires existing cluster)
adk deploy gke --project=PROJECT --cluster_name=CLUSTER --region=REGION path/to/agent/
```
All commands support `--with_ui` to deploy the ADK dev UI. Cloud Run also accepts extra `gcloud` flags after `--` (e.g., `-- --no-allow-unauthenticated`).
See `adk deploy --help` or the [ADK deployment docs](https://google.github.io/adk-docs/deploy/) for full flag reference.
> For CI/CD, observability, or production infrastructure, scaffold with `/adk-scaffold` and use the sections below.
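When the Quick Deploy commands are driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch that only builds the command and does not run it. The helper name `adk_deploy_command` is hypothetical; the flags are limited to the ones shown in this section.

```python
import shlex


def adk_deploy_command(target, project, region, agent_path,
                       cluster_name=None, with_ui=False):
    """Build the argv list for `adk deploy <target>` without executing it."""
    if target not in ("cloud_run", "agent_engine", "gke"):
        raise ValueError(f"unknown target: {target}")
    cmd = ["adk", "deploy", target, f"--project={project}"]
    if target == "gke":
        # GKE deployments need an existing cluster.
        if not cluster_name:
            raise ValueError("gke requires an existing cluster name")
        cmd.append(f"--cluster_name={cluster_name}")
    cmd.append(f"--region={region}")
    if with_ui:
        cmd.append("--with_ui")
    cmd.append(agent_path)
    return cmd


# Print the command line for review before running it by hand.
print(shlex.join(adk_deploy_command("cloud_run", "PROJECT", "REGION", "path/to/agent/")))
```

Passing the returned list to `subprocess.run` would execute the deployment, so keep a human review step in front of it.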
---
Dev Environment Setup & Deploy (Scaffolded Projects)
Setting Up Dev Infrastructure (Optional)
`make setup-dev-env` runs `terraform apply` in `deployment/terraform/dev/` to provision:
- Service accounts (`app_sa` for the agent, used for runtime permissions)
- Artifact Registry repository (for container images)
- IAM bindings (granting the app SA necessary roles)
- Telemetry resources (Cloud Logging bucket, BigQuery dataset)
- Any custom resources defined in `deployment/terraform/dev/`
This step is optional: `make deploy` works without it (Cloud Run creates the service on the fly via `gcloud run deploy --source .`). However, running it gives you proper service accounts, observability, and IAM setup.
```bash
make setup-dev-env
```
Deploying
- Notify the human: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
- Wait for explicit approval
- Once approved: `make deploy`
IMPORTANT: Never run `make deploy` without explicit human approval.
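The approval rule can be enforced in tooling as well as in process. Below is a minimal sketch of such a gate, assuming a hypothetical wrapper function `deploy_if_approved`; it is not part of the scaffolded project.

```python
import subprocess


def deploy_if_approved(approved: bool, run=subprocess.run):
    """Run `make deploy` only after explicit human approval."""
    if approved is not True:
        # Anything other than an explicit True counts as "not approved".
        raise PermissionError("make deploy requires explicit human approval")
    # check=True surfaces a failed deployment as an exception.
    return run(["make", "deploy"], check=True)
```

The `run` parameter exists so the gate can be exercised in tests without triggering a real deployment.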
Production Deployment: CI/CD Pipeline
Best for: Production applications, teams requiring staging → production promotion.
Prerequisites:
- Project must NOT be in a gitignored folder
- User must provide staging and production GCP project IDs
- GitHub repository name and owner
Steps:
- If prototype, first add Terraform/CI-CD files using the Agent Starter Pack CLI (see `/adk-scaffold` for full options):
```bash
uvx agent-starter-pack enhance . --cicd-runner github_actions -y -s
```
- Ensure you're logged in to GitHub CLI:
```bash
gh auth login  # (skip if already authenticated)
```
- Run setup-cicd:
```bash
uvx agent-starter-pack setup-cicd \
  --staging-project YOUR_STAGING_PROJECT \
  --prod-project YOUR_PROD_PROJECT \
  --repository-name YOUR_REPO_NAME \
  --repository-owner YOUR_GITHUB_USERNAME \
  --auto-approve \
  --create-repository
```
- Push code to trigger deployments
Key setup-cicd Flags
| Flag | Description |
|---|---|
| `--staging-project` | GCP project ID for staging environment |
| `--prod-project` | GCP project ID for production environment |
| `--repository-name` / `--repository-owner` | GitHub repository name and owner |
| `--auto-approve` | Skip Terraform plan confirmation prompts |
| `--create-repository` | Create the GitHub repo if it doesn't exist |
|  | Separate GCP project for CI/CD infrastructure. Defaults to prod project |
|  | Store Terraform state locally instead of in GCS (see |
Run `uvx agent-starter-pack setup-cicd --help` for the full flag reference (Cloud Build options, dev project, region, etc.).
Choosing a CI/CD Runner
| Runner | Pros | Cons |
|---|---|---|
| github_actions (Default) | No PAT needed, uses GitHub CLI authentication | Requires GitHub CLI authentication |
| google_cloud_build | Native GCP integration | Requires interactive browser authorization (or PAT + app installation ID for programmatic mode) |
How Authentication Works (WIF)
Both runners use Workload Identity Federation (WIF): GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants `cicd_runner_sa` impersonation. No long-lived service account keys needed. Terraform in `setup-cicd` creates the pool, provider, and SA bindings automatically. If auth fails, re-run `terraform apply` in the CI/CD Terraform directory.
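To make the trust relationship more concrete, the sketch below builds the IAM member string that represents "any OIDC token issued for one GitHub repository" in a Workload Identity Pool. The helper name, the pool ID, and the `attribute.repository` mapping are assumptions for illustration; the Terraform generated by `setup-cicd` may use different names.

```python
def github_principal_set(project_number: str, pool_id: str,
                         repo_owner: str, repo_name: str) -> str:
    """IAM member string matching OIDC tokens issued for one GitHub repository."""
    return (
        "principalSet://iam.googleapis.com/"
        f"projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/"
        # Assumes the provider maps the token's repository claim
        # to `attribute.repository`.
        f"attribute.repository/{repo_owner}/{repo_name}"
    )
```

Granting `roles/iam.workloadIdentityUser` on `cicd_runner_sa` to a member of this form is what lets the pipeline impersonate the service account without a stored key.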
CI/CD Pipeline Stages
The pipeline has three stages:
- CI (PR checks): Triggered on pull request. Runs unit and integration tests.
- Staging CD: Triggered on merge to `main`. Builds container, deploys to staging, runs load tests.
- Production CD: Triggered after successful staging deploy. Requires manual approval before deploying to production.
IMPORTANT: `setup-cicd` creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:
```bash
git add . && git commit -m "Initial agent implementation"
git push origin main
```
To approve production deployment:
```bash
# GitHub Actions: Approve via repository Actions tab (environment protection rules)
# Cloud Build: Find pending build and approve
gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"
gcloud builds approve BUILD_ID --project=PROD_PROJECT
```
---
Cloud Run Specifics
For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see `references/cloud-run.md`. For ADK docs on Cloud Run deployment, fetch `https://google.github.io/adk-docs/deploy/cloud-run/index.md` via WebFetch.
Agent Engine Specifics
Agent Engine is a managed Vertex AI service for deploying Python ADK agents. It uses source-based deployment (no Dockerfile) via `deploy.py` and the `AdkApp` class.
No `gcloud` CLI exists for Agent Engine. Deploy via `deploy.py` or `adk deploy agent_engine`. Query via the Python `vertexai.Client` SDK.
Deployments can take 5-10 minutes. If `make deploy` times out, check if the engine was created and manually populate `deployment_metadata.json` with the engine resource ID (see reference for details).
For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see `references/agent-engine.md`. For ADK docs on Agent Engine deployment, fetch `https://google.github.io/adk-docs/deploy/agent-engine/index.md` via WebFetch.
Service Account Architecture
Scaffolded projects use two service accounts:
- `app_sa` (per environment): Runtime identity for the deployed agent. Roles defined in `deployment/terraform/iam.tf`.
- `cicd_runner_sa` (CI/CD project): CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in both staging and prod projects.
Check `deployment/terraform/iam.tf` for exact role bindings. Cross-project permissions (Cloud Run service agents, artifact registry access) are also configured there.
Common 403 errors:
- "Permission denied on Cloud Run" → `cicd_runner_sa` missing deployment role in the target project
- "Cannot act as service account" → Missing `iam.serviceAccountUser` binding on `app_sa`
- "Secret access denied" → `app_sa` missing `secretmanager.secretAccessor`
- "Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project
Secret Manager (for API Credentials)
Instead of passing sensitive keys as environment variables, use GCP Secret Manager.
```bash
# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-
# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-
```
**Grant access:** For Cloud Run, grant `secretmanager.secretAccessor` to `app_sa`. For Agent Engine, grant it to the platform-managed SA (`service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com`).
**Pass secrets at deploy time (Agent Engine):**
```bash
make deploy SECRETS="API_KEY=my-api-key,DB_PASS=db-password:2"
```
Format: `ENV_VAR=SECRET_ID` or `ENV_VAR=SECRET_ID:VERSION` (defaults to latest). Access in code via `os.environ.get("API_KEY")`.
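To illustrate the `SECRETS` format, here is a minimal parsing sketch. The function `parse_secrets` is hypothetical and is not the Makefile's or `deploy.py`'s actual parser; it only shows how `ENV_VAR=SECRET_ID[:VERSION]` entries break down, with the version defaulting to `latest`.

```python
def parse_secrets(spec: str) -> dict:
    """Parse "ENV=SECRET_ID[:VERSION],..." into {env_var: (secret_id, version)}."""
    result = {}
    for item in filter(None, (part.strip() for part in spec.split(","))):
        env_var, sep, ref = item.partition("=")
        if not sep or not env_var or not ref:
            raise ValueError(f"expected ENV_VAR=SECRET_ID[:VERSION], got {item!r}")
        # An omitted version means "use the latest secret version".
        secret_id, _, version = ref.partition(":")
        result[env_var] = (secret_id, version or "latest")
    return result
```

For the example above, `parse_secrets("API_KEY=my-api-key,DB_PASS=db-password:2")` pairs `API_KEY` with the latest version of `my-api-key` and `DB_PASS` with version `2` of `db-password`.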
Observability
See the adk-observability-guide skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).
Testing Your Deployed Agent
Agent Engine Deployment
Option 1: Testing Notebook
```bash
jupyter notebook notebooks/adk_app_testing.ipynb
```
Option 2: Python Script
```python
import asyncio
import json

import vertexai

# Load the engine resource ID written at deploy time.
with open("deployment_metadata.json") as f:
    engine_id = json.load(f)["remote_agent_engine_id"]

client = vertexai.Client(location="us-central1")
agent = client.agent_engines.get(name=engine_id)


async def main():
    # `async for` must run inside a coroutine when used in a plain script.
    async for event in agent.async_stream_query(message="Hello!", user_id="test"):
        print(event)


asyncio.run(main())
```
Option 3: Playground
```bash
make playground
```
Cloud Run Deployment
Auth required by default. Cloud Run deploys with `--no-allow-unauthenticated`, so all requests need an `Authorization: Bearer` header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with `--allow-unauthenticated`.
```bash
# Test health endpoint
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/health
# Test SSE streaming endpoint (ADK HTTP mode)
curl -X POST "https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app/run_sse" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -d '{"message": "Hello!", "user_id": "test", "session_id": "test-session"}'
```
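The same authenticated call can be prepared from Python. The sketch below only builds the request object and does not send it. The helper name `build_run_sse_request` is hypothetical, and the payload simply mirrors the curl example above; the identity token is assumed to come from `gcloud auth print-identity-token`.

```python
import json
import urllib.request


def build_run_sse_request(base_url: str, identity_token: str,
                          message: str, user_id: str, session_id: str):
    """Build (but do not send) an authenticated POST to the /run_sse endpoint."""
    payload = {"message": message, "user_id": user_id, "session_id": session_id}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/run_sse",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Without this header, Cloud Run answers 403 by default.
            "Authorization": f"Bearer {identity_token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request.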
Load Tests
```bash
make load-test
```
See `tests/load_test/README.md` for configuration, default settings, and CI/CD integration details.
Deploying with a UI (IAP)
To expose your agent with a web UI protected by Google identity authentication:
```bash
# Deploy with IAP (built-in framework UI)
make deploy IAP=true
# Deploy with custom frontend on a different port
make deploy IAP=true PORT=5173
```
IAP (Identity-Aware Proxy) secures the Cloud Run service: only authorized Google accounts can access it. After deploying, grant user access via the [Cloud Console IAP settings](https://cloud.google.com/run/docs/securing/identity-aware-proxy-cloud-run#manage_user_or_group_access).
For Agent Engine with a custom frontend, use a **decoupled deployment**: deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Engine backend API.
---
Rollback & Recovery
The primary rollback mechanism is git-based: fix the issue, commit, and push to `main`. The CI/CD pipeline will automatically build and deploy the new version through staging → production.
For immediate Cloud Run rollback without a new commit, use revision traffic shifting:
```bash
gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=REVISION_NAME=100 --region=REGION
```
Agent Engine doesn't support revision-based rollback; fix and redeploy via `make deploy`.
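The difference between the two targets can be summarized in a small helper. This is a sketch only: `rollback_command` is a hypothetical name, and it returns the command to review rather than executing anything.

```python
def rollback_command(target: str, service: str = "",
                     revision: str = "", region: str = "") -> list:
    """Return the argv for an immediate rollback on the given target."""
    if target == "cloud_run":
        if not (service and revision and region):
            raise ValueError("cloud_run rollback needs service, revision, and region")
        # Shift 100% of traffic back to a known-good revision.
        return ["gcloud", "run", "services", "update-traffic", service,
                f"--to-revisions={revision}=100", f"--region={region}"]
    if target == "agent_engine":
        # No revision-based rollback: fix the code and redeploy
        # (still subject to human approval).
        return ["make", "deploy"]
    raise ValueError(f"unknown target: {target}")
```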
Custom Infrastructure (Terraform)
For custom infrastructure patterns (Pub/Sub, BigQuery, Eventarc, Cloud SQL, IAM), consult `references/terraform-patterns.md` for:
- Where to put custom Terraform files (dev vs CI/CD)
- Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
- IAM bindings for custom resources
- Terraform state management (remote vs local, importing resources)
- Common infrastructure patterns
Troubleshooting
| Issue | Solution |
|---|---|
| Terraform state locked | Unlock the state from the `deployment/terraform/` directory |
| GitHub Actions auth failed | Re-run `terraform apply` in the CI/CD Terraform directory |
| Cloud Build authorization pending | Use the `github_actions` runner instead |
| Resource already exists | Import the existing resource into Terraform state (see `references/terraform-patterns.md`) |
| Agent Engine deploy timeout / hangs | Deployments take 5-10 min; check if engine was created (see Agent Engine Specifics) |
| Secret not available | Verify `secretmanager.secretAccessor` is granted to the runtime service account (see Secret Manager) |
| 403 on deploy | Check `deployment/terraform/iam.tf` for the `cicd_runner_sa` role bindings in the target project |
| 403 when testing Cloud Run | Default is `--no-allow-unauthenticated`; send an `Authorization: Bearer` header with an identity token |
| Cold starts too slow | Set a minimum instance count above zero in the Cloud Run Terraform config |
| Cloud Run 503 errors | Check resource limits (memory/CPU) and increase them if they are being exhausted |