google-cloud-waf-performance-optimization
Google Cloud Well-Architected Framework skill for the Performance Optimization pillar
Overview
The Performance Optimization pillar of the Google Cloud Well-Architected
Framework provides principles and recommendations to help you design, build, and
operate high-performing workloads. It focuses on efficiently allocating
resources, leveraging modular architectures, and using data-driven insights to
continuously monitor and improve performance as your business needs evolve.
Core principles
The recommendations in the Performance Optimization pillar of the
Well-Architected Framework are aligned with the following core principles:
- Plan resource allocation: Carefully select and configure the compute, storage, and networking resources that best match the specific requirements of your workload. Grounding document: https://docs.cloud.google.com/architecture/framework/performance-optimization/plan-resource-allocation
- Take advantage of elasticity: Utilize automated scaling and serverless technologies to dynamically adjust resource capacity in response to real-time demand fluctuations. Grounding document: https://docs.cloud.google.com/architecture/framework/performance-optimization/elasticity
- Promote modular design: Architect systems using independent, loosely coupled components to enhance scalability and allow individual parts to be optimized without affecting the entire system. Grounding document: https://docs.cloud.google.com/architecture/framework/performance-optimization/promote-modular-design
- Continuously monitor and improve performance: Implement robust observability to identify bottlenecks and use performance data to drive iterative enhancements throughout the software development lifecycle. Grounding document: https://docs.cloud.google.com/architecture/framework/performance-optimization/continuously-monitor-and-improve-performance
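To make the elasticity principle concrete, the following is a minimal sketch of a target-tracking scaling rule. It follows the formula that the Kubernetes Horizontal Pod Autoscaler documents, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, with a tolerance band that suppresses small adjustments. The function name, parameter names, and the replica limits are hypothetical and chosen for illustration; other autoscalers (for example, managed instance groups or Cloud Run) apply their own algorithms.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Return the replica count suggested by a target-tracking rule.

    All names and default limits here are illustrative assumptions.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU with a 60% target.
    print(desired_replicas(4, current_metric=90, target_metric=60))
```

The clamp to `min_replicas` and `max_replicas` reflects a common design choice: the lower bound protects baseline latency, and the upper bound caps cost during unexpected spikes.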
Relevant Google Cloud products
The following are examples of Google Cloud products and features that are
relevant to performance optimization:
- Compute and scaling
  - Compute Engine (MIGs): Managed instance groups that support autoscaling and load balancing for VM-based workloads.
  - Google Kubernetes Engine (GKE): Provides container orchestration with horizontal and vertical pod autoscaling.
  - Cloud Run: A fully managed serverless platform that automatically scales container instances up or down, including to zero, based on traffic.
- Data and caching
  - Cloud CDN: Low-latency content delivery network to cache static and dynamic content closer to end-users.
  - Memorystore: Managed in-memory data store for Valkey and Redis to provide sub-millisecond data access.
  - Bigtable: NoSQL database service for analytical and operational workloads requiring low latency and high throughput.
  - Spanner: RDBMS that provides global consistency, high availability, and horizontal scaling for mission-critical transactional applications.
- Performance analysis and monitoring
  - Cloud Trace: Distributed tracing system that helps identify latency bottlenecks.
  - Cloud Profiler: Continuous CPU and memory profiling to identify resource-heavy application code.
  - Cloud Monitoring: Provides dashboards and alerts based on performance KPIs like latency and throughput.
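As an illustration of how an in-memory cache offloads a slower backend, here is a minimal cache-aside sketch. It is not a Memorystore client: a plain Python dictionary with expiry timestamps stands in for the managed store, and `TTLCache`, `read_through`, and `load_from_backend` are hypothetical names introduced for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-process stand-in for an in-memory store with per-entry expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, cached value).
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: evict it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def read_through(cache: TTLCache, key: str,
                 load_from_backend: Callable[[str], str]) -> str:
    """Cache-aside read: try the cache first, fall back to the backend."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_backend(key)
    cache.set(key, value)
    return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)
    slow_lookup = lambda key: f"profile-for-{key}"  # hypothetical backend call
    print(read_through(cache, "user-1", slow_lookup))  # miss, loads from backend
    print(read_through(cache, "user-1", slow_lookup))  # hit, served from cache
```

The TTL bounds how stale a cached value can become; a shorter TTL trades some backend load for fresher data.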
Workload assessment questions
Ask appropriate questions to understand the performance-related requirements and
constraints of the workload and the user's organization. Choose questions from
the following list:
- Plan resource allocation
  - When initially provisioning compute resources for a new application, which approach do you use to determine the required capacity for expected peak loads?
  - Which caching strategies (browser, in-memory, CDN, database) do you utilize to improve performance and responsiveness?
  - How do you optimize the performance of your data storage solutions (e.g., SSD vs HDD, storage classes) for your applications?
- Promote modular design
  - Which architectural patterns (microservices, asynchronous messaging, stateless servers) do you employ to enhance performance and resilience?
  - How do you design your application to minimize the impact of failures in one part of the system on other parts?
- Continuously monitor and improve performance
  - How frequently do you review and analyze the performance of your production applications and infrastructure?
  - Which tools or techniques (APM, distributed tracing, load testing) do you use to proactively identify and diagnose performance bottlenecks?
  - How do you incorporate performance considerations into your software development lifecycle (SDLC)?
- Take advantage of elasticity
  - Which methods do you use to manage and optimize the cost of your cloud resources while maintaining performance?
  - How do you typically handle sudden spikes in traffic or workload on your applications?
Validation checklist
Use the following checklist to evaluate the architecture's alignment with
performance optimization recommendations:
- Resource allocation
  - Initial provisioning is based on load testing or historical data rather than general estimates.
  - Caching is implemented at multiple layers (CDN, in-memory, or browser) to offload backend systems.
  - Storage types (SSD/HDD) and classes are selected based on the specific I/O requirements of the workload.
- Modular design
  - The architecture uses microservices or decoupled components to allow independent scaling.
  - Circuit breakers or bulkheads are implemented to isolate failures and prevent performance degradation across the system.
- Monitoring and continuous improvement
  - Automated dashboards and alerts are configured for key performance indicators (KPIs).
  - Distributed tracing and profiling tools are used to identify code-level bottlenecks.
  - Performance testing (unit and integration) is integrated into the software development lifecycle.
- Elasticity
  - Auto-scaling rules are configured and validated to handle variable demand.
  - The architecture leverages serverless or managed services to dynamically match capacity to load.
  - Resource utilization is reviewed regularly to eliminate idle overhead and balance cost with performance.
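The circuit breaker item in the checklist can be sketched as follows. This is a simplified, single-threaded illustration of the pattern, not a production library: `CircuitBreaker`, `CircuitOpenError`, and the default threshold and timeout values are hypothetical choices for this example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                # Fail fast instead of waiting on an unhealthy dependency.
                raise CircuitOpenError("circuit is open; call skipped")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func()
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self._failure_threshold:
                # Open (or re-open) the circuit and restart the timeout.
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


if __name__ == "__main__":
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)

    def flaky() -> str:
        raise ConnectionError("backend unavailable")  # simulated failure

    for attempt in range(3):
        try:
            breaker.call(flaky)
        except CircuitOpenError as exc:
            print(f"attempt {attempt}: rejected ({exc})")
        except ConnectionError as exc:
            print(f"attempt {attempt}: failed ({exc})")
```

Rejecting calls while the circuit is open keeps a slow or failing dependency from tying up callers, which is how the pattern limits performance degradation in the rest of the system.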