application-performance-performance-optimization
Optimize application performance end-to-end using specialized performance and optimization agents:
[Extended thinking: This workflow orchestrates a comprehensive performance optimization process across the entire application stack. Starting with deep profiling and baseline establishment, the workflow progresses through targeted optimizations in each system layer, validates improvements through load testing, and establishes continuous monitoring for sustained performance. Each phase builds on insights from previous phases, creating a data-driven optimization strategy that addresses real bottlenecks rather than theoretical improvements. The workflow emphasizes modern observability practices, user-centric performance metrics, and cost-effective optimization strategies.]
Use this skill when
- Coordinating performance optimization across backend, frontend, and infrastructure
- Establishing baselines and profiling to identify bottlenecks
- Designing load tests, performance budgets, or capacity plans
- Building observability for performance and reliability targets
Do not use this skill when
- The task is a small localized fix with no broader performance goals
- There is no access to metrics, tracing, or profiling data
- The request is unrelated to performance or scalability
Instructions
- Confirm performance goals, constraints, and target metrics.
- Establish baselines with profiling, tracing, and real-user data.
- Execute phased optimizations across the stack with measurable impact.
- Validate improvements and set guardrails to prevent regressions.
Safety
- Avoid load testing production without approvals and safeguards.
- Roll out performance changes gradually with rollback plans.
Phase 1: Performance Profiling & Baseline
1. Comprehensive Performance Profiling
- Use Task tool with subagent_type="performance-engineer"
- Prompt: "Profile application performance comprehensively for: $ARGUMENTS. Generate flame graphs for CPU usage, heap dumps for memory analysis, trace I/O operations, and identify hot paths. Use APM tools like Datadog or New Relic if available. Include database query profiling, API response times, and frontend rendering metrics. Establish performance baselines for all critical user journeys."
- Context: Initial performance investigation
- Output: Detailed performance profile with flame graphs, memory analysis, bottleneck identification, baseline metrics
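As a small-scale illustration of the hot-path identification requested in this step, the sketch below uses Python's standard-library `cProfile` and `pstats` modules. The functions `handle_request` and `slow_sum` are hypothetical stand-ins for application code, and a real investigation would more likely rely on a flame-graph profiler or the APM tools named in the prompt.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Hypothetical hot path: a deliberately naive loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handle_request() -> int:
    """Hypothetical request handler that calls the hot path."""
    return slow_sum(200_000)


def profile_top(func, limit: int = 5) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


# The report lists the functions that dominate cumulative time.
report = profile_top(handle_request)
```

Saving such a report before any optimization gives a baseline that later runs can be compared against.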
2. Observability Stack Assessment
- Use Task tool with subagent_type="observability-engineer"
- Prompt: "Assess current observability setup for: $ARGUMENTS. Review existing monitoring, distributed tracing with OpenTelemetry, log aggregation, and metrics collection. Identify gaps in visibility, missing metrics, and areas needing better instrumentation. Recommend APM tool integration and custom metrics for business-critical operations."
- Context: Performance profile from step 1
- Output: Observability assessment report, instrumentation gaps, monitoring recommendations
3. User Experience Analysis
- Use Task tool with subagent_type="performance-engineer"
- Prompt: "Analyze user experience metrics for: $ARGUMENTS. Measure Core Web Vitals (LCP, INP, CLS), page load times, time to interactive, and perceived performance. Use Real User Monitoring (RUM) data if available. Identify user journeys with poor performance and their business impact."
- Context: Performance baselines from step 1
- Output: UX performance report, Core Web Vitals analysis, user impact assessment
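Core Web Vitals are conventionally assessed at the 75th percentile of real-user samples. The sketch below rates LCP and CLS that way, assuming the commonly published "good" and "poor" boundaries. The sample values are hypothetical, and an interaction-latency metric could be added to `THRESHOLDS` in the same shape.

```python
import math

# (good, poor) boundaries; LCP in seconds, CLS unitless.
THRESHOLDS = {
    "LCP": (2.5, 4.0),
    "CLS": (0.1, 0.25),
}


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


def rate_metric(name, samples):
    """Return (p75, rating) for one metric."""
    good, poor = THRESHOLDS[name]
    p75 = percentile(samples, 75)
    if p75 <= good:
        return p75, "good"
    if p75 <= poor:
        return p75, "needs improvement"
    return p75, "poor"


# Hypothetical RUM samples for one user journey.
lcp_samples = [1.8, 2.1, 2.2, 2.4, 3.1, 3.9, 2.0, 2.3]
cls_samples = [0.05, 0.3, 0.3, 0.3]
```

Rating each user journey separately makes it easier to rank journeys by how far they fall outside the "good" range.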
Phase 2: Database & Backend Optimization
4. Database Performance Optimization
- Use Task tool with subagent_type="database-cloud-optimization::database-optimizer"
- Prompt: "Optimize database performance for: $ARGUMENTS based on profiling data: {context_from_phase_1}. Analyze slow query logs, create missing indexes, optimize execution plans, implement query result caching with Redis/Memcached. Review connection pooling, prepared statements, and batch processing opportunities. Consider read replicas and database sharding if needed."
- Context: Performance bottlenecks from phase 1
- Output: Optimized queries, new indexes, caching strategy, connection pool configuration
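To make the "create missing indexes" step concrete, the following sketch uses the standard-library `sqlite3` module and `EXPLAIN QUERY PLAN` to show how a plan changes once an index exists. The `orders` table, its columns, and the index name are hypothetical, and other databases expose their plans through different commands.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

sql = "SELECT id, total FROM orders WHERE customer_id = ?"

# Without an index the filter requires a full table scan.
plan_before = query_plan(conn, sql, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index the planner can search instead of scanning.
plan_after = query_plan(conn, sql, (42,))
```

Comparing the plan before and after is a cheap way to confirm that a new index is actually used by the slow query it was meant to fix.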
5. Backend Code & API Optimization
- Use Task tool with subagent_type="backend-development::backend-architect"
- Prompt: "Optimize backend services for: $ARGUMENTS targeting bottlenecks: {context_from_phase_1}. Implement efficient algorithms, add application-level caching, optimize N+1 queries, use async/await patterns effectively. Implement pagination, response compression, GraphQL query optimization, and batch API operations. Add circuit breakers and bulkheads for resilience."
- Context: Database optimizations from step 4, profiling data from phase 1
- Output: Optimized backend code, caching implementation, API improvements, resilience patterns
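Application-level caching can take many forms. One minimal sketch is an in-process cache with a time-to-live, shown below. The class name `TTLCache` and its interface are assumptions made for illustration. The sketch is not thread-safe and has no size bound, so a production service would more likely use an established library or an external cache such as Redis.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value
```

A handler could then call something like `cache.get_or_load(("user", user_id), lambda: load_user(user_id))`, where `load_user` is a hypothetical function that performs the expensive lookup.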
6. Microservices & Distributed System Optimization
- Use Task tool with subagent_type="performance-engineer"
- Prompt: "Optimize distributed system performance for: $ARGUMENTS. Analyze service-to-service communication, implement service mesh optimizations, optimize message queue performance (Kafka/RabbitMQ), reduce network hops. Implement distributed caching strategies and optimize serialization/deserialization."
- Context: Backend optimizations from step 5
- Output: Service communication improvements, message queue optimization, distributed caching setup
Phase 3: Frontend & CDN Optimization
7. Frontend Bundle & Loading Optimization
- Use Task tool with subagent_type="frontend-developer"
- Prompt: "Optimize frontend performance for: $ARGUMENTS targeting Core Web Vitals: {context_from_phase_1}. Implement code splitting, tree shaking, lazy loading, and dynamic imports. Optimize bundle sizes with webpack/rollup analysis. Implement resource hints (prefetch, preconnect, preload). Optimize critical rendering path and eliminate render-blocking resources."
- Context: UX analysis from phase 1, backend optimizations from phase 2
- Output: Optimized bundles, lazy loading implementation, improved Core Web Vitals
8. CDN & Edge Optimization
- Use Task tool with subagent_type="cloud-infrastructure::cloud-architect"
- Prompt: "Optimize CDN and edge performance for: $ARGUMENTS. Configure Cloudflare/CloudFront for optimal caching, implement edge functions for dynamic content, set up image optimization with responsive images and WebP/AVIF formats. Configure HTTP/2 and HTTP/3, implement Brotli compression. Set up geographic distribution for global users."
- Context: Frontend optimizations from step 7
- Output: CDN configuration, edge caching rules, compression setup, geographic optimization
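Edge caching rules often reduce to choosing a `Cache-Control` value per class of response. The sketch below shows one possible policy expressed as a plain function. The fingerprint naming convention, the `/api/` prefix, and the specific lifetimes are assumptions for illustration, not recommendations drawn from this workflow.

```python
import re

# Assumed convention: fingerprinted assets carry a hex content hash before
# the extension, for example app.3f9a1c2b.js.
FINGERPRINTED = re.compile(
    r"\.[0-9a-f]{8,}\.(js|css|woff2|png|jpg|webp|avif|svg)$"
)


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control header value for a request path."""
    if FINGERPRINTED.search(path):
        # Content-addressed files never change, so cache them for a year.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/"):
        # Assumed policy: dynamic API responses are not stored by caches.
        return "no-store"
    if path.endswith(".html") or path.endswith("/"):
        # HTML is revalidated on each request so new deploys show up at once.
        return "no-cache"
    # Everything else gets a short shared lifetime.
    return "public, max-age=3600"
```

The same decision table can usually be translated into the rule syntax of whichever CDN is in use.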
9. Mobile & Progressive Web App Optimization
- Use Task tool with subagent_type="frontend-mobile-development::mobile-developer"
- Prompt: "Optimize mobile experience for: $ARGUMENTS. Implement service workers for offline functionality, optimize for slow networks with adaptive loading. Reduce JavaScript execution time for mobile CPUs. Implement virtual scrolling for long lists. Optimize touch responsiveness and smooth animations. Consider React Native/Flutter specific optimizations if applicable."
- Context: Frontend optimizations from steps 7-8
- Output: Mobile-optimized code, PWA implementation, offline functionality
Phase 4: Load Testing & Validation
10. Comprehensive Load Testing
- Use Task tool with subagent_type="performance-engineer"
- Prompt: "Conduct comprehensive load testing for: $ARGUMENTS using k6/Gatling/Artillery. Design realistic load scenarios based on production traffic patterns. Test normal load, peak load, and stress scenarios. Include API testing, browser-based testing, and WebSocket testing if applicable. Measure response times, throughput, error rates, and resource utilization at various load levels."
- Context: All optimizations from phases 1-3
- Output: Load test results, performance under load, breaking points, scalability analysis
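The measurements named in this step (response times, throughput, error rate) can be illustrated with a tiny standard-library harness. This is only a sketch of what tools such as k6, Gatling, or Artillery report, not a replacement for them. `fake_endpoint` is a hypothetical stand-in for a real request.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(math.ceil(pct / 100 * len(sorted_values)), 1)
    return sorted_values[rank - 1]


def timed_call(target):
    """Call target once and return (duration_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        target()
        ok = True
    except Exception:
        ok = False
    return time.perf_counter() - start, ok


def run_load(target, total_requests, concurrency):
    """Issue total_requests calls with a fixed number of worker threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_call(target), range(total_requests)))
    elapsed = time.perf_counter() - started
    latencies = sorted(duration for duration, _ in results)
    errors = sum(1 for _, ok in results if not ok)
    return {
        "requests": total_requests,
        "error_rate": errors / total_requests,
        "throughput_rps": total_requests / elapsed,
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
        "p99_s": percentile(latencies, 99),
    }


def fake_endpoint():
    """Hypothetical stand-in for an HTTP call to the system under test."""
    time.sleep(0.002)


summary = run_load(fake_endpoint, total_requests=50, concurrency=5)
```

Running the same scenario at several concurrency levels shows where latency percentiles start to climb, which approximates the breaking point this step asks for.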
11. Performance Regression Testing
- Use Task tool with subagent_type="performance-testing-review::test-automator"
- Prompt: "Create automated performance regression tests for: $ARGUMENTS. Set up performance budgets for key metrics, integrate with CI/CD pipeline using GitHub Actions or similar. Create Lighthouse CI tests for frontend, API performance tests with Artillery, and database performance benchmarks. Implement automatic rollback triggers for performance regressions."
- Context: Load test results from step 10, baseline metrics from phase 1
- Output: Performance test suite, CI/CD integration, regression prevention system
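A regression gate in CI can be as simple as comparing current measurements with the stored baseline and failing the job when a metric worsens beyond a tolerance. The sketch below assumes every metric is "lower is better" and uses a 10% tolerance. The metric names and values are hypothetical.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose current value exceeds baseline by more than tolerance.

    Assumes lower is better for every metric. Metrics missing from
    `current` are skipped rather than treated as failures.
    """
    regressions = {}
    for name, base_value in baseline.items():
        if name not in current:
            continue
        limit = base_value * (1 + tolerance)
        if current[name] > limit:
            regressions[name] = {
                "baseline": base_value,
                "current": current[name],
                "limit": limit,
            }
    return regressions


# Hypothetical numbers: the API latency regressed, the other metrics did not.
baseline = {"api_p95_ms": 400.0, "lcp_p75_s": 2.0, "bundle_kb": 250.0}
current = {"api_p95_ms": 470.0, "lcp_p75_s": 2.1, "bundle_kb": 250.0}

regressions = find_regressions(baseline, current)

# A CI script would typically end with sys.exit(exit_code) to fail the job.
exit_code = 1 if regressions else 0
```

Keeping the tolerance explicit avoids failing builds on ordinary measurement noise while still catching meaningful slowdowns.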
Phase 5: Monitoring & Continuous Optimization
12. Production Monitoring Setup
- Use Task tool with subagent_type="observability-engineer"
- Prompt: "Implement production performance monitoring for: $ARGUMENTS. Set up APM with Datadog/New Relic/Dynatrace, configure distributed tracing with OpenTelemetry, implement custom business metrics. Create Grafana dashboards for key metrics, set up PagerDuty alerts for performance degradation. Define SLIs/SLOs for critical services with error budgets."
- Context: Performance improvements from all previous phases
- Output: Monitoring dashboards, alert rules, SLI/SLO definitions, runbooks
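An error budget follows directly from the SLO target: it is the fraction of time or requests allowed to fail within the window. The sketch below works through both forms, assuming a 30-day window by default. The function names are illustrative.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for an availability SLO."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    return window_days * 24 * 60 * (1 - slo_target)


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    Returns 1.0 when no traffic has been seen and a negative number
    when the budget is overspent.
    """
    if total_requests == 0:
        return 1.0
    allowed_failures = total_requests * (1 - slo_target)
    return 1 - failed_requests / allowed_failures
```

For example, a 99.9% availability target over 30 days allows about 43.2 minutes of downtime, and 250 failures out of 1,000,000 requests leaves roughly 75% of the budget.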
13. Continuous Performance Optimization
- Use Task tool with subagent_type="performance-engineer"
- Prompt: "Establish continuous optimization process for: $ARGUMENTS. Create performance budget tracking, implement A/B testing for performance changes, set up continuous profiling in production. Document optimization opportunities backlog, create capacity planning models, and establish regular performance review cycles."
- Context: Monitoring setup from step 12, all previous optimization work
- Output: Performance budget tracking, optimization backlog, capacity planning, review process
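One simple starting point for a capacity planning model is Little's law, which relates in-flight requests to arrival rate and latency (L = λW). The sketch below estimates an instance count from it. The parameter names, the per-instance concurrency figure, and the default headroom and utilization values are assumptions that would need to be replaced with measured numbers.

```python
import math


def required_instances(
    peak_rps: float,
    avg_latency_s: float,
    concurrency_per_instance: int,
    target_utilization: float = 0.7,
    headroom: float = 2.0,
) -> int:
    """Estimate instances needed to serve headroom x peak load.

    Little's law gives the number of requests in flight as
    arrival rate multiplied by average latency.
    """
    in_flight = peak_rps * headroom * avg_latency_s
    # Only a fraction of each instance's capacity is planned for use.
    usable_per_instance = concurrency_per_instance * target_utilization
    return math.ceil(in_flight / usable_per_instance)
```

With 500 requests per second at peak, 0.2 s average latency, and 32 concurrent requests per instance, the defaults give 9 instances. Real models should also account for latency growth under load, which this sketch ignores.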
Configuration Options
- performance_focus: "latency" | "throughput" | "cost" | "balanced" (default: "balanced")
- optimization_depth: "quick-wins" | "comprehensive" | "enterprise" (default: "comprehensive")
- tools_available: ["datadog", "newrelic", "prometheus", "grafana", "k6", "gatling"]
- budget_constraints: Set maximum acceptable costs for infrastructure changes
- user_impact_tolerance: "zero-downtime" | "maintenance-window" | "gradual-rollout"
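The options above could be modelled as a validated structure so that invalid values fail early. The sketch below is one way to do that with a dataclass. The class name, the default for `user_impact_tolerance` (the list gives none), and treating `budget_constraints` as a single number are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PERFORMANCE_FOCUS = {"latency", "throughput", "cost", "balanced"}
OPTIMIZATION_DEPTH = {"quick-wins", "comprehensive", "enterprise"}
USER_IMPACT_TOLERANCE = {"zero-downtime", "maintenance-window", "gradual-rollout"}


def _check(name: str, value: str, allowed: set) -> None:
    """Raise ValueError when value is not one of the allowed choices."""
    if value not in allowed:
        raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")


@dataclass
class OptimizationConfig:
    performance_focus: str = "balanced"
    optimization_depth: str = "comprehensive"
    tools_available: List[str] = field(default_factory=list)
    # Assumed representation: one maximum cost figure, or None for no limit.
    budget_constraints: Optional[float] = None
    # Assumed default; the option list does not specify one.
    user_impact_tolerance: str = "gradual-rollout"

    def __post_init__(self) -> None:
        _check("performance_focus", self.performance_focus, PERFORMANCE_FOCUS)
        _check("optimization_depth", self.optimization_depth, OPTIMIZATION_DEPTH)
        _check("user_impact_tolerance", self.user_impact_tolerance, USER_IMPACT_TOLERANCE)
        if self.budget_constraints is not None and self.budget_constraints < 0:
            raise ValueError("budget_constraints must not be negative")
```

Constructing `OptimizationConfig(performance_focus="latency", tools_available=["k6", "grafana"])` succeeds, while an unknown focus value raises `ValueError`.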
Success Criteria
- Response Time: P50 < 200ms, P95 < 1s, P99 < 2s for critical endpoints
- Core Web Vitals: LCP < 2.5s, INP < 200ms, CLS < 0.1
- Throughput: Support 2x current peak load with <1% error rate
- Database Performance: Query P95 < 100ms, no queries > 1s
- Resource Utilization: CPU < 70%, Memory < 80% under normal load
- Cost Efficiency: Performance per dollar improved by minimum 30%
- Monitoring Coverage: 100% of critical paths instrumented with alerting
Performance optimization target: $ARGUMENTS