shift-right-testing

Shift-Right Testing

<default_to_action> When testing in production or implementing progressive delivery:
  1. IMPLEMENT feature flags for progressive rollout (1% → 10% → 25% → 50% → 100%)
  2. DEPLOY with canary releases (compare metrics before full rollout)
  3. MONITOR with synthetic tests (proactive) + RUM (reactive)
  4. INJECT failures with chaos engineering (build resilience)
  5. ANALYZE production data to improve pre-production testing
Quick Shift-Right Techniques:
  • Feature flags → Control who sees what, instant rollback
  • Canary deployment → 5% traffic, compare error rates
  • Synthetic monitoring → Simulate users 24/7, catch issues before users do
  • Chaos engineering → Netflix-style failure injection
  • RUM (Real User Monitoring) → Actual user experience data
Critical Success Factors:
  • Production is the ultimate test environment
  • Ship fast with safety nets, not slow with certainty
  • Use production data to improve shift-left testing </default_to_action>

Quick Reference Card

When to Use

  • Progressive feature rollouts
  • Production reliability validation
  • Performance monitoring at scale
  • Learning from real user behavior

Shift-Right Techniques

| Technique | Purpose | When |
|---|---|---|
| Feature Flags | Controlled rollout | Every feature |
| Canary | Compare new vs old | Every deployment |
| Synthetic Monitoring | Proactive detection | 24/7 |
| RUM | Real user metrics | Always on |
| Chaos Engineering | Resilience validation | Regularly |
| A/B Testing | User behavior validation | Feature decisions |

Progressive Rollout Pattern

1% → 10% → 25% → 50% → 100%
↓      ↓      ↓      ↓
Check  Check  Check  Monitor
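
The pattern above can be sketched as a small gate that decides, after each check, whether to advance, roll back, or simply keep monitoring. The stage list comes from the diagram; the function name `nextRolloutStep`, the `health.ok` flag, and the policy of dropping to 0% on a failed check are assumptions made for illustration.

```javascript
// Rollout stages taken from the diagram above.
const STAGES = [1, 10, 25, 50, 100];

// Decide the next traffic percentage after a check at the current stage.
// `health.ok` is a hypothetical boolean produced by your own metric checks.
function nextRolloutStep(currentPercent, health) {
  const index = STAGES.indexOf(currentPercent);
  if (index === -1) {
    throw new Error(`Unknown rollout stage: ${currentPercent}`);
  }
  if (!health.ok) {
    // Assumed policy: a failed check removes all traffic from the new version.
    return { action: 'rollback', percent: 0 };
  }
  if (index === STAGES.length - 1) {
    // Fully rolled out: nothing left to advance to, keep monitoring.
    return { action: 'monitor', percent: 100 };
  }
  return { action: 'advance', percent: STAGES[index + 1] };
}
```

A caller would invoke this once per check interval, for example `nextRolloutStep(10, { ok: true })`, and apply the returned percentage to the feature flag or traffic router.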

Key Metrics to Monitor

| Metric | SLO Target | Alert Threshold |
|---|---|---|
| Error rate | < 0.1% | > 1% |
| p95 latency | < 200ms | > 500ms |
| Availability | 99.9% | < 99.5% |
| Apdex | > 0.95 | < 0.8 |
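
As a rough illustration of how the alert thresholds above might be evaluated, the sketch below computes p95 latency and Apdex from raw latency samples and reports which thresholds are breached. The function names, the input shape, and the Apdex target time of 200 ms are assumptions; the Apdex formula itself is the standard one, (satisfied + tolerating / 2) / total.

```javascript
// Alert thresholds copied from the table above (rates expressed as fractions).
const ALERTS = { errorRate: 0.01, p95LatencyMs: 500, availability: 0.995, apdex: 0.8 };

// Nearest-rank percentile over a list of latencies in milliseconds.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Standard Apdex: satisfied means latency <= T, tolerating means T < latency <= 4T.
// T = 200 ms is an assumed target time.
function apdex(latenciesMs, targetMs = 200) {
  const satisfied = latenciesMs.filter((l) => l <= targetMs).length;
  const tolerating = latenciesMs.filter((l) => l > targetMs && l <= 4 * targetMs).length;
  return (satisfied + tolerating / 2) / latenciesMs.length;
}

// Return the names of the metrics that breach their alert threshold.
function breachedAlerts({ latenciesMs, errors, total, availability }) {
  const breaches = [];
  if (errors / total > ALERTS.errorRate) breaches.push('errorRate');
  if (percentile(latenciesMs, 95) > ALERTS.p95LatencyMs) breaches.push('p95LatencyMs');
  if (availability < ALERTS.availability) breaches.push('availability');
  if (apdex(latenciesMs) < ALERTS.apdex) breaches.push('apdex');
  return breaches;
}
```

An empty result means no alert fires; it does not by itself mean the stricter SLO targets in the middle column are met.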

Feature Flags

javascript
// Progressive rollout in the style of LaunchDarkly/Unleash.
// `featureFlags` is an illustrative wrapper, not either vendor's SDK API.
function Checkout({ user }) {
  const newCheckout = featureFlags.isEnabled('new-checkout', {
    userId: user.id,
    percentage: 10,  // 10% of users
    allowlist: ['beta-testers']
  });

  // Render the new flow only for users inside the rollout
  return newCheckout ? <NewCheckoutFlow /> : <LegacyCheckoutFlow />;
}

// Instant rollback on issues
await featureFlags.disable('new-checkout');
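
To make the percentage and allowlist behavior concrete, here is a minimal in-memory sketch of what a `featureFlags` object like the one above might do internally. Everything in it is an assumption for illustration: the `createFeatureFlags` factory, the `groups` option used to match the allowlist, and the choice of an FNV-1a hash for bucketing. The key idea is that hashing the flag name together with the user ID gives each user a stable bucket, so a user who sees the feature at 10% still sees it at 25%.

```javascript
// FNV-1a 32-bit hash: small, dependency-free, and stable across processes.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hypothetical in-memory flag store; a real system would back this with a flag service.
function createFeatureFlags() {
  const disabled = new Set();
  return {
    isEnabled(flag, { userId, groups = [], percentage = 0, allowlist = [] }) {
      // The kill switch wins over every other rule.
      if (disabled.has(flag)) return false;
      // Users in an allowlisted group always see the feature.
      if (groups.some((group) => allowlist.includes(group))) return true;
      // Stable bucket in 0..99 derived from flag name and user ID.
      const bucket = fnv1a(`${flag}:${userId}`) % 100;
      return bucket < percentage;
    },
    disable(flag) {
      disabled.add(flag);
    },
  };
}
```

Because the bucket depends only on the flag name and user ID, raising `percentage` only ever adds users to the rollout; it never reshuffles who is already in.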

Canary Deployment

yaml
# Flagger canary config
apiVersion: flagger.app/v1beta1
kind: Canary
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-service
  progressDeadlineSeconds: 60
  analysis:
    interval: 1m
    threshold: 5      # Max failed checks
    maxWeight: 50     # Max traffic to canary
    stepWeight: 10    # Increment per interval
    metrics:
      - name: request-success-rate
        threshold: 99
      - name: request-duration
        threshold: 500
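
To show how `threshold`, `maxWeight`, and `stepWeight` interact, the sketch below models the analysis loop in a few lines. It is a simplified model written for this document, not Flagger's implementation: each entry in `checks` stands for one analysis interval and is `true` when every metric passed. The function name `simulateCanary` and the outcome labels are hypothetical.

```javascript
// Values mirror the analysis section of the config above.
const analysis = { threshold: 5, maxWeight: 50, stepWeight: 10 };

// Simplified model of a canary analysis loop; not Flagger's actual code.
function simulateCanary(checks, { threshold, maxWeight, stepWeight }) {
  let weight = 0;
  let failed = 0;
  for (const passed of checks) {
    if (!passed) {
      failed += 1;
      // Too many failed checks: send all traffic back to the stable version.
      if (failed >= threshold) return { outcome: 'rollback', weight: 0 };
      continue; // hold the current weight after a failed check
    }
    weight += stepWeight;
    // Reaching the maximum canary weight ends the analysis with a promotion.
    if (weight >= maxWeight) return { outcome: 'promote', weight: maxWeight };
  }
  return { outcome: 'in-progress', weight };
}
```

With the values above, five consecutive passing intervals move the canary through 10%, 20%, 30%, 40%, and 50% before promotion, while five failed checks at any point trigger a rollback.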

Synthetic Monitoring

javascript
// Continuous production validation
await Task("Synthetic Tests", {
  endpoints: [
    { path: '/health', expected: 200, interval: '30s' },
    { path: '/api/products', expected: 200, interval: '1m' },
    { path: '/checkout', flow: 'full-purchase', interval: '5m' }
  ],
  locations: ['us-east', 'eu-west', 'ap-south'],
  alertOn: {
    statusCode: '!= 200',
    latency: '> 500ms',
    contentMismatch: true
  }
}, "qe-production-intelligence");
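
The alert rules in the configuration above can be read as a pure function from a probe result to a list of alerts. The sketch below shows one way that evaluation might look. The `result` shape (`{ status, latencyMs }`), the function names, and the restriction to rules of the form `'> 500ms'` are assumptions; how the probe is actually executed and scheduled is left out.

```javascript
// Parse a latency limit such as '> 500ms' into a number of milliseconds.
function parseLatencyLimit(rule) {
  const match = /^>\s*(\d+)ms$/.exec(rule);
  if (!match) throw new Error(`Unsupported latency rule: ${rule}`);
  return Number(match[1]);
}

// Compare one probe result against the endpoint's expectation.
// `result` is a hypothetical shape produced by whatever runs the probe.
function evaluateProbe(endpoint, result, alertOn) {
  const alerts = [];
  if (result.status !== endpoint.expected) {
    alerts.push(`${endpoint.path}: status ${result.status}, expected ${endpoint.expected}`);
  }
  const limitMs = parseLatencyLimit(alertOn.latency);
  if (result.latencyMs > limitMs) {
    alerts.push(`${endpoint.path}: latency ${result.latencyMs}ms over ${limitMs}ms limit`);
  }
  return alerts;
}
```

Keeping the evaluation separate from the HTTP call makes the alert logic easy to unit test before it is pointed at production.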

Chaos Engineering

typescript
// Controlled failure injection
await Task("Chaos Experiment", {
  hypothesis: 'System handles database latency gracefully',
  steadyState: {
    metric: 'error_rate',
    expected: '< 0.1%'
  },
  experiment: {
    type: 'network-latency',
    target: 'database',
    delay: '500ms',
    duration: '5m'
  },
  rollback: {
    automatic: true,
    trigger: 'error_rate > 5%'
  }
}, "qe-chaos-engineer");
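
The experiment definition above implies a control flow: confirm steady state, inject the fault, watch the metric, abort if it crosses the rollback trigger, and always remove the fault afterwards. A minimal sketch of that flow follows. The hook names (`inject`, `stop`, `readErrorRate`), the synchronous sampling loop, and the result shape are hypothetical; error rates are percentages, matching the `< 0.1%` and `> 5%` values above.

```javascript
// Run a chaos experiment against injected hooks. All hook names are hypothetical:
// `inject` starts the fault, `stop` removes it, `readErrorRate` returns a percentage.
function runChaosExperiment({ steadyStateMax, abortAbove, samples, inject, stop, readErrorRate }) {
  // Never start an experiment on a system that is already unhealthy.
  if (readErrorRate() >= steadyStateMax) {
    return { status: 'skipped', reason: 'system not in steady state' };
  }
  inject();
  try {
    let worst = 0;
    for (let i = 0; i < samples; i++) {
      const rate = readErrorRate();
      worst = Math.max(worst, rate);
      // Automatic rollback trigger: stop sampling as soon as the limit is crossed.
      if (rate > abortAbove) return { status: 'aborted', observed: rate };
    }
    // The hypothesis holds only if the steady-state bound held during the fault.
    return { status: 'completed', hypothesisHeld: worst < steadyStateMax };
  } finally {
    stop(); // always remove the fault, whether the run completed or aborted
  }
}
```

For the experiment above, `steadyStateMax` would be `0.1` and `abortAbove` would be `5`.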

Production → Pre-Production Feedback Loop

typescript
// Convert production incidents to regression tests
await Task("Incident Replay", {
  incident: {
    id: 'INC-2024-001',
    type: 'performance-degradation',
    conditions: { concurrent_users: 500, cart_items: 10 }
  },
  generateTests: true,
  addToRegression: true
}, "qe-production-intelligence");

// Output: New test added to prevent recurrence
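
As a rough picture of what "convert an incident to a regression test" could produce, the sketch below maps an incident record with the same fields as above to a test descriptor. The function name `incidentToRegressionTest` and the descriptor shape (`name`, `tags`, `setup`, `description`) are hypothetical; a real pipeline would emit whatever format its test runner expects.

```javascript
// Turn an incident record into a regression test descriptor.
// The output shape is hypothetical; adapt it to your test framework.
function incidentToRegressionTest(incident) {
  const conditions = Object.entries(incident.conditions)
    .map(([key, value]) => `${key}=${value}`)
    .join(', ');
  return {
    name: `regression: ${incident.id} (${incident.type})`,
    tags: ['regression', incident.type],
    // Copy the conditions so later edits to the test do not mutate the incident.
    setup: { ...incident.conditions },
    description: `Reproduce ${incident.id} with ${conditions}`,
  };
}
```

Carrying the incident ID in the test name keeps the link between a failing regression test and the production event that motivated it.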

Agent Coordination Hints

Memory Namespace

aqe/shift-right/
├── canary-results/*      - Canary deployment metrics
├── synthetic-tests/*     - Monitoring configurations
├── chaos-experiments/*   - Experiment results
├── production-insights/* - Issues → test conversions
└── rum-analysis/*        - Real user data patterns

Fleet Coordination

typescript
const shiftRightFleet = await FleetManager.coordinate({
  strategy: 'shift-right-testing',
  agents: [
    'qe-production-intelligence',  // RUM, incident replay
    'qe-chaos-engineer',           // Resilience testing
    'qe-performance-tester',       // Synthetic monitoring
    'qe-quality-analyzer'          // Metrics analysis
  ],
  topology: 'mesh'
});

Related Skills

  • shift-left-testing - Pre-production testing
  • chaos-engineering-resilience - Failure injection deep dive
  • performance-testing - Load testing
  • agentic-quality-engineering - Agent coordination

Remember

Production is the ultimate test environment. Feature flags enable instant rollback. Canary releases catch issues before 100% rollout. Synthetic monitoring detects problems before users do. Chaos engineering builds resilience. RUM shows real user experience.
With Agents: Agents monitor production, replay incidents as tests, run chaos experiments, and convert production insights to pre-production tests. Use agents to maintain continuous production quality.