evernote-incident-runbook

Evernote Incident Runbook

Overview

Step-by-step procedures for responding to Evernote integration incidents including outages, rate limits, authentication failures, and data issues.

Prerequisites

  • Access to monitoring dashboards
  • Production logs access
  • Evernote API credentials
  • Communication channels for escalation

Incident Classification

| Severity | Impact | Response Time | Example |
|---|---|---|---|
| P1 - Critical | All users affected | 15 min | Complete API outage |
| P2 - High | Major feature broken | 30 min | OAuth failures |
| P3 - Medium | Partial degradation | 2 hours | High error rate |
| P4 - Low | Minor issues | 1 day | Slow response times |
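
The response-time targets above can be turned into a concrete deadline once an incident is detected. The following is a minimal sketch, not part of the original runbook: `RESPONSE_TARGET_MINUTES` and `responseDeadline` are hypothetical names, and the minute values are simply the table's targets converted to minutes.

```javascript
// Response-time targets from the classification table, in minutes.
const RESPONSE_TARGET_MINUTES = { P1: 15, P2: 30, P3: 120, P4: 1440 };

// Hypothetical helper: the time by which a responder should have picked up
// an incident of the given severity detected at `detectedAt` (a Date).
function responseDeadline(severity, detectedAt) {
  const minutes = RESPONSE_TARGET_MINUTES[severity];
  if (minutes === undefined) {
    throw new Error(`Unknown severity: ${severity}`);
  }
  return new Date(detectedAt.getTime() + minutes * 60 * 1000);
}
```

Calling `responseDeadline('P2', new Date())` gives a timestamp 30 minutes out, which could feed a paging or escalation timer.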

Incident Response Procedures

INC-01: Complete API Outage

Symptoms:
  • All Evernote API calls failing
  • 5xx errors from Evernote
  • Connection timeouts
Investigation:

Step 1: Check Evernote service status

Step 2: Check status page


```javascript
// Step 3: Run diagnostic
const dns = require('dns').promises;
const net = require('net');
const https = require('https');

async function diagnoseOutage() {
  const results = {
    timestamp: new Date().toISOString(),
    checks: []
  };

  // DNS resolution
  try {
    const addresses = await dns.resolve4('www.evernote.com');
    results.checks.push({ name: 'DNS', status: 'ok', addresses });
  } catch (error) {
    results.checks.push({ name: 'DNS', status: 'failed', error: error.message });
  }

  // TCP connectivity
  try {
    await new Promise((resolve, reject) => {
      const socket = net.connect(443, 'www.evernote.com');
      socket.setTimeout(5000);
      socket.on('connect', () => { socket.destroy(); resolve(); });
      socket.on('error', reject);
      socket.on('timeout', () => { socket.destroy(); reject(new Error('Timeout')); });
    });
    results.checks.push({ name: 'TCP', status: 'ok' });
  } catch (error) {
    results.checks.push({ name: 'TCP', status: 'failed', error: error.message });
  }

  // HTTPS request
  try {
    const statusCode = await new Promise((resolve, reject) => {
      const req = https.get('https://www.evernote.com/', { timeout: 10000 }, (res) => {
        res.resume(); // discard the body so the socket is released
        resolve(res.statusCode);
      });
      req.on('error', reject);
      // The timeout option only emits 'timeout'; the request must be destroyed explicitly.
      req.on('timeout', () => req.destroy(new Error('Timeout')));
    });
    results.checks.push({ name: 'HTTPS', status: 'ok', statusCode });
  } catch (error) {
    results.checks.push({ name: 'HTTPS', status: 'failed', error: error.message });
  }

  return results;
}
```
Mitigation:
  1. Activate circuit breaker to prevent cascading failures
  2. Enable graceful degradation (show cached data)
  3. Display user-friendly error message
  4. Monitor Evernote status page
```javascript
// Activate graceful degradation
const degradationMode = {
  enabled: true,
  reason: 'Evernote service unavailable',
  startTime: Date.now(),

  shouldServeFromCache: true,
  shouldBlockWrites: true,
  userMessage: 'Note syncing is temporarily unavailable. Your changes will sync when service is restored.'
};

// Apply to endpoints (app is assumed to be the Express application)
app.use((req, res, next) => {
  if (degradationMode.enabled) {
    res.locals.degradationMode = degradationMode;
  }
  next();
});
```
Resolution:
  1. Monitor Evernote status for resolution
  2. Gradually re-enable API calls
  3. Trigger sync for affected users
  4. Document incident timeline
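
The circuit breaker named in the INC-01 mitigation steps is not shown in the runbook. The following is a minimal sketch of one possible shape, under stated assumptions: the class name, the `failureThreshold` and `resetTimeoutMs` defaults, and the injectable `now` clock are all illustrative and not taken from the original. After the reset timeout the breaker lets a trial call through (half-open), which also fits the "gradually re-enable API calls" resolution step.

```javascript
// Minimal circuit breaker sketch. Names and thresholds are illustrative assumptions.
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetTimeoutMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;         // injectable clock, useful for tests
    this.failures = 0;
    this.openedAt = null;   // null means the circuit is closed
  }

  get state() {
    if (this.openedAt === null) return 'closed';
    return this.now() - this.openedAt >= this.resetTimeoutMs ? 'half-open' : 'open';
  }

  async call(fn) {
    if (this.state === 'open') {
      throw new Error('Circuit open: Evernote calls are temporarily blocked');
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.openedAt = null; // a success closes the circuit again
      return result;
    } catch (error) {
      this.failures += 1;
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.openedAt = this.now(); // trip (or re-trip) the breaker
      }
      throw error;
    }
  }
}
```

Each Evernote call would be wrapped as `breaker.call(() => noteStore.getSyncState())`, so that repeated failures stop further calls until the timeout elapses.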


INC-02: Rate Limit Crisis

Symptoms:
  • Frequent RATE_LIMIT_REACHED errors
  • rateLimitDuration > 300 seconds
  • Multiple users affected
Investigation:
```javascript
// Check rate limit metrics
// prometheusQuery is assumed to return an object keyed by the grouping label.
async function analyzeRateLimits() {
  const metrics = await prometheusQuery(`
    sum(increase(evernote_rate_limits_total[1h])) by (api_key)
  `);

  const apiCallRate = await prometheusQuery(`
    sum(rate(evernote_api_calls_total[5m])) by (operation)
  `);

  return {
    rateLimitsLastHour: metrics,
    currentCallRate: apiCallRate,
    suspectedCauses: identifyCauses(apiCallRate)
  };
}

function identifyCauses(callRate) {
  const causes = [];

  // Check for polling abuse
  if (callRate['NoteStore.getSyncState'] > 1) {
    causes.push('Excessive sync state polling');
  }

  // Check for inefficient requests
  if (callRate['NoteStore.getResource'] > 10) {
    causes.push('Individual resource fetching (should batch)');
  }

  return causes;
}
```
Mitigation:
```javascript
// Emergency rate limiting
class EmergencyRateLimiter {
  constructor() {
    this.globalPause = false;
    this.pauseUntil = 0;
    this.resumeTimer = null;
  }

  async activateEmergencyPause(durationSeconds) {
    this.globalPause = true;
    this.pauseUntil = Date.now() + (durationSeconds * 1000);

    // Auto-deactivate. Clear any earlier timer so it cannot end a newer pause early,
    // and schedule before alerting so a failed alert does not skip deactivation.
    clearTimeout(this.resumeTimer);
    this.resumeTimer = setTimeout(() => {
      this.globalPause = false;
      console.info('Emergency rate limit pause deactivated');
    }, durationSeconds * 1000);

    console.warn(`Emergency rate limit pause activated for ${durationSeconds}s`);

    // Notify operations
    await alertOps('Emergency rate limit pause', {
      duration: durationSeconds,
      reason: 'Excessive rate limits detected'
    });
  }

  // Call before every Evernote request; throws while the pause is in effect.
  async checkBeforeRequest() {
    if (this.globalPause) {
      const waitTime = this.pauseUntil - Date.now();
      if (waitTime > 0) {
        throw new Error(`Rate limit emergency: retry in ${Math.ceil(waitTime / 1000)}s`);
      }
    }
  }
}
```
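
Alongside a global pause, individual calls can honor the `rateLimitDuration` value that accompanies a `RATE_LIMIT_REACHED` error. The sketch below is an illustration rather than the runbook's own code: `withRateLimitRetry` and its options are hypothetical names, and it assumes the thrown error exposes `errorCode` (19 for `RATE_LIMIT_REACHED` in the EDAM error enum) and `rateLimitDuration` in seconds.

```javascript
const RATE_LIMIT_REACHED = 19; // EDAMErrorCode.RATE_LIMIT_REACHED

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: runs fn, and if it is rate limited, waits for the
// duration reported by the error before retrying (up to maxRetries times).
async function withRateLimitRetry(fn, { maxRetries = 1, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (error) {
      const limited = error && error.errorCode === RATE_LIMIT_REACHED;
      if (!limited || attempt >= maxRetries) throw error;
      // rateLimitDuration is reported in seconds; fall back to 1s if absent
      await sleep((error.rateLimitDuration || 1) * 1000);
    }
  }
}
```

Keeping `maxRetries` low matters here: with durations above 300 seconds, as in the symptoms for this incident, it is usually better to fail fast and let a queue reschedule the work.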
Resolution:
  1. Identify and fix inefficient API usage
  2. Increase cache TTLs
  3. Implement request coalescing
  4. Request rate limit boost from Evernote
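
Request coalescing, listed in the resolution steps above, means that concurrent identical requests share one in-flight call instead of each hitting the API. A minimal sketch follows, assuming callers can derive a stable string key per request; `RequestCoalescer` and its method names are hypothetical.

```javascript
// Coalesces concurrent calls that share a key into a single in-flight promise.
class RequestCoalescer {
  constructor() {
    this.inFlight = new Map();
  }

  run(key, fn) {
    const existing = this.inFlight.get(key);
    if (existing) return existing; // join the call that is already running

    const promise = Promise.resolve()
      .then(fn)
      .finally(() => this.inFlight.delete(key)); // allow fresh calls afterwards
    this.inFlight.set(key, promise);
    return promise;
  }
}
```

For example, `coalescer.run('syncState:' + userId, () => noteStore.getSyncState())` would let several simultaneous callers for the same user share one `getSyncState` call. This only deduplicates overlapping calls; it is not a cache, so it complements rather than replaces longer cache TTLs.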

INC-03: Authentication Failures

Symptoms:
  • Users receiving auth errors
  • OAuth flow failing
  • Token rejections
Investigation:
```javascript
// Auth diagnostic
const Evernote = require('evernote');

async function diagnoseAuthIssue(userId) {
  const token = await db.tokens.findByUserId(userId);

  const diagnosis = {
    userId,
    hasToken: !!token,
    tokenExpired: token ? (Date.now() > token.expiresAt) : null,
    tokenExpiresIn: token ? Math.floor((token.expiresAt - Date.now()) / 1000 / 60 / 60) + ' hours' : null
  };

  // Test token validity with a lightweight authenticated call
  if (token && !diagnosis.tokenExpired) {
    try {
      const client = new Evernote.Client({ token: token.accessToken, sandbox: false });
      const userStore = client.getUserStore();
      await userStore.getUser();
      diagnosis.tokenValid = true;
    } catch (error) {
      diagnosis.tokenValid = false;
      diagnosis.tokenError = {
        code: error.errorCode,
        parameter: error.parameter
      };
    }
  }

  return diagnosis;
}
```
Common Causes & Fixes:
| Error Code | Cause | Fix |
|---|---|---|
| 8 (INVALID_AUTH) | Token revoked | Re-authenticate user |
| 9 (AUTH_EXPIRED) | Token expired | Re-authenticate user |
| 3 (PERMISSION_DENIED) | Insufficient permissions | Check API key permissions |
Resolution:
```javascript
// Batch re-auth notification
async function notifyUsersToReauth(userIds) {
  for (const userId of userIds) {
    await sendNotification(userId, {
      type: 'REAUTH_REQUIRED',
      message: 'Please reconnect your Evernote account to continue syncing.',
      action: { type: 'REDIRECT', url: '/auth/evernote' }
    });

    await db.tokens.markInvalid(userId);
  }
}
```
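
Re-auth notifications can also be sent before tokens expire, which reduces the number of users who hit an auth error at all. The sketch below is an assumption-laden illustration: `findExpiringTokens` is a hypothetical helper, and it assumes token records carry `userId` and an `expiresAt` timestamp in epoch milliseconds, matching how `expiresAt` is compared with `Date.now()` in the diagnostic.

```javascript
// tokens: array of { userId, expiresAt } with expiresAt in epoch milliseconds.
// Returns the user IDs whose tokens are still valid but expire within
// `withinDays`, soonest first.
function findExpiringTokens(tokens, withinDays = 7, now = Date.now()) {
  const cutoff = now + withinDays * 24 * 60 * 60 * 1000;
  return tokens
    .filter((t) => t.expiresAt > now && t.expiresAt <= cutoff)
    .sort((a, b) => a.expiresAt - b.expiresAt)
    .map((t) => t.userId);
}
```

The resulting list could be passed to a notification routine with a softer message than the one used for tokens that have already failed.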

INC-04: Data Sync Issues

Symptoms:
  • Notes not appearing
  • Sync state stuck
  • Missing changes
Investigation:
```javascript
// Sync state diagnostic
async function diagnoseSyncIssue(userId) {
  const syncState = await db.syncState.findByUserId(userId);
  const client = await getClientForUser(userId);

  const remoteSyncState = await client.noteStore.getSyncState();

  return {
    localUSN: syncState.lastUpdateCount,
    remoteUSN: remoteSyncState.updateCount,
    behind: remoteSyncState.updateCount - syncState.lastUpdateCount,
    lastSyncAt: syncState.lastSyncAt,
    needsSync: remoteSyncState.updateCount > syncState.lastUpdateCount,
    fullSyncRequired: syncState.fullSyncRequired
  };
}

// Force resync
async function forceResync(userId) {
  await db.syncState.update(userId, {
    lastUpdateCount: 0,
    fullSyncRequired: true,
    lastSyncAt: null
  });

  // Queue sync job
  await syncQueue.add('full-sync', { userId, priority: 'high' });

  return { status: 'queued', message: 'Full resync initiated' };
}
```
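
A forced full resync is expensive, so it helps to decide first whether an incremental sync would be enough. The following sketch follows the usual USN comparison and is not taken from the runbook: `chooseSyncMode` is a hypothetical helper, `lastSyncAt` is assumed to be epoch milliseconds (or null), and the remote object is assumed to carry `updateCount` and `fullSyncBefore` as on Evernote's `SyncState`.

```javascript
// local:  { lastUpdateCount, lastSyncAt (epoch ms or null), fullSyncRequired }
// remote: { updateCount, fullSyncBefore (epoch ms) } from getSyncState()
function chooseSyncMode(local, remote) {
  // Never synced, or explicitly flagged: start from scratch.
  if (local.fullSyncRequired || local.lastSyncAt === null) return 'full';
  // The service invalidated incremental state after our last sync.
  if (remote.fullSyncBefore > local.lastSyncAt) return 'full';
  // Update counts match: nothing new on the server.
  if (remote.updateCount === local.lastUpdateCount) return 'none';
  return 'incremental';
}
```

If this returns `'none'` while users still report missing notes, the problem is more likely in local processing than in the sync state itself.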

Incident Communication Templates

Status Page Update

[Investigating] Evernote Integration Issue

Time: [TIMESTAMP]
Status: Investigating
We are currently investigating issues with Evernote note synchronization. Some users may experience delays in note updates.
We will provide updates as we learn more.

User Notification

Temporary Sync Delay

Hi [USER],
We're experiencing a temporary delay in syncing notes with Evernote. Your local changes are saved and will sync automatically when the issue is resolved.
No action is needed on your part.
Expected resolution: Within 2 hours
Thank you for your patience.

Resolution Update

[Resolved] Evernote Integration Issue

Time: [TIMESTAMP] Status: Resolved
The Evernote synchronization issue has been resolved. All notes should now be syncing normally.
Root Cause: [BRIEF DESCRIPTION]
Duration: [X hours Y minutes]
Impact: [NUMBER] users affected
We apologize for any inconvenience.
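
The Duration field in the resolution template can be computed from the incident's start and end timestamps rather than filled in by hand. This is a small sketch with a hypothetical helper name, `formatDuration`, that assumes both timestamps are epoch milliseconds and produces the template's "X hours Y minutes" wording.

```javascript
// Formats the span between two epoch-millisecond timestamps as
// "X hours Y minutes", rounding to the nearest minute.
function formatDuration(startMs, endMs) {
  const totalMinutes = Math.max(0, Math.round((endMs - startMs) / 60000));
  const hours = Math.floor(totalMinutes / 60);
  const minutes = totalMinutes % 60;
  return `${hours} hours ${minutes} minutes`;
}
```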

Post-Incident Checklist

Post-Incident Review

Timeline

  • Document incident timeline
  • Record all actions taken
  • Note what worked/didn't work

Root Cause

  • Identify root cause
  • Determine contributing factors
  • Assess detection time

Prevention

  • Define preventive measures
  • Update monitoring/alerts
  • Improve runbooks

Follow-up

  • Schedule post-mortem meeting
  • Assign action items
  • Update documentation

Output

  • Incident classification guide
  • Step-by-step response procedures
  • Diagnostic scripts
  • Mitigation strategies
  • Communication templates
  • Post-incident checklist

Resources

Next Steps

For data handling best practices, see evernote-data-handling.