real-time-collaboration-engine

Real-Time Collaboration Engine

Expert in building Google Docs-style collaborative editing with WebSockets, conflict resolution, and presence awareness.

When to Use

Use for:
  • Collaborative text/code editors
  • Shared whiteboards and design tools
  • Multi-user video editing timelines
  • Real-time data dashboards
  • Multiplayer game state sync
NOT for:
  • Simple chat applications (use basic WebSocket)
  • Request-response APIs (use REST/GraphQL)
  • Single-user applications
  • Read-only data streaming (use Server-Sent Events)

Quick Decision Tree

Need real-time collaboration?
├── Text editing? → Operational Transform (OT)
├── JSON data structures? → CRDTs
├── Cursor tracking only? → Simple WebSocket + presence
├── Offline-first? → CRDTs (better offline merge)
└── No conflicts possible? → Basic broadcast

Technology Selection

Conflict Resolution Strategies (2024)

| Strategy | Best For | Complexity | Offline Support |
|---|---|---|---|
| Operational Transform (OT) | Text, ordered sequences | High | Limited |
| CRDTs | JSON objects, sets | Medium | Excellent |
| Last-Write-Wins | Simple state | Low | Basic |
| Three-Way Merge | Git-style editing | High | Good |

Timeline:
  • 2010: Google Wave uses OT
  • 2016: Figma ships multiplayer editing (a CRDT-inspired, server-authoritative design)
  • 2019: Yjs (CRDT library) released
  • 2023: Automerge 2.0 (CRDT library) released
  • 2024: PartyKit simplifies real-time infrastructure
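
Last-Write-Wins, the simplest strategy in the table above, can be sketched in a few lines. This is a minimal illustration, not a library API: the `LwwRegister` shape, the `mergeLww` function, and the client IDs are all hypothetical names chosen for the example.

```typescript
// Hypothetical Last-Write-Wins register; names and shapes are illustrative only.
interface LwwRegister<T> {
  value: T;
  timestamp: number; // logical or wall-clock time of the write
  clientId: string;  // tie-breaker when timestamps are equal
}

function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  // Deterministic tie-break so every replica picks the same winner
  return a.clientId > b.clientId ? a : b;
}

const fromAlice: LwwRegister<string> = { value: 'Draft', timestamp: 10, clientId: 'alice' };
const fromBob: LwwRegister<string> = { value: 'Final', timestamp: 12, clientId: 'bob' };

// The later write wins regardless of the order in which replicas merge
const merged = mergeLww(fromAlice, fromBob);
const mergedReversed = mergeLww(fromBob, fromAlice);
```

The losing write is discarded entirely, which is why the table rates this strategy as suitable only for simple state.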

Common Anti-Patterns

Anti-Pattern 1: Broadcasting Every Keystroke

Novice thinking: "Send every change immediately for real-time feel"
Problem: Network floods with tiny messages, poor performance.
Wrong approach:
typescript
// ❌ Sends message on every keystroke
function Editor() {
  const handleChange = (text: string) => {
    socket.emit('text-change', { text });  // Every keystroke!
  };

  return <textarea onChange={(e) => handleChange(e.target.value)} />;
}
Why wrong: Typing at 100 WPM produces roughly 500 keystrokes per minute (about 8 messages per second per user), and each message carries the entire text.
Correct approach:
typescript
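// ✅ Batches changes and flushes at most every 200ms
// Assumes `socket` is a connected Socket.IO client and `Change` has a `text` field
import { useEffect, useRef } from 'react';
import type { ChangeEvent } from 'react';

function Editor() {
  // A ref keeps the buffer stable across renders, so the interval is created once.
  // Keeping the buffer in state and listing it as an effect dependency would
  // restart the timer on every keystroke.
  const pendingChanges = useRef<Change[]>([]);

  useEffect(() => {
    const interval = setInterval(() => {
      if (pendingChanges.current.length > 0) {
        socket.emit('text-batch', { changes: pendingChanges.current });
        pendingChanges.current = [];
      }
    }, 200);

    return () => clearInterval(interval);
  }, []);

  const handleChange = (e: ChangeEvent<HTMLTextAreaElement>) => {
    // Simplest possible change record; a real editor would record a diff or operation
    pendingChanges.current.push({ text: e.target.value });
  };

  return <textarea onChange={handleChange} />;
}
Impact: At most 5 messages/second per user (down from about 8 at 100 WPM), and bursts of edits collapse into a single message.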
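
The batching idea does not depend on React. The sketch below shows the same buffer-and-flush logic as a plain class, so it can be unit-tested without a timer or a socket. `ChangeBatcher` and its `send` callback are hypothetical names; `send` stands in for a call such as `socket.emit('text-batch', ...)`.

```typescript
// Hypothetical batcher; `send` stands in for socket.emit('text-batch', ...)
class ChangeBatcher<T> {
  private buffer: T[] = [];
  private send: (batch: T[]) => void;

  constructor(send: (batch: T[]) => void) {
    this.send = send;
  }

  add(change: T): void {
    this.buffer.push(change);
  }

  // Intended to be called from a timer, e.g. setInterval(() => batcher.flush(), 200)
  flush(): void {
    if (this.buffer.length === 0) return; // nothing to send
    const batch = this.buffer;
    this.buffer = [];
    this.send(batch);
  }
}

const sent: string[][] = [];
const batcher = new ChangeBatcher<string>((batch) => { sent.push(batch); });
batcher.add('h');
batcher.add('i');
batcher.flush(); // one message carrying two changes
batcher.flush(); // buffer is empty, so nothing is sent
```

Swapping the buffer before calling `send` means changes added during the send land in the next batch rather than being lost.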

Anti-Pattern 2: No Conflict Resolution Strategy

Problem: Concurrent edits cause data loss or corruption.
Symptom: Users see their changes disappear, documents become inconsistent.
Wrong approach:
typescript
// ❌ Last write wins, overwrites concurrent changes
socket.on('text-change', ({ userId, text }) => {
  setDocument(text);  // Loses concurrent edits!
});
Why wrong: If users A and B edit simultaneously, whichever update arrives last overwrites the other.
Correct approach (OT):
typescript
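// ✅ Operational Transform for text
// Assumes the `ot` npm package (ot.js); `pendingOperations` holds local
// operations that the server has not acknowledged yet
import { TextOperation } from 'ot';

socket.on('operation', ({ userId, operation, revision }) => {
  const transformed = transformOperation(operation, pendingOperations);

  applyOperation(transformed);
  incrementRevision();
});

function transformOperation(
  incoming: TextOperation,
  pending: TextOperation[]
): TextOperation {
  // Transform incoming against each pending operation, and rebase each
  // pending operation over the incoming one so it still applies on the server
  let transformed = incoming;
  for (let i = 0; i < pending.length; i++) {
    const [incomingPrime, pendingPrime] = TextOperation.transform(transformed, pending[i]);
    transformed = incomingPrime;
    pending[i] = pendingPrime;
  }
  return transformed;
}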
Correct approach (CRDT):
typescript
// ✅ CRDT for shared text
import * as Y from 'yjs';
import { WebsocketProvider } from 'y-websocket';

const ydoc = new Y.Doc();
const ytext = ydoc.getText('document');

// Automatically handles conflicts
ytext.insert(0, 'Hello');

// Sync with peers
const provider = new WebsocketProvider('ws://localhost:1234', 'room', ydoc);
Impact: Concurrent edits merge correctly, no data loss.
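
To see why CRDT merges never lose updates, it can help to look at the smallest textbook example, a grow-only counter (G-Counter). This is not how Yjs represents text; it is a sketch of the merge property only, and all names (`GCounter`, `increment`, `merge`, `total`) are illustrative.

```typescript
// Minimal grow-only counter (G-Counter), a textbook state-based CRDT.
type GCounter = Record<string, number>; // replicaId -> increments made by that replica

function increment(counter: GCounter, replicaId: string): GCounter {
  return { ...counter, [replicaId]: (counter[replicaId] ?? 0) + 1 };
}

// Merge takes the per-replica maximum, so it is commutative and idempotent
function merge(a: GCounter, b: GCounter): GCounter {
  const result: GCounter = { ...a };
  for (const [id, count] of Object.entries(b)) {
    result[id] = Math.max(result[id] ?? 0, count);
  }
  return result;
}

function total(counter: GCounter): number {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two replicas edit concurrently, then exchange state in either order
let replicaA: GCounter = {};
let replicaB: GCounter = {};
replicaA = increment(increment(replicaA, 'A'), 'A');
replicaB = increment(replicaB, 'B');

const mergedAB = merge(replicaA, replicaB);
const mergedBA = merge(replicaB, replicaA);
```

Both merge orders yield the same state, and merging the same state twice changes nothing, which is what allows peers to sync in any order and retry freely.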

Anti-Pattern 3: Not Handling Disconnections

Problem: User goes offline, loses work or sees stale state.
Wrong approach:
typescript
// ❌ No offline handling
socket.on('disconnect', () => {
  console.log('Disconnected');  // That's it?!
});
Why wrong: Pending changes lost, no reconnection strategy, bad UX.
Correct approach:
typescript
// ✅ Queue changes offline, sync on reconnect
// Plain variables are used so the socket handlers always see the current
// queue; inside a React component, keep these in refs instead of state
let isOnline = socket.connected;
let offlineQueue: Change[] = [];

socket.on('disconnect', () => {
  isOnline = false;
  showToast('Offline - changes will sync when reconnected');
});

socket.on('connect', () => {
  isOnline = true;

  // Send queued changes
  if (offlineQueue.length > 0) {
    socket.emit('sync-offline-changes', { changes: offlineQueue });
    offlineQueue = [];
  }
});

const handleChange = (change: Change) => {
  if (isOnline) {
    socket.emit('change', change);
  } else {
    offlineQueue.push(change);
  }
};
Timeline context:
  • 2015: Offline-first apps rare
  • 2020: PWAs make offline UX standard
  • 2024: Users expect seamless offline editing

Anti-Pattern 4: Client-Only State Sync

Problem: No server authority, clients get out of sync.
Wrong approach:
typescript
// ❌ Clients broadcast to each other directly
socket.on('peer-change', ({ userId, change }) => {
  applyChange(change);  // No validation, no server state
});
Why wrong: Malicious client can send invalid data, no recovery from desync.
Correct approach:
typescript
// ✅ Server is source of truth
// Client
socket.emit('operation', { operation, clientRevision });

socket.on('ack', ({ serverRevision }) => {
  // expectedRevision is the revision the client predicts from the operations it has seen
  if (serverRevision !== expectedRevision) {
    // Desync detected, request full state
    socket.emit('request-full-state');
  }
});

// Server
io.on('connection', (socket) => {
  socket.on('operation', ({ operation, clientRevision }) => {
    // Validate operation
    if (!isValid(operation)) {
      socket.emit('error', { message: 'Invalid operation' });
      return;
    }

    // Apply to server state
    const serverRevision = applyOperation(operation);

    // Acknowledge to the sender so it can compare revisions
    socket.emit('ack', { serverRevision });

    // Broadcast to every other client
    socket.broadcast.emit('operation', { operation, serverRevision });
  });
});
Impact: The server validates every operation, and clients can recover from desync by requesting the full state.
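
The server-side half of this pattern can be sketched without any networking. The class below is a deliberately simplified, hypothetical `DocumentServer` that handles insert-only operations and rejects anything based on a stale revision; a production server would transform stale operations (OT) or merge them (CRDT) rather than reject them.

```typescript
// Hypothetical insert-only operation; real systems also need deletes
interface InsertOperation {
  position: number;
  text: string;
}

class DocumentServer {
  private content = '';
  private revision = 0;

  // Returns the new revision, or null if the operation is rejected
  receive(op: InsertOperation, clientRevision: number): number | null {
    const stale = clientRevision !== this.revision;
    const outOfBounds = op.position < 0 || op.position > this.content.length;
    if (stale || outOfBounds) return null;

    this.content =
      this.content.slice(0, op.position) + op.text + this.content.slice(op.position);
    this.revision += 1;
    return this.revision;
  }

  // Full state that a desynced client can request
  snapshot(): { content: string; revision: number } {
    return { content: this.content, revision: this.revision };
  }
}

const server = new DocumentServer();
const accepted = server.receive({ position: 0, text: 'Hi' }, 0);       // valid
const rejectedBounds = server.receive({ position: 99, text: 'x' }, 1); // out of bounds
const rejectedStale = server.receive({ position: 2, text: '!' }, 0);   // stale revision
const state = server.snapshot();
```

A client that receives a rejection can call the equivalent of `snapshot()` to reset itself, which is the recovery path the `request-full-state` event provides.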

Anti-Pattern 5: No Presence Awareness

Problem: Users can't see who's editing what, causing edit conflicts.
Symptom: Two people editing same section unknowingly.
Wrong approach:
typescript
// ❌ No awareness of other users
function Editor() {
  return <textarea />;  // Flying blind!
}
Correct approach:
typescript
// ✅ Show active users and cursors
import { usePresence } from './usePresence';

function Editor() {
  const { users } = usePresence();

  const handleCursorMove = (position: number) => {
    socket.emit('cursor-move', { userId: myId, position });
  };

  return (
    <div>
      {/* Show who's online */}
      <UserList users={users} />

      {/* Show remote cursors */}
      <EditorWithCursors
        content={content}
        cursors={users.map(u => u.cursor)}
        onCursorMove={handleCursorMove}
      />
    </div>
  );
}
Features:
  • Active user list with avatars
  • Cursor positions color-coded by user
  • Selection ranges highlighted
  • "User X is typing..." indicators
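
One detail behind the features above is deciding who still counts as "active". A common approach, sketched here under assumptions, is to timestamp each presence update and treat users as gone after a timeout. `PresenceStore`, its method names, and the 30-second window are all hypothetical choices for illustration.

```typescript
interface PresenceEntry {
  name: string;
  cursor: number;
  lastSeen: number; // ms timestamp of the last update
}

class PresenceStore {
  private users = new Map<string, PresenceEntry>();
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Record a cursor move or heartbeat from a user
  update(userId: string, name: string, cursor: number, now: number): void {
    this.users.set(userId, { name, cursor, lastSeen: now });
  }

  // IDs of users whose last update falls within the timeout window
  active(now: number): string[] {
    const result: string[] = [];
    this.users.forEach((entry, userId) => {
      if (now - entry.lastSeen <= this.timeoutMs) result.push(userId);
    });
    return result;
  }
}

const presence = new PresenceStore(30000);
presence.update('alice', 'Alice', 12, 0);
presence.update('bob', 'Bob', 40, 25000);

// At t = 40s, Alice's last update is 40s old and Bob's is 15s old
const activeUsers = presence.active(40000);
```

Passing `now` explicitly keeps the store deterministic in tests; in an application it would typically be `Date.now()`.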

Implementation Patterns

Pattern 1: WebSocket Setup with Reconnection

typescript
import { io } from 'socket.io-client';

const socket = io('ws://localhost:3000', {
  reconnection: true,
  reconnectionDelay: 1000,
  reconnectionDelayMax: 5000,
  reconnectionAttempts: Infinity,
  transports: ['websocket', 'polling']  // Fallback
});

socket.on('connect', () => {
  console.log('Connected:', socket.id);
});

socket.on('disconnect', (reason) => {
  if (reason === 'io server disconnect') {
    // Server disconnected, manually reconnect
    socket.connect();
  }
});

socket.on('connect_error', (error) => {
  console.error('Connection error:', error);
});
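
The `reconnectionDelay` and `reconnectionDelayMax` options describe a capped backoff schedule. The functions below are a rough sketch of that kind of schedule, not Socket.IO's actual implementation; `backoffDelay` and `withJitter` are hypothetical helpers, and the doubling factor is an assumption.

```typescript
// Delay before the given retry attempt: doubles each time, capped at maxMs
function backoffDelay(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}

// Spread a delay uniformly over [delay * (1 - factor), delay * (1 + factor)]
// so that many clients do not reconnect at the same instant
function withJitter(
  delayMs: number,
  factor: number,
  random: () => number = Math.random
): number {
  const deviation = delayMs * factor;
  return delayMs - deviation + random() * 2 * deviation;
}

// Schedule for the first five attempts with a 1s base and a 5s cap
const schedule = [0, 1, 2, 3, 4].map((attempt) => backoffDelay(attempt, 1000, 5000));
// → [1000, 2000, 4000, 5000, 5000]
```

Jitter matters most after a server restart, when every client disconnects at once and would otherwise retry in lockstep.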

Pattern 2: Operational Transform (Text)

typescript
// Assumes the `ot` npm package (ot.js)
import { TextOperation } from 'ot';

abstract class OTEditor {
  private revision = 0;
  private pendingOperations: TextOperation[] = [];

  // Editor and transport integration are left to the subclass
  protected abstract applyToEditor(op: TextOperation): void;
  protected abstract sendOperation(op: TextOperation): void;

  applyLocalOperation(op: TextOperation): void {
    // Apply immediately (optimistic update)
    this.applyToEditor(op);

    // Send to server
    this.sendOperation(op);

    // Store as pending
    this.pendingOperations.push(op);
  }

  receiveRemoteOperation(op: TextOperation, serverRevision: number): void {
    // Transform against pending operations, rebasing each pending
    // operation over the remote one along the way
    let transformed = op;
    this.pendingOperations = this.pendingOperations.map((pending) => {
      const [remotePrime, pendingPrime] = TextOperation.transform(transformed, pending);
      transformed = remotePrime;
      return pendingPrime;
    });

    // Apply transformed operation
    this.applyToEditor(transformed);
    this.revision = serverRevision;
  }

  acknowledgeOperation(serverRevision: number): void {
    // Remove acknowledged operation from pending
    this.pendingOperations.shift();
    this.revision = serverRevision;
  }
}
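
What `transform` does can be shown with a much smaller model than ot.js. The sketch below handles insert operations only and uses a client ID to break ties at equal positions; the `Insert` type and both functions are hypothetical and exist only to illustrate convergence.

```typescript
interface Insert {
  position: number;
  text: string;
  clientId: string; // breaks ties when two inserts target the same position
}

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.position) + op.text + doc.slice(op.position);
}

// Rewrite `op` so that it can be applied after `other` has already been applied
function transformInsert(op: Insert, other: Insert): Insert {
  const otherGoesFirst =
    other.position < op.position ||
    (other.position === op.position && other.clientId < op.clientId);
  return otherGoesFirst
    ? { ...op, position: op.position + other.text.length }
    : op;
}

const base = 'Hello';
const a: Insert = { position: 5, text: ' World', clientId: 'a' };
const b: Insert = { position: 0, text: '>> ', clientId: 'b' };

// Each site applies its own operation first, then the transformed remote one
const siteA = applyInsert(applyInsert(base, a), transformInsert(b, a));
const siteB = applyInsert(applyInsert(base, b), transformInsert(a, b));
// Both sites end with '>> Hello World'
```

Without the transform, site B would apply `a` at position 5 of `'>> Hello'` and split the word instead of appending to it.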

Pattern 3: CRDT with Yjs

typescript
import * as Y from 'yjs';
import { WebsocketProvider } from 'y-websocket';

// Create shared document
const ydoc = new Y.Doc();

// Define shared types
const ytext = ydoc.getText('content');
const ymap = ydoc.getMap('metadata');
const yarray = ydoc.getArray('users');

// Connect to sync server
const provider = new WebsocketProvider(
  'ws://localhost:1234',
  'room-name',
  ydoc
);

// Listen to changes
ytext.observe(event => {
  console.log('Text changed:', event.changes);
});

// Make changes (automatically synced)
ytext.insert(0, 'Hello ');
ytext.insert(6, 'World!');

// Undo/redo support
const undoManager = new Y.UndoManager(ytext);
undoManager.undo();
undoManager.redo();

Pattern 4: Presence Awareness

typescript
// `provider` is the WebsocketProvider from Pattern 3; it exposes an Awareness instance
const awareness = provider.awareness;

// Set local state
awareness.setLocalState({
  user: {
    name: 'Alice',
    color: '#ff0000',
    cursor: { line: 10, ch: 5 }
  }
});

// Listen to changes
awareness.on('change', ({ added, updated, removed }) => {
  // Update UI with user cursors/selections
  const states = awareness.getStates();
  states.forEach((state, clientId) => {
    // Skip our own cursor and clients that have not published a user yet
    if (clientId !== awareness.clientID && state.user) {
      renderCursor(state.user.cursor, state.user.color);
    }
  });
});

Pattern 5: Optimistic Updates with Rollback

typescript
class OptimisticEditor {
  private optimisticChanges = new Map<string, Change>();

  async applyChange(change: Change): Promise<void> {
    const changeId = generateId();

    // Apply immediately (optimistic)
    this.applyToUI(change);
    this.optimisticChanges.set(changeId, change);

    try {
      // Send to server
      const result = await this.sendToServer(change);

      // Success - remove from optimistic
      this.optimisticChanges.delete(changeId);

    } catch (error) {
      // Failed - rollback
      this.rollback(changeId);
      this.showError('Could not apply change');
    }
  }

  private rollback(changeId: string): void {
    const change = this.optimisticChanges.get(changeId);
    if (change) {
      this.revertInUI(change);
      this.optimisticChanges.delete(changeId);
    }
  }
}
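
The class above leaves `revertInUI` undefined. One way to implement a revert, sketched here under assumptions, is to record enough in each change to compute its inverse. The `TextChange` shape and the `applyChange` and `invert` helpers are hypothetical.

```typescript
// A replace-style change that remembers what it removed, so it can be undone
interface TextChange {
  position: number;
  removed: string;
  inserted: string;
}

function applyChange(doc: string, c: TextChange): string {
  return doc.slice(0, c.position) + c.inserted + doc.slice(c.position + c.removed.length);
}

// The inverse swaps what was removed with what was inserted
function invert(c: TextChange): TextChange {
  return { position: c.position, removed: c.inserted, inserted: c.removed };
}

const before = 'Hello World';
const change: TextChange = { position: 6, removed: 'World', inserted: 'Team' };

const optimistic = applyChange(before, change);             // 'Hello Team'
const rolledBack = applyChange(optimistic, invert(change)); // 'Hello World'
```

This simple inverse is only safe when no other change has touched the document in between; otherwise the inverse needs to be transformed against the later changes first.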

Production Checklist

□ WebSocket connection with auto-reconnect
□ Offline queue for pending changes
□ Conflict resolution strategy (OT or CRDT)
□ Server authority (clients can't desync)
□ Presence awareness (cursors, active users)
□ Optimistic updates with rollback
□ Change batching (not per-keystroke)
□ Message compression for large payloads
□ Authentication and authorization
□ Rate limiting (prevent spam)
□ Heartbeat/ping-pong to detect dead connections
□ Graceful degradation (falls back to polling if WebSocket fails)
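
For the rate-limiting item in the checklist above, a token bucket per connection is one common choice. The sketch below is a minimal, hypothetical `TokenBucket`; the capacity and refill rate are placeholder values, and time is passed in explicitly to keep it testable.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private capacity: number;
  private refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, now: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so short bursts are allowed
    this.lastRefill = now;
  }

  // Returns true if a message arriving at time `now` (ms) may be processed
  tryConsume(now: number): boolean {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;

    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Burst of 2 messages, refilling 1 token per second
const bucket = new TokenBucket(2, 1, 0);
const results = [
  bucket.tryConsume(0),    // allowed
  bucket.tryConsume(0),    // allowed
  bucket.tryConsume(0),    // rejected: bucket is empty
  bucket.tryConsume(1000), // allowed: one token refilled after 1s
];
```

On a server, one bucket would typically be created per socket on connection and consulted before handling each incoming message.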

When to Use vs Avoid

| Scenario | Strategy |
|---|---|
| Text editing (Google Docs) | ✅ Operational Transform |
| JSON objects (Figma) | ✅ CRDTs (Yjs, Automerge) |
| Simple cursor sharing | ✅ Basic WebSocket + presence |
| Chat messages | ✅ Simple append-only (no OT/CRDT) |
| Video timeline editing | ✅ CRDTs for timeline, OT for text |
| Read-only dashboards | ❌ Use Server-Sent Events instead |

References

  • /references/ot-vs-crdt.md
    - Deep comparison of conflict resolution strategies
  • /references/websocket-scaling.md
    - Scaling to millions of concurrent connections
  • /references/presence-patterns.md
    - Cursor tracking, user awareness, activity indicators

Scripts

  • scripts/collaboration_tester.ts
    - Simulate concurrent edits, test conflict resolution
  • scripts/latency_simulator.ts
    - Test behavior under high latency/packet loss

This skill guides: Real-time collaboration | WebSocket architecture | Operational Transform | CRDTs | Presence awareness | Conflict resolution