pr-docx

DOCX Import/Export for Plate Editor

CRITICAL UNDERSTANDING

This skill provides comprehensive guidance for DOCX import/export with absolute requirements that must be followed. Read this entire skill before making any changes to DOCX-related code.

The Non-Negotiable Principle: "NO MATTER WHAT"

"NO MATTER WHAT" is an absolutist requirement. Content is SACRED. Metadata is secondary.

Priority Hierarchy

PRIORITY 1 (REQUIRED): LOCATION
  → We MUST know WHERE the comment/change applies
  → Without location, we cannot place the annotation - ONLY valid skip

PRIORITY 2 (REQUIRED): CONTENT
  → The comment TEXT or changed TEXT must be preserved

PRIORITY 3 (BEST EFFORT): METADATA
  → Author → Use if available, else "imported-unknown"
  → Date → Use if available, else Date.now()

The Golden Rules

| Scenario | Action | Skip? |
|---|---|---|
| Has location, has author, has date | Import fully | NO |
| Has location, NO author | Import with `"imported-unknown"` | NO |
| Has location, NO date | Import with `Date.now()` | NO |
| Has location, NO text (comment) | Import with empty text | NO |
| Tracked change: NO start OR end | Log warning, clean up | YES |
| Comment: NO start and NO end | Log warning, clean up | YES |
| Comment: has start, NO end | Use start as point comment | NO (infer end) |
| Comment: has end, NO start | Use end as point comment | NO (infer start) |

Special: Comments With Partial Markers (Golden Rule)

A comment needs ANY location marker to be preserved. The Golden Rule:

| Scenario | Action |
|---|---|
| Has start, no end | end = start → point comment |
| Has end, no start | start = end → point comment |
| Has neither | Skip (only valid skip) |

```typescript
// Golden rule: if we have ANY location marker, preserve the comment
if (!startTokenRange && !endTokenRange) {
  // Only skip when we have NO markers at all
  if (process.env.NODE_ENV !== "production") {
    console.warn("[DOCX Import] Skipping comment with no location markers:", comment.id);
  }
  continue;
}

// Use whichever marker we have, fallback to the other for point comments
const effectiveStartTokenRange = startTokenRange ?? endTokenRange;
const effectiveEndTokenRange = endTokenRange ?? startTokenRange;
```

In Code Terms

```typescript
// COMMENTS: always import if we have ANY location marker
if (!startTokenRange && !endTokenRange) {
  // NO LOCATION = only valid skip
  console.warn("Skipping - no location:", comment.id);
  continue;
}
// Everything else? IMPORT with defaults
const userId = comment.authorName ?? "imported-unknown";
// Date.parse returns NaN for malformed dates, so fall back in that case too
const parsedDate = comment.date ? Date.parse(comment.date) : NaN;
const date = Number.isNaN(parsedDate) ? Date.now() : parsedDate;
// ... create discussion

// TRACKED CHANGES: same principle
if (!changeRange) {
  console.warn("Skipping - no location:", change.id);
  continue;
}
const parsedChangeDate = change.date ? Date.parse(change.date) : NaN;
const suggestion = {
  userId: change.author ?? "imported-unknown",
  createdAt: Number.isNaN(parsedChangeDate) ? Date.now() : parsedChangeDate,
  // ... rest
};
```

What This Means

  1. Every tracked change MUST be preserved - if we have location
  2. Every comment MUST be preserved - if we have location
  3. Authors do NOT need to exist - use names directly, no lookup
  4. Dates do NOT need to exist - use current timestamp as fallback
  5. No silent failures - log warnings but STILL import with defaults
  6. Round-trip fidelity - Import → Export → Import must preserve content
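
Points 3 to 5 can be folded into one small helper. The following is a minimal sketch, not the project's actual code: `resolveAnnotationMeta`, `RawAnnotationMeta`, and the injected `now` and `warn` parameters are hypothetical names introduced for illustration. It also treats a blank author or an unparseable date the same as a missing one, which is an assumption that goes slightly beyond the `??` checks shown earlier.

```typescript
// Hypothetical shape: the metadata fields a parsed token payload may carry.
interface RawAnnotationMeta {
  id: string;
  author?: string | null;
  date?: string | null;
}

interface ResolvedMeta {
  userId: string;
  createdAt: number;
  usedFallback: boolean;
}

const UNKNOWN_AUTHOR = "imported-unknown";

// Fill in missing or unusable metadata instead of skipping the annotation.
function resolveAnnotationMeta(
  raw: RawAnnotationMeta,
  now: () => number = Date.now,
  warn: (message: string, context: object) => void = (m, c) => console.warn(m, c),
): ResolvedMeta {
  const author = (raw.author ?? "").trim();
  const parsed = raw.date ? Date.parse(raw.date) : NaN;
  const hasAuthor = author.length > 0;
  const hasDate = !isNaN(parsed);

  // No silent fallbacks: report which fields were defaulted.
  if (!hasAuthor || !hasDate) {
    warn("[DOCX Import] Using metadata fallback", { id: raw.id, hasAuthor, hasDate });
  }

  return {
    userId: hasAuthor ? author : UNKNOWN_AUTHOR,
    createdAt: hasDate ? parsed : now(),
    usedFallback: !hasAuthor || !hasDate,
  };
}
```

Passing `now` and `warn` as parameters keeps the helper deterministic in tests; production code would presumably use the defaults or the shared logger.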

CRITICAL: Precision vs. Preservation

┌─────────────────────────────────────────────────────────┐
│  PRESERVATION > PRECISION                               │
│                                                         │
│  Better to import with imperfect metadata               │
│  than lose content for "cleaner" code.                  │
└─────────────────────────────────────────────────────────┘
Any change that increases risk of losing comments/tracked changes for precision MUST:
  1. Have mandatory fallback logging:
    ```typescript
    if (!meetsStrictCriteria(change)) {
      console.warn("[DOCX Import] Precision check failed, using fallback:", {
        id: change.id,
        reason: "...",
        originalData: change,
      });
      importWithDefaults(change);  // STILL IMPORT IT
    }
    ```
  2. Be implemented ONLY after careful research:
    • Review DOCX specification for edge cases
    • Test with Word, LibreOffice, AND Google Docs exports
    • Verify NO content is lost in any scenario
    • Document WHY the precision is needed
  3. Never skip without logging:
    ```typescript
    // ❌ WRONG - Silent skip
    if (!valid) continue;

    // ✅ CORRECT - Fallback with logging
    if (!valid) {
      console.warn("[DOCX Import] Using fallback for:", id);
      importWithDefaults(item);
    }
    ```
Review checklist for precision changes:
  • Has fallback that preserves content?
  • Logs when fallback is used?
  • Tested with malformed documents?
  • Is precision necessary or just "nice to have"?
  • Could this cause silent data loss?

Architecture Overview

                    IMPORT FLOW
┌──────────┐    ┌─────────────┐    ┌────────────────┐    ┌─────────────┐
│  .docx   │───►│ mammoth.js  │───►│ HTML + Tokens  │───►│Plate Editor │
│  file    │    │ body-reader │    │ [[DOCX_*:...]] │    │ Suggestions │
└──────────┘    │ doc-to-html │    └────────────────┘    │ Comments    │
                └─────────────┘                          └─────────────┘

                    EXPORT FLOW
┌─────────────┐    ┌────────────────┐    ┌───────────────┐    ┌──────────┐
│Plate Editor │───►│ Serialize to   │───►│ docx-export   │───►│  .docx   │
│ Suggestions │    │ Word-safe HTML │    │ kit           │    │  file    │
│ Comments    │    │ <ins>/<del>    │    └───────────────┘    └──────────┘
└─────────────┘    │ Word comments  │
                   └────────────────┘

Token System

mammoth.js emits tokens that import-toolbar-button.tsx parses:

CRITICAL: Token Positioning with findHtmlPath().wrap()

Tokens MUST be positioned inline with text for Plate to find them. In mammoth.js, element handlers should use `findHtmlPath(element, htmlPaths.empty).wrap()` to ensure tokens are emitted in the correct position within the document structure:
```javascript
// ✅ CORRECT - Token positioned inline with content
commentRangeStart: function (element, messages, options) {
  return findHtmlPath(element, htmlPaths.empty).wrap(function () {
    var token = DOCX_COMMENT_START_TOKEN_PREFIX + payload + DOCX_COMMENT_TOKEN_SUFFIX;
    return [Html.text(token)];
  });
},

// ❌ WRONG - Token may appear outside paragraph structure
commentRangeStart: function (element, messages, options) {
  var token = DOCX_COMMENT_START_TOKEN_PREFIX + payload + DOCX_COMMENT_TOKEN_SUFFIX;
  return [Html.text(token)];  // No wrap = wrong position
},
```
Why this matters for Plate:
WITHOUT wrap():
  <p>Hello</p>[[DOCX_CMT_START:...]]<p>world</p>
  └─ Token outside paragraph
  └─ After deserialization: token in wrong node or lost
  └─ searchRange() fails → comment not imported

WITH wrap():
  <p>Hello[[DOCX_CMT_START:...]]world</p>
  └─ Token inline with text
  └─ After deserialization: token in same text node as content
  └─ searchRange() succeeds → comment imported correctly
The Flow:
  1. mammoth.js emits a `[[DOCX_CMT_START:{...}]]` token inline with text
  2. `cleanDocx()` + `html.deserialize()` creates Plate nodes
  3. Token text is in the same node as the annotated content
  4. `searchRange()` finds the token boundaries
  5. Comment marks are applied to the correct range
Rule: All token-emitting handlers must use `findHtmlPath().wrap()`
  • `commentRangeStart` → `findHtmlPath(element, htmlPaths.empty).wrap()`
  • `commentRangeEnd` → `findHtmlPath(element, htmlPaths.empty).wrap()`
  • `inserted` → Already wraps children correctly
  • `deleted` → Already wraps children correctly

| Token | Purpose |
|---|---|
| `[[DOCX_INS_START:{...}]]` | Start of insertion (tracked change) |
| `[[DOCX_INS_END:id]]` | End of insertion |
| `[[DOCX_DEL_START:{...}]]` | Start of deletion (tracked change) |
| `[[DOCX_DEL_END:id]]` | End of deletion |
| `[[DOCX_CMT_START:{...}]]` | Start of comment range |
| `[[DOCX_CMT_END:id]]` | End of comment range |

Payload structure (JSON, URL-encoded):
```json
{
  "id": "unique-id",
  "author": "Author Name",
  "date": "2024-01-15T10:30:00Z"
}
```
For comments, the payload carries additional fields and uses `authorName` rather than `author`:
```json
{
  "id": "0",
  "authorName": "John Doe",
  "authorInitials": "JD",
  "date": "2024-01-15T10:30:00Z",
  "text": "Comment content here"
}
```
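
To illustrate the token format, here is a sketch of how tokens might be located and decoded in a string of HTML. `parseDocxTokens` and `ParsedToken` are hypothetical names, and the assumption that END tokens carry a bare, unencoded id is inferred from the token list rather than confirmed against the mammoth.js fork. A START payload that fails to decode still yields a token entry, so the caller can log a warning and clean it up instead of leaving it in the document.

```typescript
type TokenKind = "INS" | "DEL" | "CMT";
type TokenEdge = "START" | "END";

interface ParsedToken {
  kind: TokenKind;
  edge: TokenEdge;
  id: string | null;                        // null when a START payload cannot be decoded
  payload: Record<string, unknown> | null;  // decoded JSON for START tokens, else null
  index: number;                            // offset of the token within the input
  length: number;                           // full token length, brackets included
}

function parseDocxTokens(text: string): ParsedToken[] {
  const pattern = /\[\[DOCX_(INS|DEL|CMT)_(START|END):(.*?)\]\]/g;
  const tokens: ParsedToken[] = [];
  let match: RegExpExecArray | null = pattern.exec(text);

  while (match !== null) {
    const edge = match[2] as TokenEdge;
    const raw = match[3];
    // Assumption: END tokens carry the id as plain text.
    let id: string | null = edge === "END" ? raw : null;
    let payload: Record<string, unknown> | null = null;

    if (edge === "START") {
      try {
        // START payloads are URL-encoded JSON.
        const decoded: unknown = JSON.parse(decodeURIComponent(raw));
        if (typeof decoded === "object" && decoded !== null) {
          payload = decoded as Record<string, unknown>;
          id = payload["id"] === undefined ? null : String(payload["id"]);
        }
      } catch (error) {
        // Malformed payload: keep the token so it can still be logged and cleaned up.
      }
    }

    tokens.push({
      kind: match[1] as TokenKind,
      edge,
      id,
      payload,
      index: match.index,
      length: match[0].length,
    });
    match = pattern.exec(text);
  }
  return tokens;
}
```

Because `encodeURIComponent` escapes `]`, an encoded payload cannot contain the closing `]]`, which is why the non-greedy match is enough here.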

Packages Overview

The codebase uses multiple packages for DOCX handling. Understanding their roles prevents conflicts:

| Package | Location | Purpose | Direction |
|---|---|---|---|
| mammoth.js | `packages/mammoth.js/` | DOCX → HTML conversion | Import |
| html-to-docx | `plugin/docx-export/packages/html-to-docx/` | HTML → DOCX conversion | Export |
| docxjs | `packages/docxjs/` | DOCX preview/rendering | Preview |
| @platejs/docx | `node_modules/@platejs/docx/` | Plate DOCX utilities | Both |

mammoth.js (Custom Fork)

Purpose: Convert DOCX to HTML with embedded tokens for tracked changes and comments.
Key modifications:
  • `lib/docx/body-reader.js` - Parses `w:ins`, `w:del`, `w:commentRangeStart`, `w:commentRangeEnd`
  • `lib/document-to-html.js` - Emits `[[DOCX_*:...]]` tokens
  • `lib/documents.js` - Document model with `inserted`, `deleted`, `commentRangeStart` types
Does NOT support export - only import.

html-to-docx

Purpose: Convert HTML to DOCX format for export.
Current limitations:
  • No `<w:ins>` / `<w:del>` generation (tracked changes)
  • No `comments.xml` generation (Word comments)
  • Basic HTML → Word conversion only
Future enhancement needed: Add tracked changes and comments support for round-trip fidelity.

docxjs (docx-preview)

Purpose: Render DOCX files for preview in browser.
Key options:
```typescript
{
  renderChanges: false,  // Can render tracked changes
  renderComments: false, // Can render comments
  breakPages: true,
  // ...
}
```
Does NOT modify files - read-only preview.

Package Interaction

IMPORT:  .docx ──mammoth.js──► HTML+tokens ──Plate──► Editor
EXPORT:  Editor ──serialize──► HTML ──html-to-docx──► .docx
PREVIEW: .docx ──docxjs──► DOM (read-only)
No conflicts: Each package has a distinct role. Modifications to one don't affect others.

Key Files

Import Pipeline

  • `packages/mammoth.js/lib/docx/body-reader.js` - Parses DOCX XML elements
  • `packages/mammoth.js/lib/document-to-html.js` - Emits tokens for tracked changes/comments
  • `src/components/editor/ui/import-toolbar-button.tsx` - Parses tokens, creates suggestions/comments
  • `src/components/editor/utils/searchRanges.ts` - Finds token boundaries in editor content

Export Pipeline

  • `src/registry/components/editor/plugins/docx-export-kit.tsx` - DOCX blob generation
  • `src/components/editor/ui/docx-export-toolbar-button.tsx` - Export button UI
  • `src/components/editor/ui/export-toolbar-button-fixed.tsx` - Multi-format export

Plate Plugins

  • `src/components/editor/plugins/suggestion-kit-app.tsx` - Suggestion system
  • `src/components/editor/plugins/comment-kit-app.tsx` - Comment/discussion system

Logging

  • `src/lib/logger.ts` - Unified Logfire logger (prod: Logfire only, dev: Logfire + console)

Implementation Rules

Rule 1: Always Handle Orphan Tokens

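```typescript
// Tracked changes need both markers. Comments with a single marker become
// point comments instead (see the partial-marker Golden Rule).
if (!startTokenRange || !endTokenRange) {
  // Never skip silently
  console.warn("[DOCX Import] Orphan token, cleaning up:", change.id);
  // MUST clean up orphan tokens
  if (startTokenRange) editor.tf.delete({ at: startTokenRange });
  if (endTokenRange) editor.tf.delete({ at: endTokenRange });
  continue; // But don't fail the whole import
}
```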

Rule 2: Never Require User Lookup

```typescript
// CORRECT - Use author name directly
const userId = authorName ?? "imported-unknown";

// WRONG - Don't do this
const user = await findUserByEmail(authorEmail);
const userId = user?.id; // ❌ Will fail for external authors
```

Rule 3: Always Use rangeRef for Node Operations

```typescript
// Ranges can become stale after node-splitting operations
const startTokenRef = editor.api.rangeRef(startTokenRange);
const endTokenRef = editor.api.rangeRef(endTokenRange);

// After operations, get current ranges
const currentStart = startTokenRef.current;
const currentEnd = endTokenRef.current;

// Always unref when done
startTokenRef.unref();
endTokenRef.unref();
```

Rule 4: Check for Null Comments in mammoth.js

```javascript
var comment = comments[reference.commentId];
if (!comment) {
  messages.push(results.warning("Comment not found: " + reference.commentId));
  comment = { commentId: reference.commentId, body: [], authorInitials: "" };
}
```

Rule 5: Always Log with Logfire, Never Crash

Use `src/lib/logger.ts`, which wraps Logfire with environment-aware console output.
```typescript
import { logger } from "@/lib/logger";

// For warnings (e.g., skipped items, fallbacks used)
// Uses logfire.warning() under the hood
logger.warning("[DOCX Import] Failed to parse token", {
  rawPayload,
  error: e,
});

// For errors (e.g., failed operations)
// Uses logfire.error() under the hood
logger.error("[DOCX Import] Failed to create comment", e, {
  commentId: comment.id,
  documentId,
});

// For info (e.g., successful operations)
// Uses logfire.info() under the hood
logger.info("[DOCX Import] Import completed", {
  commentsCreated,
  insertions,
  deletions,
});
```
Logfire API Reference:

| Logger Method | Logfire Method | Use Case |
|---|---|---|
| `logger.warning()` | `logfire.warning()` | Recoverable issues, fallbacks used |
| `logger.error()` | `logfire.error()` | Failed operations, exceptions |
| `logger.info()` | `logfire.info()` | Success messages, metrics |

Logging behavior:
  • Production: Logs to Logfire only (no console spam)
  • Development: Logs to both Logfire AND console (for debugging)
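
The production/development split can be pictured with a small wrapper. This is only a sketch under stated assumptions: the real `src/lib/logger.ts` is not shown in this document, so `createLogger`, `LogSink`, and the `isProduction` flag are hypothetical, and the Logfire client is represented by a generic sink rather than its actual API.

```typescript
type Level = "info" | "warning" | "error";
type Attributes = Record<string, unknown>;

// Hypothetical sink interface; a real setup would adapt the Logfire client to it.
interface LogSink {
  log(level: Level, message: string, attributes: Attributes): void;
}

function createLogger(remote: LogSink, devConsole: LogSink, isProduction: boolean) {
  const emit = (level: Level, message: string, attributes: Attributes): void => {
    remote.log(level, message, attributes); // always sent to the remote sink
    if (!isProduction) {
      devConsole.log(level, message, attributes); // console copy in development only
    }
  };

  return {
    info: (message: string, attributes: Attributes = {}) => emit("info", message, attributes),
    warning: (message: string, attributes: Attributes = {}) => emit("warning", message, attributes),
    // Same (message, error, context) call shape as the logger.error example.
    error: (message: string, error: unknown, attributes: Attributes = {}) =>
      emit("error", message, { ...attributes, error }),
  };
}
```

Keeping the environment check in one place means import code never needs its own `NODE_ENV` guards around log calls.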

Plate Mark Structures

Suggestion Marks

```typescript
{
  [KEYS.suggestion]: true,
  [getSuggestionKey(id)]: {
    id: string,
    type: "insert" | "remove",
    userId: string,
    createdAt: number
  }
}
```
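
As a concrete instance of that shape, the sketch below builds the mark object for one imported tracked change. `SUGGESTION_KEY` and `suggestionKeyFor` are local stand-ins for Plate's `KEYS.suggestion` and `getSuggestionKey`; the string values they produce here are assumptions for illustration, not taken from the library, and `buildSuggestionMarks` is a hypothetical helper.

```typescript
// Stand-ins for KEYS.suggestion and getSuggestionKey(id); values are assumed.
const SUGGESTION_KEY = "suggestion";
const suggestionKeyFor = (id: string): string => SUGGESTION_KEY + "_" + id;

interface SuggestionData {
  id: string;
  type: "insert" | "remove";
  userId: string;
  createdAt: number;
}

// Builds the two entries a suggestion leaf needs: the boolean flag and the
// per-suggestion key holding the full data object (not just the id).
function buildSuggestionMarks(data: SuggestionData): Record<string, unknown> {
  const marks: Record<string, unknown> = {};
  marks[SUGGESTION_KEY] = true;
  marks[suggestionKeyFor(data.id)] = { ...data };
  return marks;
}

// Example: an insertion imported without author metadata.
const exampleMarks = buildSuggestionMarks({
  id: "abc",
  type: "insert",
  userId: "imported-unknown",
  createdAt: Date.parse("2024-01-15T10:30:00Z"),
});
```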

Comment Marks

```typescript
{
  [KEYS.comment]: true,
  [getCommentKey(discussionId)]: true,
  [getTransientCommentKey()]: true // During creation
}
```

Common Debugging Scenarios

Tokens Not Being Parsed

  1. Check mammoth.js output in browser console
  2. Verify token prefixes match exactly between files
  3. Check that `cleanDocx()` isn't stripping tokens

Suggestions Not Appearing

  1. Verify `KEYS.suggestion` is set to `true`
  2. Check `getSuggestionKey(id)` contains full object, not just ID
  3. Ensure `type` is exactly `"insert"` or `"remove"`

Comments Not Saved

  1. Check `createDiscussionWithComment` API call
  2. Verify `documentId` is passed correctly
  3. Check tRPC mutation response for errors

Ranges Becoming Stale

  1. Use `rangeRef` before any node-modifying operations
  2. Call `unref()` after operations complete
  3. Re-fetch ranges after `setNodes` with `split: true`

Testing Checklist

When modifying DOCX import/export:
  • Test with Word document containing only insertions
  • Test with Word document containing only deletions
  • Test with Word document containing mixed tracked changes
  • Test with Word document containing single comment
  • Test with Word document containing multiple comments
  • Test with Word document containing both tracked changes AND comments
  • Test with document from different sources (Word, LibreOffice, Google Docs)
  • Test round-trip: Import → Make changes → Export → Import again
  • Verify no tokens are visible in final editor content
  • Verify all authors are attributed correctly

Detailed References

  • Import implementation: See references/import-pipeline.md
  • Export implementation: See references/export-pipeline.md
  • mammoth.js modifications: See references/mammoth-modifications.md
  • Packages overview: See references/packages-overview.md

Emergency Fixes

If tokens appear in editor content:

```typescript
// Force cleanup of all remaining tokens
const tokenPatterns = [
  /\[\[DOCX_INS_START:.*?\]\]/g,
  /\[\[DOCX_INS_END:.*?\]\]/g,
  /\[\[DOCX_DEL_START:.*?\]\]/g,
  /\[\[DOCX_DEL_END:.*?\]\]/g,
  /\[\[DOCX_CMT_START:.*?\]\]/g,
  /\[\[DOCX_CMT_END:.*?\]\]/g,
];
// Search and delete each match
```
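
The "search and delete" step is left open above. One way it might look for plain strings is sketched here; `stripDocxTokens` and the combined `DOCX_TOKEN_PATTERN` are hypothetical, and applying the function to each text node of the editor (rather than to a raw string) is left to the surrounding code.

```typescript
// One pattern covering all six token types listed in the token table.
const DOCX_TOKEN_PATTERN = /\[\[DOCX_(?:INS|DEL|CMT)_(?:START|END):.*?\]\]/g;

// Removes every token from the text and reports how many were found,
// so the caller can log a warning when cleanup was actually needed.
function stripDocxTokens(text: string): { cleaned: string; removed: number } {
  let removed = 0;
  const cleaned = text.replace(DOCX_TOKEN_PATTERN, () => {
    removed += 1;
    return "";
  });
  return { cleaned, removed };
}
```

Returning the count keeps the cleanup from being silent: a non-zero `removed` is a signal that an earlier import step left tokens behind and is worth logging.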

If suggestions aren't displaying:

  1. Check suggestion plugin is configured correctly
  2. Verify `SuggestionLeaf` is rendering
  3. Check browser console for rendering errors