mongodb-schema-design

MongoDB Schema Design

Data modeling patterns and anti-patterns for MongoDB, maintained by MongoDB. Contains 33 rules across 5 categories, prioritized by impact. Bad schema is the root cause of most MongoDB performance and cost issues—queries and indexes cannot fix a fundamentally wrong model.

When to Apply

Reference these guidelines when:
  • Designing a new MongoDB schema from scratch
  • Migrating from SQL/relational databases to MongoDB
  • Reviewing existing data models for performance issues
  • Troubleshooting slow queries or growing document sizes
  • Deciding between embedding and referencing
  • Modeling relationships (one-to-one, one-to-many, many-to-many)
  • Implementing tree/hierarchical structures
  • Seeing Atlas Schema Suggestions or Performance Advisor warnings
  • Hitting the 16MB document limit
  • Adding schema validation to existing collections

Rule Categories by Priority

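| Priority | Category | Impact | Prefix | Rules |
|----------|----------|--------|--------|-------|
| 1 | Schema Anti-Patterns | CRITICAL | antipattern- | 7 |
| 2 | Schema Fundamentals | HIGH | fundamental- | 5 |
| 3 | Relationship Patterns | HIGH | relationship- | 6 |
| 4 | Design Patterns | MEDIUM | pattern- | 12 |
| 5 | Schema Validation | MEDIUM | validation- | 3 |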

Quick Reference

1. Schema Anti-Patterns (CRITICAL) - 7 rules

  • antipattern-unbounded-arrays
    - Never allow arrays to grow without limit
  • antipattern-bloated-documents
    - Keep documents under 16KB for working set
  • antipattern-massive-arrays
    - Arrays over 1000 elements hurt performance
  • antipattern-unnecessary-collections
    - Fewer collections, more embedding
  • antipattern-excessive-lookups
    - Reduce $lookup by denormalizing
  • antipattern-schema-drift
    - Enforce consistent structure across documents
  • antipattern-unnecessary-indexes
    - Audit and remove unused or redundant indexes
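
As a rough illustration of the unbounded-array rule, the sketch below builds an update document that appends to an array while capping its length with `$push`, `$each`, and `$slice`. The helper name `capped_push_update`, the field `recentComments`, and the cap of 50 are assumptions chosen for the example, not names from the rule files.

```python
def capped_push_update(field: str, item: dict, max_items: int) -> dict:
    """Build an update document that appends `item` and keeps only the newest `max_items`."""
    if max_items <= 0:
        raise ValueError("max_items must be positive")
    return {
        "$push": {
            field: {
                "$each": [item],       # $slice is only valid together with $each
                "$slice": -max_items,  # a negative value keeps the last N elements
            }
        }
    }


# Hypothetical usage: keep at most 50 recent comments embedded in a post.
update = capped_push_update("recentComments", {"user": "ana", "text": "Nice post"}, 50)
print(update)
```

The resulting dictionary could be passed as the update argument of an update call in a driver; older comments would then need to live in a separate collection if they must be retained.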

2. Schema Fundamentals (HIGH) - 5 rules

  • fundamental-embed-vs-reference
    - Decision framework for relationships
  • fundamental-data-together
    - Data accessed together stored together
  • fundamental-document-model
    - Embrace documents, avoid SQL patterns
  • fundamental-schema-validation
    - Enforce structure with JSON Schema
  • fundamental-16mb-awareness
    - Design around BSON document limit
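
One way to keep the 16MB limit in view during design is a coarse size check on sample documents. The sketch below is an approximation only: it measures the JSON-encoded length, which is not the same as the BSON size, and the helper names and the 50% warning threshold are assumptions made for illustration.

```python
import json

BSON_LIMIT_BYTES = 16 * 1024 * 1024  # 16MB per-document limit


def approx_size_bytes(doc: dict) -> int:
    """Rough size estimate: JSON-encoded length, NOT the exact BSON size."""
    return len(json.dumps(doc, default=str).encode("utf-8"))


def size_report(doc: dict, warn_ratio: float = 0.5) -> dict:
    """Report the approximate size and flag documents above `warn_ratio` of the limit."""
    size = approx_size_bytes(doc)
    return {
        "approx_bytes": size,
        "ratio": size / BSON_LIMIT_BYTES,
        "warn": size >= warn_ratio * BSON_LIMIT_BYTES,
    }


small = size_report({"name": "ana", "tags": ["a", "b"]})
large = size_report({"blob": "x" * 9_000_000})
print(small["warn"], large["warn"])
```

For exact figures, measure against the real collection instead of relying on this estimate.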

3. Relationship Patterns (HIGH) - 6 rules

  • relationship-one-to-one
    - Embed for simplicity, reference for independence
  • relationship-one-to-few
    - Embed bounded arrays (addresses, phone numbers)
  • relationship-one-to-many
    - Reference for large/unbounded relationships
  • relationship-one-to-squillions
    - Reference massive child sets, store summaries
  • relationship-many-to-many
    - Two-way referencing for bidirectional access
  • relationship-tree-structures
    - Parent/child/materialized path patterns
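
The materialized path option for tree structures can be sketched as follows. Each node stores the comma-delimited path of its ancestors, and descendants are found with an anchored prefix regex. The helper names, the `path` field, and the sample category names are assumptions for illustration; the in-memory filter only mimics what a `$regex` query would match.

```python
import re
from typing import Optional


def child_path(parent_path: Optional[str], parent_id: str) -> str:
    """Path stored on a child: the parent's path plus the parent's id, comma-delimited."""
    return f"{parent_path or ','}{parent_id},"


def descendants_filter(node_path: Optional[str], node_id: str) -> dict:
    """Query filter matching every descendant of the given node via an anchored prefix."""
    prefix = child_path(node_path, node_id)
    return {"path": {"$regex": "^" + re.escape(prefix)}}


# Hypothetical category tree; root nodes have no path.
nodes = [
    {"_id": "Books", "path": None},
    {"_id": "Programming", "path": child_path(None, "Books")},
    {"_id": "Databases", "path": child_path(",Books,", "Programming")},
    {"_id": "Music", "path": None},
]

flt = descendants_filter(None, "Books")
pattern = re.compile(flt["path"]["$regex"])
found = [n["_id"] for n in nodes if n["path"] and pattern.search(n["path"])]
print(found)
```

Because the regex is anchored at the start of the string, an index on `path` can typically support this kind of prefix query.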

4. Design Patterns (MEDIUM) - 12 rules

  • pattern-archive
    - Move historical data to separate storage for performance
  • pattern-attribute
    - Collapse many optional fields into key-value attributes
  • pattern-bucket
    - Group time-series or IoT data into buckets
  • pattern-time-series-collections
    - Use native time series collections when available
  • pattern-extended-reference
    - Cache frequently-accessed related data
  • pattern-subset
    - Store hot data in main doc, cold data elsewhere
  • pattern-computed
    - Pre-calculate expensive aggregations
  • pattern-outlier
    - Handle documents with exceptional array sizes
  • pattern-polymorphic
    - Store heterogeneous documents with a type discriminator
  • pattern-schema-versioning
    - Evolve schemas safely with version fields
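
To make the bucket pattern concrete, here is a minimal sketch that groups individual sensor readings into one document per sensor per hour, with a pre-computed count and sum. The field names (`sensorId`, `bucketStart`, `readings`) and the hourly granularity are assumptions for the example; where native time series collections are available, they may be the better choice, as the list above notes.

```python
from datetime import datetime, timezone


def bucket_readings(readings: list) -> list:
    """Group readings into one bucket document per (sensor, hour)."""
    buckets = {}
    for r in readings:
        # Truncate the timestamp to the start of its hour to form the bucket key.
        hour = r["ts"].replace(minute=0, second=0, microsecond=0)
        key = (r["sensorId"], hour)
        bucket = buckets.setdefault(
            key,
            {"sensorId": r["sensorId"], "bucketStart": hour,
             "count": 0, "sum": 0.0, "readings": []},
        )
        bucket["readings"].append({"ts": r["ts"], "value": r["value"]})
        bucket["count"] += 1
        bucket["sum"] += r["value"]
    return list(buckets.values())


utc = timezone.utc
sample = [
    {"sensorId": "s1", "ts": datetime(2024, 1, 1, 10, 5, tzinfo=utc), "value": 20.0},
    {"sensorId": "s1", "ts": datetime(2024, 1, 1, 10, 40, tzinfo=utc), "value": 22.0},
    {"sensorId": "s1", "ts": datetime(2024, 1, 1, 11, 2, tzinfo=utc), "value": 21.0},
]
buckets = bucket_readings(sample)
print(len(buckets), buckets[0]["count"], buckets[0]["sum"])
```

Three readings collapse into two documents here, and the hourly average can be derived from `sum` and `count` without scanning the embedded readings.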

5. Schema Validation (MEDIUM) - 3 rules

  • validation-json-schema
    - Validate data types and structure at database level
  • validation-action-levels
    - Choose warn vs error mode for validation
  • validation-rollout-strategy
    - Introduce validation safely in production
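
The three validation rules can be combined in a single `collMod` command. The sketch below builds that command as a plain dictionary, starting in warn mode and switching to error mode once existing data is known to conform. The collection name `users`, the schema fields, and the helper name are assumptions for illustration.

```python
def validation_command(collection: str, schema: dict, enforce: bool = False) -> dict:
    """Build a collMod command that attaches a $jsonSchema validator.

    With enforce=False, violations are only logged ("warn"); with
    enforce=True, writes that violate the schema are rejected ("error").
    """
    return {
        "collMod": collection,
        "validator": {"$jsonSchema": schema},
        # "moderate" skips updates to existing documents that are already invalid.
        "validationLevel": "moderate",
        "validationAction": "error" if enforce else "warn",
    }


# Hypothetical schema for a users collection.
user_schema = {
    "bsonType": "object",
    "required": ["email", "createdAt"],
    "properties": {
        "email": {"bsonType": "string"},
        "createdAt": {"bsonType": "date"},
    },
}

rollout_step = validation_command("users", user_schema)
enforced_step = validation_command("users", user_schema, enforce=True)
print(rollout_step["validationAction"], enforced_step["validationAction"])
```

A cautious rollout would run the warn-mode command first, review the logged violations, fix or migrate the offending documents, and only then apply the enforced variant.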

Key Principle

"Data that is accessed together should be stored together."
This is MongoDB's core philosophy. Embedding related data eliminates joins, reduces round trips, and enables atomic updates. Reference only when you must.

Decision Framework

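| Relationship | Cardinality | Access Pattern | Recommendation |
|--------------|-------------|----------------|----------------|
| One-to-One | 1:1 | Always together | Embed |
| One-to-Few | 1:N (N < 100) | Usually together | Embed array |
| One-to-Many | 1:N (N > 100) | Often separate | Reference |
| Many-to-Many | M:N | Varies | Two-way reference |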
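
The decision framework can be read as a small lookup. The sketch below encodes the four rows as a function. It is a simplification: the function name is hypothetical, the access pattern column is ignored, and treating exactly 100 children as "reference" is an assumption, since the table leaves that boundary open.

```python
def recommend(cardinality: str, expected_n: int = 1) -> str:
    """Map a relationship to a starting recommendation from the decision table."""
    if cardinality == "1:1":
        return "embed"
    if cardinality == "1:N":
        # Assumption: the boundary case N == 100 is treated as "reference".
        return "embed array" if expected_n < 100 else "reference"
    if cardinality == "M:N":
        return "two-way reference"
    raise ValueError(f"unknown cardinality: {cardinality!r}")


print(recommend("1:1"))
print(recommend("1:N", expected_n=5))
print(recommend("1:N", expected_n=5000))
print(recommend("M:N"))
```

The result is only a starting point. Real access patterns, such as whether the child data is usually read with its parent, should override the default.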

How to Use

Read individual rule files for detailed explanations and code examples:
rules/antipattern-unbounded-arrays.md
rules/relationship-one-to-many.md
rules/_sections.md
Each rule file contains:
  • Brief explanation of why it matters
  • Incorrect code example with explanation
  • Correct code example with explanation
  • "When NOT to use" exceptions
  • Performance impact and metrics
  • Verification diagnostics

How These Rules Work

Recommendations with Verification

Every rule in this skill provides:
  1. A recommendation based on best practices
  2. A verification checklist of things that should be confirmed
  3. Verification commands you can run before implementing
  4. MCP integration for automatic verification when connected

Why Verification Matters

I analyze code patterns, but I can't see your actual data without a database connection. This means I might suggest:
  • Fixing an "unbounded array" that's actually small and bounded
  • Restructuring a schema that works well for your access patterns
  • Adding validation when documents already conform to the schema
Always verify before implementing. Each rule includes verification commands.
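
As an example of what such a verification might look like, the sketch below builds an aggregation pipeline that measures how large an array field actually gets, which helps confirm whether a suspected "unbounded array" is a real problem. The function name and the `comments` field are assumptions. The pipeline treats a missing field as an empty array but would still fail on documents where the field holds a non-array value.

```python
def array_size_pipeline(field: str) -> list:
    """Aggregation pipeline reporting max and average length of an array field."""
    return [
        # Missing or null fields count as empty arrays so that $size does not fail.
        {"$project": {"n": {"$size": {"$ifNull": [f"${field}", []]}}}},
        {
            "$group": {
                "_id": None,
                "maxSize": {"$max": "$n"},
                "avgSize": {"$avg": "$n"},
                "docs": {"$sum": 1},
            }
        },
    ]


pipeline = array_size_pipeline("comments")
print(pipeline)
```

If `maxSize` stays small across the collection, the array is effectively bounded and restructuring may not be worth the cost.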

MongoDB MCP Integration

For automatic verification, connect the MongoDB MCP Server:
Option 1: Connection String
```json
{
  "mcpServers": {
    "mongodb": {
      "command": "npx",
      "args": ["-y", "mongodb-mcp-server", "--readOnly"],
      "env": {
        "MDB_MCP_CONNECTION_STRING": "mongodb+srv://user:pass@cluster.mongodb.net/mydb"
      }
    }
  }
}
```
Option 2: Local MongoDB
```json
{
  "mcpServers": {
    "mongodb": {
      "command": "npx",
      "args": ["-y", "mongodb-mcp-server", "--readOnly"],
      "env": {
        "MDB_MCP_CONNECTION_STRING": "mongodb://localhost:27017/mydb"
      }
    }
  }
}
```
⚠️ Security: Use --readOnly for safety. Remove only if you need write operations.
When connected, I can automatically:
  • Infer schema via mcp__mongodb__collection-schema
  • Measure document/array sizes via mcp__mongodb__aggregate
  • Check collection statistics via mcp__mongodb__db-stats

⚠️ Action Policy

I will NEVER execute write operations without your explicit approval.
| Operation Type | MCP Tools | Action |
|----------------|-----------|--------|
| Read (Safe) | find, aggregate, collection-schema, db-stats, count | I may run automatically to verify |
| Write (Requires Approval) | update-many, insert-many, create-collection | I will show the command and wait for your "yes" |
| Destructive (Requires Approval) | delete-many, drop-collection, drop-database | I will warn you and require explicit confirmation |
When I recommend schema changes or data modifications:
  1. I'll explain what I want to do and why
  2. I'll show you the exact command
  3. I'll wait for your approval before executing
  4. If you say "go ahead" or "yes", only then will I run it
Your database, your decision. I'm here to advise, not to act unilaterally.
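
The policy above amounts to a simple gate in front of every tool call. The sketch below illustrates the idea, using the tool names from the table. The function names and the choice to block unknown tools are assumptions, and this is not how the MCP server itself enforces anything.

```python
READ_TOOLS = {"find", "aggregate", "collection-schema", "db-stats", "count"}
WRITE_TOOLS = {"update-many", "insert-many", "create-collection"}
DESTRUCTIVE_TOOLS = {"delete-many", "drop-collection", "drop-database"}


def classify(tool: str) -> str:
    """Return the policy category for a tool name."""
    if tool in READ_TOOLS:
        return "read"
    if tool in WRITE_TOOLS:
        return "write"
    if tool in DESTRUCTIVE_TOOLS:
        return "destructive"
    return "unknown"


def may_run(tool: str, approved: bool = False) -> bool:
    """Reads run freely; writes and destructive operations need explicit approval."""
    category = classify(tool)
    if category == "read":
        return True
    if category in ("write", "destructive"):
        return approved
    return False  # Assumption: anything not listed is blocked.


print(may_run("find"), may_run("update-many"), may_run("drop-database", approved=True))
```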

Working Together

If you're not sure about a recommendation:
  1. Run the verification commands I provide
  2. Share the output with me
  3. I'll adjust my recommendation based on your actual data
We're a team—let's get this right together.

Full Compiled Document

For the complete guide with all rules expanded:
AGENTS.md