mongodb

MongoDB Agent Skill

A comprehensive guide for working with MongoDB - a document-oriented database platform that provides powerful querying, horizontal scaling, high availability, and enterprise-grade security.

When to Use This Skill

Use this skill when you need to:
  • Design MongoDB schemas and data models
  • Write CRUD operations and complex queries
  • Build aggregation pipelines for data transformation
  • Optimize query performance with indexes
  • Configure replication for high availability
  • Set up sharding for horizontal scaling
  • Implement security (authentication, authorization, encryption)
  • Deploy MongoDB (Atlas, self-managed, Kubernetes)
  • Integrate MongoDB with applications (15+ official drivers)
  • Troubleshoot performance issues or errors
  • Implement Atlas Search or Vector Search
  • Work with time series data or change streams

Documentation Coverage

This skill synthesizes 24,618 documentation links across 172 major MongoDB sections, covering:
  • MongoDB versions 5.0 through 8.1 (upcoming)
  • 15+ official driver languages
  • 50+ integration tools (Kafka, Spark, BI Connector, Kubernetes Operator)
  • Complete deployment spectrum (Atlas cloud, self-managed, Kubernetes)

I. CORE DATABASE OPERATIONS

A. CRUD Operations

Read Operations

javascript
// Find documents
db.collection.find({ status: "active" })
db.collection.findOne({ _id: ObjectId("...") })

// Query operators
db.users.find({ age: { $gte: 18, $lt: 65 } })
db.posts.find({ tags: { $in: ["mongodb", "database"] } })
db.products.find({ price: { $exists: true } })

// Projection (select specific fields)
db.users.find({ status: "active" }, { name: 1, email: 1 })

// Cursor operations (the server applies sort, then skip, then limit, whatever the chaining order)
db.collection.find().sort({ createdAt: -1 }).skip(20).limit(10)
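
The skip and limit values in the cursor example follow from a page number and a page size. A minimal sketch of that arithmetic, assuming 1-based page numbers; `pageToSkipLimit` is a hypothetical helper, not part of any MongoDB API:

```javascript
// Hypothetical helper: convert a 1-based page number into skip/limit values.
function pageToSkipLimit(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return { skip: (page - 1) * pageSize, limit: pageSize };
}

// Page 3 with 10 documents per page gives skip 20 and limit 10.
const { skip, limit } = pageToSkipLimit(3, 10);
console.log(skip, limit); // 20 10
```

Large skip values still make the server walk past the skipped documents, so range-based pagination on an indexed field often scales better for deep pages.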

Write Operations

javascript
// Insert
db.collection.insertOne({ name: "Alice", age: 30 })
db.collection.insertMany([{ name: "Bob" }, { name: "Charlie" }])

// Update
db.users.updateOne(
  { _id: userId },
  { $set: { status: "verified" } }
)
db.users.updateMany(
  { lastLogin: { $lt: cutoffDate } },
  { $set: { status: "inactive" } }
)

// Replace entire document
db.users.replaceOne({ _id: userId }, newUserDoc)

// Delete
db.users.deleteOne({ _id: userId })
db.users.deleteMany({ status: "deleted" })

// Upsert (update or insert if not exists)
db.users.updateOne(
  { email: "user@example.com" },
  { $set: { name: "User", lastSeen: new Date() } },
  { upsert: true }
)

Atomic Operations

javascript
// Increment counter
db.posts.updateOne(
  { _id: postId },
  { $inc: { views: 1 } }
)

// Add to array (if not exists)
db.users.updateOne(
  { _id: userId },
  { $addToSet: { interests: "mongodb" } }
)

// Push to array
db.posts.updateOne(
  { _id: postId },
  { $push: { comments: { author: "Alice", text: "Great!" } } }
)

// Find and modify atomically
db.counters.findAndModify({
  query: { _id: "sequence" },
  update: { $inc: { value: 1 } },
  new: true,
  upsert: true
})

B. Query Operators (100+)

Comparison Operators

javascript
$eq, $ne, $gt, $gte, $lt, $lte
$in, $nin

Logical Operators

javascript
$and, $or, $not, $nor

// Example
db.products.find({
  $and: [
    { price: { $gte: 100 } },
    { stock: { $gt: 0 } }
  ]
})

Array Operators

javascript
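$all, $elemMatch, $size          // query operators
$firstN, $lastN, $maxN, $minN    // aggregation operators (MongoDB 5.2+), used in pipelines rather than as find() query operators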

// Example: Find docs with all tags
db.posts.find({ tags: { $all: ["mongodb", "database"] } })

// Match array element with multiple conditions
db.products.find({
  reviews: {
    $elemMatch: { rating: { $gte: 4 }, verified: true }
  }
})
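
The reason to reach for $elemMatch is that it requires one array element to satisfy every condition, while separate dotted-field conditions may each be satisfied by a different element. A plain-JavaScript model of the two filters, offered as a sketch; the function names and the sample document are illustrative only:

```javascript
// { reviews: { $elemMatch: { rating: { $gte: 4 }, verified: true } } }
// One single element must satisfy both conditions.
function matchesElemMatch(doc) {
  return doc.reviews.some((r) => r.rating >= 4 && r.verified === true);
}

// { "reviews.rating": { $gte: 4 }, "reviews.verified": true }
// Each condition may be satisfied by a different element.
function matchesDottedFields(doc) {
  return (
    doc.reviews.some((r) => r.rating >= 4) &&
    doc.reviews.some((r) => r.verified === true)
  );
}

// Illustrative document: no single review is both highly rated and verified.
const product = {
  reviews: [
    { rating: 5, verified: false },
    { rating: 2, verified: true },
  ],
};

console.log(matchesElemMatch(product));    // false
console.log(matchesDottedFields(product)); // true
```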

Existence & Type

javascript
$exists, $type

// Find documents with optional field
db.users.find({ phoneNumber: { $exists: true } })

// Type checking
db.data.find({ value: { $type: "string" } })

C. Aggregation Pipeline

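The aggregation pipeline passes documents through an ordered list of stages, each transforming the stream for the next. It is MongoDB's main tool for data transformation and analysis.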

Core Pipeline Stages (40+)

javascript
db.orders.aggregate([
  // Stage 1: Filter documents
  { $match: { status: "completed", total: { $gte: 100 } } },

  // Stage 2: Join with customers
  { $lookup: {
    from: "customers",
    localField: "customerId",
    foreignField: "_id",
    as: "customer"
  }},

  // Stage 3: Unwind array
  { $unwind: "$items" },

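  // Stage 4: Group and aggregate
  // (after $unwind each input document is one line item, so orderCount
  // counts items and avgOrderValue is weighted by items per order)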
  { $group: {
    _id: "$items.category",
    totalRevenue: { $sum: "$items.total" },
    orderCount: { $sum: 1 },
    avgOrderValue: { $avg: "$total" }
  }},

  // Stage 5: Sort results
  { $sort: { totalRevenue: -1 } },

  // Stage 6: Limit results
  { $limit: 10 },

  // Stage 7: Reshape output
  { $project: {
    category: "$_id",
    revenue: "$totalRevenue",
    orders: "$orderCount",
    avgValue: { $round: ["$avgOrderValue", 2] },
    _id: 0
  }}
])

Common Pipeline Patterns

Time-Based Aggregation:
javascript
db.events.aggregate([
  { $match: { timestamp: { $gte: startDate, $lt: endDate } } },
  { $group: {
    _id: {
      year: { $year: "$timestamp" },
      month: { $month: "$timestamp" },
      day: { $dayOfMonth: "$timestamp" }
    },
    count: { $sum: 1 }
  }}
])
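
The $year, $month and $dayOfMonth operators evaluate in UTC unless a timezone is supplied, which matters for events near midnight. A small in-memory model of the grouping above, as a sketch; `countByDay` and the sample events are illustrative only:

```javascript
// Group events by UTC calendar day and count them, mirroring the $group stage.
function countByDay(events) {
  const counts = new Map();
  for (const { timestamp } of events) {
    const key = [
      timestamp.getUTCFullYear(),
      timestamp.getUTCMonth() + 1, // $month is 1-based, getUTCMonth() is 0-based
      timestamp.getUTCDate(),
    ].join("-");
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const sampleEvents = [
  { timestamp: new Date("2025-01-01T23:30:00Z") },
  { timestamp: new Date("2025-01-02T00:15:00Z") },
  { timestamp: new Date("2025-01-02T12:00:00Z") },
];
const perDay = countByDay(sampleEvents);
console.log(perDay.get("2025-1-1"), perDay.get("2025-1-2")); // 1 2
```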
Faceted Search (Multiple Aggregations):
javascript
db.products.aggregate([
  { $match: { category: "electronics" } },
  { $facet: {
    priceRanges: [
      { $bucket: {
        groupBy: "$price",
        boundaries: [0, 100, 500, 1000, 5000],
        default: "5000+",
        output: { count: { $sum: 1 } }
      }}
    ],
    topBrands: [
      { $group: { _id: "$brand", count: { $sum: 1 } } },
      { $sort: { count: -1 } },
      { $limit: 5 }
    ],
    avgPrice: [
      { $group: { _id: null, avg: { $avg: "$price" } } }
    ]
  }}
])
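
In $bucket, each pair of adjacent boundaries is a half-open range: the lower bound is included and the upper bound is excluded, and anything outside every range goes to the default bucket. A sketch of that assignment rule in plain JavaScript; `bucketFor` is a hypothetical name:

```javascript
// Model of $bucket assignment: each boundary pair is [lower, upper).
function bucketFor(value, boundaries, defaultBucket) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    if (value >= boundaries[i] && value < boundaries[i + 1]) {
      return boundaries[i]; // $bucket uses the lower bound as the bucket _id
    }
  }
  return defaultBucket;
}

const boundaries = [0, 100, 500, 1000, 5000];
console.log(bucketFor(99.99, boundaries, "5000+")); // 0
console.log(bucketFor(100, boundaries, "5000+"));   // 100
console.log(bucketFor(5000, boundaries, "5000+"));  // 5000+
console.log(bucketFor(-5, boundaries, "5000+"));    // 5000+
```

The last call shows that the default bucket also collects values below the first boundary, so a label such as "5000+" is only accurate when no price is negative or missing.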
Window Functions:
javascript
db.sales.aggregate([
  { $setWindowFields: {
    partitionBy: "$region",
    sortBy: { date: 1 },
    output: {
      runningTotal: { $sum: "$amount", window: { documents: ["unbounded", "current"] } },
      movingAvg: { $avg: "$amount", window: { documents: [-7, 0] } }
    }
  }}
])
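
A documents window of [-7, 0] covers the current document plus up to seven preceding ones, so the moving average above spans up to eight documents. The following sketch models both window outputs for a single partition that is already sorted by date; `windowStats` is a hypothetical name:

```javascript
// Model of the two window outputs for one partition, sorted by date.
function windowStats(amounts) {
  let running = 0;
  return amounts.map((amount, i) => {
    running += amount; // documents: ["unbounded", "current"]
    const start = Math.max(0, i - 7); // documents: [-7, 0]
    const win = amounts.slice(start, i + 1);
    const movingAvg = win.reduce((a, b) => a + b, 0) / win.length;
    return { amount, runningTotal: running, movingAvg };
  });
}

const rows = windowStats([10, 20, 30]);
console.log(rows[2]); // { amount: 30, runningTotal: 60, movingAvg: 20 }
```

If the intent is a seven-document average, a window of [-6, 0] would be the matching bounds.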

Aggregation Operators (150+)

Math Operators:
javascript
$add, $subtract, $multiply, $divide, $mod
$abs, $ceil, $floor, $round, $sqrt, $pow
$log, $log10, $ln, $exp
String Operators:
javascript
$concat, $substr, $toLower, $toUpper
$trim, $ltrim, $rtrim, $split
$regexMatch, $regexFind, $regexFindAll
Array Operators:
javascript
$arrayElemAt, $slice, $first, $last, $reverse
$sortArray, $filter, $map, $reduce
$zip, $concatArrays
Date/Time Operators:
javascript
$dateAdd, $dateDiff, $dateFromString, $dateToString
$dayOfMonth, $month, $year, $dayOfWeek
$week, $hour, $minute, $second
Type Conversion:
javascript
$toInt, $toString, $toDate, $toDouble
$toDecimal, $toObjectId, $toBool

II. INDEXING & PERFORMANCE

A. Index Types

Single Field Index

javascript
db.users.createIndex({ email: 1 })  // ascending
db.posts.createIndex({ createdAt: -1 })  // descending

Compound Index

javascript
// Order matters! Index on { status: 1, createdAt: -1 }
db.orders.createIndex({ status: 1, createdAt: -1 })

// Supports queries on:
// - { status: "..." }
// - { status: "...", createdAt: ... }
// Does NOT efficiently support: { createdAt: ... } alone
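
The prefix rule in the comments above can be modelled by counting how many leading index keys a query constrains. This is a deliberately simplified sketch: it ignores sort order, the difference between equality and range predicates, and everything else the query planner weighs. `usablePrefixLength` is a hypothetical name:

```javascript
// Count how many leading index keys are constrained by the query's fields.
// A result of 0 means the query does not match any prefix of the index.
function usablePrefixLength(indexKeys, queryFields) {
  const fields = new Set(queryFields);
  let n = 0;
  for (const key of indexKeys) {
    if (!fields.has(key)) break;
    n++;
  }
  return n;
}

const indexKeys = ["status", "createdAt"];
console.log(usablePrefixLength(indexKeys, ["status"]));              // 1
console.log(usablePrefixLength(indexKeys, ["status", "createdAt"])); // 2
console.log(usablePrefixLength(indexKeys, ["createdAt"]));           // 0
```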

Text Index (Full-Text Search)

javascript
db.articles.createIndex({ title: "text", body: "text" })

// Search
db.articles.find({ $text: { $search: "mongodb database" } })

// With relevance score
db.articles.find(
  { $text: { $search: "mongodb" } },
  { score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } })

Geospatial Indexes

javascript
// 2dsphere for earth-like geometry
db.places.createIndex({ location: "2dsphere" })

// Find nearby
db.places.find({
  location: {
    $near: {
      $geometry: { type: "Point", coordinates: [lon, lat] },
      $maxDistance: 5000  // meters
    }
  }
})

Wildcard Index

javascript
// Index all fields in subdocuments
db.products.createIndex({ "attributes.$**": 1 })

// Supports queries on any field under attributes
db.products.find({ "attributes.color": "red" })

Partial Index

javascript
// Index only documents matching filter
db.orders.createIndex(
  { customerId: 1 },
  { partialFilterExpression: { status: "active" } }
)

TTL Index (Auto-delete)

javascript
// Delete documents 24 hours after createdAt
db.sessions.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 86400 }
)
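
A document becomes eligible for deletion once the indexed date plus expireAfterSeconds has passed. A minimal sketch of that calculation; `ttlExpiry` is a hypothetical helper:

```javascript
// Earliest time at which a document becomes eligible for TTL deletion.
function ttlExpiry(createdAt, expireAfterSeconds) {
  return new Date(createdAt.getTime() + expireAfterSeconds * 1000);
}

const createdAt = new Date("2025-01-01T00:00:00Z");
console.log(ttlExpiry(createdAt, 86400).toISOString()); // 2025-01-02T00:00:00.000Z
```

Actual removal can lag behind this time, because the background TTL task runs periodically (every 60 seconds) rather than at the exact moment of expiry. The indexed field must hold a date value for the document to expire at all.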

Hashed Index (for sharding)

javascript
db.users.createIndex({ userId: "hashed" })

B. Query Optimization

Explain Query Plans

javascript
// Basic explain
db.users.find({ email: "user@example.com" }).explain()

// Execution stats (shows actual performance)
db.users.find({ age: { $gte: 18 } }).explain("executionStats")

// Key metrics to check:
// - executionTimeMillis
// - totalDocsExamined vs. nReturned (should be close)
// - stage: "IXSCAN" (using index) vs. "COLLSCAN" (full scan - BAD)
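
The three metrics listed above can be pulled out of an explain("executionStats") result programmatically. The sketch below assumes the classic plan shape, where each stage nests its child under inputStage or inputStages; other engine versions may nest plans differently. The function names and the sample object are hypothetical, and the sample is hand-written rather than captured from a server:

```javascript
// Walk a winning plan tree and collect every stage name.
function collectStages(plan, out = []) {
  if (!plan) return out;
  out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// Summarize the key metrics from an explain("executionStats") result.
function summarizeExplain(explain) {
  const stats = explain.executionStats;
  const stages = collectStages(explain.queryPlanner.winningPlan);
  return {
    millis: stats.executionTimeMillis,
    usesCollScan: stages.includes("COLLSCAN"),
    // Documents examined per document returned; values near 1 are the goal.
    examinedPerReturned:
      stats.nReturned === 0 ? null : stats.totalDocsExamined / stats.nReturned,
  };
}

// Hand-written sample shaped like explain output.
const sampleExplain = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } },
  },
  executionStats: { executionTimeMillis: 2, nReturned: 40, totalDocsExamined: 50 },
};
console.log(summarizeExplain(sampleExplain));
```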

Covered Queries

javascript
// Create index
db.users.createIndex({ email: 1, name: 1 })

// Query covered by index (no document fetch needed)
db.users.find(
  { email: "user@example.com" },
  { email: 1, name: 1, _id: 0 }  // project only indexed fields
)

Index Hints

javascript
// Force a specific index (it must already exist on the collection, otherwise the query fails)
db.users.find({ status: "active", city: "NYC" })
  .hint({ status: 1, createdAt: -1 })

Index Management

javascript
// List all indexes
db.collection.getIndexes()

// Drop index
db.collection.dropIndex("indexName")

// Hide index (test before dropping)
db.collection.hideIndex("indexName")
db.collection.unhideIndex("indexName")

// Index stats
db.collection.aggregate([{ $indexStats: {} }])

III. DATA MODELING PATTERNS

A. Relationship Patterns

One-to-One (Embedded)

javascript
// User with single address
{
  _id: ObjectId("..."),
  name: "Alice",
  email: "alice@example.com",
  address: {
    street: "123 Main St",
    city: "NYC",
    zipcode: "10001"
  }
}

One-to-Few (Embedded Array)

javascript
// Blog post with comments (< 100 comments)
{
  _id: ObjectId("..."),
  title: "MongoDB Guide",
  comments: [
    { author: "Bob", text: "Great post!", date: ISODate("...") },
    { author: "Charlie", text: "Thanks!", date: ISODate("...") }
  ]
}

One-to-Many (Referenced)

javascript
// Author collection
{ _id: ObjectId("author1"), name: "Alice" }

// Books collection (many books per author)
{ _id: ObjectId("book1"), title: "Book 1", authorId: ObjectId("author1") }
{ _id: ObjectId("book2"), title: "Book 2", authorId: ObjectId("author1") }

Many-to-Many (Array of References)

javascript
// Users collection
{
  _id: ObjectId("user1"),
  name: "Alice",
  groupIds: [ObjectId("group1"), ObjectId("group2")]
}

// Groups collection
{
  _id: ObjectId("group1"),
  name: "MongoDB Users",
  memberIds: [ObjectId("user1"), ObjectId("user2")]
}

B. Advanced Patterns

Time Series Pattern

javascript
// High-frequency sensor data, bucketed manually: one document holds several readings for a sensor
{
  _id: ObjectId("..."),
  sensorId: "sensor-123",
  timestamp: ISODate("2025-01-01T00:00:00Z"),
  readings: [
    { time: 0, temp: 23.5, humidity: 45 },
    { time: 60, temp: 23.6, humidity: 46 },
    { time: 120, temp: 23.4, humidity: 45 }
  ]
}

// Alternative (MongoDB 5.0+): a native time series collection, which stores one measurement per inserted document
db.createCollection("sensor_data", {
  timeseries: {
    timeField: "timestamp",
    metaField: "sensorId",
    granularity: "minutes"
  }
})

Computed Pattern (Cache Results)

javascript
// User document with pre-computed stats
{
  _id: ObjectId("..."),
  username: "alice",
  stats: {
    postCount: 150,
    followerCount: 2500,
    lastUpdated: ISODate("...")
  }
}

// Update stats periodically or with triggers

Schema Versioning

javascript
// Support schema evolution
{
  _id: ObjectId("..."),
  schemaVersion: 2,
  // v2 fields
  name: { first: "Alice", last: "Smith" },
  // Migration code handles v1 format
}
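
The migration code mentioned in the comment could look roughly like the following. The v1 shape is not given here, so this sketch assumes, purely for illustration, that v1 documents store name as a single string and carry no schemaVersion field; `migrateToV2` is a hypothetical name:

```javascript
// Assumed v1 shape (hypothetical): { name: "Alice Smith" } with no schemaVersion.
function migrateToV2(doc) {
  if (doc.schemaVersion === 2) return doc; // already current
  const [first, ...rest] = String(doc.name).trim().split(/\s+/);
  return {
    ...doc,
    schemaVersion: 2,
    name: { first, last: rest.join(" ") },
  };
}

console.log(migrateToV2({ _id: 1, name: "Alice Smith" }));
```

Running the upgrade lazily when a document is read lets v1 and v2 documents coexist, which avoids a single large migration.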

C. Schema Validation

javascript
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["email", "name"],
      properties: {
        email: {
          bsonType: "string",
          pattern: "^.+@.+$",
          description: "must be a valid email"
        },
        age: {
          bsonType: "int",
          minimum: 0,
          maximum: 120
        },
        status: {
          enum: ["active", "inactive", "pending"]
        }
      }
    }
  },
  validationLevel: "strict",  // or "moderate"
  validationAction: "error"   // or "warn"
})

IV. REPLICATION & HIGH AVAILABILITY

A. Replica Sets

Architecture:
  • Primary: Accepts writes, replicates to secondaries
  • Secondaries: Replicate primary's oplog, can serve reads
  • Arbiter: Votes in elections, holds no data
Configuration:
javascript
rs.initiate({
  _id: "myReplicaSet",
  members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017" },
    { _id: 2, host: "mongo3:27017" }
  ]
})

// Check status
rs.status()

// Add member
rs.add("mongo4:27017")

// Remove member
rs.remove("mongo4:27017")

B. Write Concern

Controls acknowledgment of write operations:
javascript
// Wait for majority acknowledgment (durable)
db.users.insertOne(
  { name: "Alice" },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
)

// Common levels:
// w: 1 - primary acknowledges (the implicit default before MongoDB 5.0)
// w: "majority" - majority of nodes acknowledge (the implicit default since MongoDB 5.0 for most replica sets; recommended for production)
// w: <number> - specific number of nodes
// w: 0 - no acknowledgment (fire and forget)
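
How many acknowledgments "majority" means depends on the replica set's membership. The sketch below assumes the documented rule that the count is the smaller of a majority of all voting members (arbiters included) and the number of data-bearing voting members; `writeMajorityCount` is a hypothetical helper name:

```javascript
// Number of acknowledgments needed for w: "majority".
// votingMembers includes arbiters; dataBearingVoting excludes them.
function writeMajorityCount(votingMembers, dataBearingVoting) {
  const majorityOfVoters = Math.floor(votingMembers / 2) + 1;
  return Math.min(majorityOfVoters, dataBearingVoting);
}

console.log(writeMajorityCount(3, 3)); // 2 (three data-bearing members)
console.log(writeMajorityCount(3, 2)); // 2 (primary, secondary, arbiter)
console.log(writeMajorityCount(5, 5)); // 3
```

In the primary-secondary-arbiter case both data-bearing members must acknowledge, so losing one of them blocks majority writes even though a primary can still be elected.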

C. Read Preference

Controls where reads are served from:
javascript
// Options:
// - primary (default): read from primary only
// - primaryPreferred: primary if available, else secondary
// - secondary: read from secondary only
// - secondaryPreferred: secondary if available, else primary
// - nearest: lowest network latency

db.collection.find().readPref("secondaryPreferred")

D. Transactions

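Multi-document ACID transactions (these require a replica set or sharded cluster). The snippet assumes a connected MongoClient named client, a collection handle named accounts, and an enclosing async function: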
javascript
const session = client.startSession();
session.startTransaction();

try {
  await accounts.updateOne(
    { _id: fromAccount },
    { $inc: { balance: -amount } },
    { session }
  );

  await accounts.updateOne(
    { _id: toAccount },
    { $inc: { balance: amount } },
    { session }
  );

  await session.commitTransaction();
} catch (error) {
  await session.abortTransaction();
  throw error;
} finally {
  await session.endSession();
}

V. SHARDING & HORIZONTAL SCALING

A. Sharded Cluster Architecture

Components:
  • Shards: Replica sets holding data subsets
  • Config Servers: Store cluster metadata
  • Mongos: Query routers directing operations to shards

B. Shard Key Selection

CRITICAL: Shard key determines data distribution and query performance.
Good Shard Keys:
  • High cardinality (many unique values)
  • Even distribution (no hotspots)
  • Query-aligned (queries include shard key)
javascript
// Enable sharding on database
sh.enableSharding("myDatabase")

// Shard collection with hashed key
sh.shardCollection(
  "myDatabase.users",
  { userId: "hashed" }
)

// Shard with compound key
sh.shardCollection(
  "myDatabase.orders",
  { customerId: 1, orderDate: 1 }
)

C. Zone Sharding

Assign data ranges to specific shards:
javascript
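// Add shards to zones (sh.addShardTag is the older name for this helper)
sh.addShardToZone("shard0", "US-EAST")
sh.addShardToZone("shard1", "US-WEST")

// Assign shard key ranges to zones (older name: sh.addTagRange).
// Assumes the collection is sharded on zipcode; the lower bound is
// inclusive and the upper bound is exclusive.
sh.updateZoneKeyRange(
  "myDatabase.users",
  { zipcode: "00000" },
  { zipcode: "50000" },
  "US-EAST"
)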
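
Zone ranges include their lower bound and exclude their upper bound, so a zipcode equal to a range's upper bound does not belong to that range. A plain-JavaScript model of the lookup, as a sketch; `zoneFor` is a hypothetical name, and the second range is invented for illustration:

```javascript
// Zone ranges include the lower bound and exclude the upper bound.
const zoneRanges = [
  { min: "00000", max: "50000", zone: "US-EAST" },
  // Hypothetical second range, added only for illustration.
  { min: "50000", max: "99999", zone: "US-WEST" },
];

// Return the zone whose range contains the zipcode, or null if none does.
function zoneFor(zipcode, ranges) {
  const hit = ranges.find((r) => zipcode >= r.min && zipcode < r.max);
  return hit ? hit.zone : null;
}

console.log(zoneFor("10001", zoneRanges)); // US-EAST
console.log(zoneFor("50000", zoneRanges)); // US-WEST
console.log(zoneFor("99999", zoneRanges)); // null
```

Documents whose shard key falls outside every range, like the last example, are not pinned to any zone and may be placed on any shard.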

D. Query Routing

javascript
// Targeted query (includes shard key) - fast
db.users.find({ userId: "12345" })

// Scatter-gather (no shard key) - slow
db.users.find({ email: "user@example.com" })

VI. SECURITY

A. Authentication

Methods:
  1. SCRAM (Username/Password) - Default
  2. X.509 Certificates - Mutual TLS
  3. LDAP (Enterprise)
  4. Kerberos (Enterprise)
  5. AWS IAM
  6. OIDC (OpenID Connect)
javascript
// Create admin user
use admin
db.createUser({
  user: "admin",
  pwd: passwordPrompt(),  // mongosh prompts for the password instead of keeping it in the command text
  roles: ["root"]
})

// Create database user
use myDatabase
db.createUser({
  user: "appUser",
  pwd: "password",
  roles: [
    { role: "readWrite", db: "myDatabase" }
  ]
})

B. Role-Based Access Control (RBAC)

Built-in Roles:
  • read, readWrite: Collection-level
  • dbAdmin, dbOwner: Database administration
  • userAdmin: User management
  • clusterAdmin: Cluster management
  • root: Superuser
Custom Roles:
javascript
db.createRole({
  role: "customRole",
  privileges: [
    {
      resource: { db: "myDatabase", collection: "users" },
      actions: ["find", "update"]
    }
  ],
  roles: []
})

C. Encryption

Encryption at Rest

yaml
# Configure in mongod.conf (requires MongoDB Enterprise)
security:
  enableEncryption: true
  encryptionKeyFile: /path/to/keyfile

Encryption in Transit (TLS/SSL)

yaml
# mongod.conf
net:
  tls:
    mode: requireTLS
    certificateKeyFile: /path/to/cert.pem
    CAFile: /path/to/ca.pem

Client-Side Field Level Encryption (CSFLE)

javascript
// Automatic encryption of sensitive fields
const clientEncryption = new ClientEncryption(client, {
  keyVaultNamespace: "encryption.__keyVault",
  kmsProviders: {
    aws: {
      accessKeyId: "...",
      secretAccessKey: "..."
    }
  }
})

// Create data key
const dataKeyId = await clientEncryption.createDataKey("aws", {
  masterKey: { region: "us-east-1", key: "..." }
})

// Configure auto-encryption
const encryptedClient = new MongoClient(uri, {
  autoEncryption: {
    keyVaultNamespace: "encryption.__keyVault",
    kmsProviders: { aws: { /* same AWS credentials as above */ } },
    schemaMap: {
      "myDatabase.users": {
        bsonType: "object",
        properties: {
          ssn: {
            encrypt: {
              keyId: [dataKeyId],
              algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
            }
          }
        }
      }
    }
  }
})

VII. DEPLOYMENT OPTIONS

A. MongoDB Atlas (Cloud)

Recommended for most use cases.
Quick Start:
  1. Create free M0 cluster at mongodb.com/atlas
  2. Add your IP address to the cluster's IP access list
  3. Create database user
  4. Get connection string
Features:
  • Auto-scaling
  • Automated backups
  • Multi-cloud (AWS, Azure, GCP)
  • Multi-region deployments
  • Atlas Search & Vector Search
  • Charts (embedded analytics)
  • Data Federation
  • Serverless instances
Connection:
javascript
const uri = "mongodb+srv://user:pass@cluster.mongodb.net/database?retryWrites=true&w=majority";
const client = new MongoClient(uri);
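
Usernames and passwords that contain characters such as @, : or / must be percent-encoded before they go into the connection string. A minimal sketch of building the URI safely; `buildSrvUri` is a hypothetical helper, and the credentials, host and database are placeholders:

```javascript
// Build an SRV connection string with percent-encoded credentials.
// The values passed in below are placeholders.
function buildSrvUri({ user, password, host, database }) {
  const auth = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb+srv://${auth}@${host}/${database}?retryWrites=true&w=majority`;
}

console.log(
  buildSrvUri({
    user: "appUser",
    password: "p@ss:w/rd",
    host: "cluster.mongodb.net",
    database: "database",
  })
);
// mongodb+srv://appUser:p%40ss%3Aw%2Frd@cluster.mongodb.net/database?retryWrites=true&w=majority
```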

B. Self-Managed

Installation:
bash
# Ubuntu/Debian
# Note: apt-key is deprecated on recent releases; a signed-by keyring is the usual replacement.
wget -qO - https://www.mongodb.org/static/pgp/server-8.0.asc | sudo apt-key add -
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/8.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org

# Start
sudo systemctl start mongod
sudo systemctl enable mongod

Configuration (mongod.conf):
yaml
storage:
  dbPath: /var/lib/mongodb
  # Journaling is always on since MongoDB 6.1; the storage.journal.enabled option was removed

systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true

net:
  port: 27017
  bindIp: 127.0.0.1

security:
  authorization: enabled

replication:
  replSetName: "myReplicaSet"

C. Kubernetes Deployment

MongoDB Kubernetes Operator:
yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongodb-replica-set
spec:
  members: 3
  type: ReplicaSet
  version: "8.0"
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: admin
      db: admin
      passwordSecretRef:
        name: mongodb-admin-password
      roles:
        - name: root
          db: admin
  statefulSet:
    spec:
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi

VIII. INTEGRATION & DRIVERS

A. Official Drivers (15+ Languages)

Node.js

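javascript
const { MongoClient } = require("mongodb");

// Placeholder connection string; replace with your own
const uri = "mongodb://localhost:27017";
const client = new MongoClient(uri);

async function main() {
  try {
    await client.connect();
    const collection = client.db("myDatabase").collection("users");

    // CRUD
    await collection.insertOne({ name: "Alice" });
    const user = await collection.findOne({ name: "Alice" });
    await collection.updateOne({ name: "Alice" }, { $set: { age: 30 } });
    await collection.deleteOne({ name: "Alice" });
  } finally {
    // Always release the connection pool
    await client.close();
  }
}

main().catch(console.error);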

Python (PyMongo)

python
from pymongo import MongoClient

client = MongoClient(uri)
db = client.myDatabase
collection = db.users

# CRUD
collection.insert_one({"name": "Alice"})
user = collection.find_one({"name": "Alice"})
collection.update_one({"name": "Alice"}, {"$set": {"age": 30}})
collection.delete_one({"name": "Alice"})

Java

java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

// Static helpers for the filter and update builders used below
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.set;

MongoClient mongoClient = MongoClients.create(uri);
MongoDatabase database = mongoClient.getDatabase("myDatabase");
MongoCollection<Document> collection = database.getCollection("users");

// Insert
collection.insertOne(new Document("name", "Alice"));

// Find
Document user = collection.find(eq("name", "Alice")).first();

// Update
collection.updateOne(eq("name", "Alice"), set("age", 30));

Go

go
client, _ := mongo.Connect(context.TODO(), options.Client().ApplyURI(uri))
collection := client.Database("myDatabase").Collection("users")

// Insert
collection.InsertOne(context.TODO(), bson.M{"name": "Alice"})

// Find
var user bson.M
collection.FindOne(context.TODO(), bson.M{"name": "Alice"}).Decode(&user)

B. Integration Tools

Kafka Connector

json
{
  "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
  "connection.uri": "mongodb://localhost:27017",
  "database": "myDatabase",
  "collection": "events",
  "topics": "my-topic"
}

Spark Connector

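scala
// Option names follow Spark Connector 10.x, which pairs with format("mongodb")
val df = spark.read
  .format("mongodb")
  .option("connection.uri", "mongodb://localhost:27017")
  .option("database", "myDatabase")
  .option("collection", "myCollection")
  .load()

// The $"age" column syntax requires spark.implicits._ in scope
df.filter($"age" > 18).show()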

BI Connector (SQL Interface)

sql
-- Query MongoDB using SQL
SELECT name, AVG(age) as avg_age
FROM users
WHERE status = 'active'
GROUP BY name;

IX. ADVANCED FEATURES

A. Atlas Search (Full-Text)

Create Search Index:
json
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": {
        "type": "string",
        "analyzer": "lucene.standard"
      },
      "description": {
        "type": "string",
        "analyzer": "lucene.english"
      }
    }
  }
}
Query:
javascript
db.articles.aggregate([
  {
    $search: {
      text: {
        query: "mongodb database",
        path: ["title", "description"],
        fuzzy: { maxEdits: 1 }
      }
    }
  },
  { $limit: 10 },
  { $project: { title: 1, description: 1, score: { $meta: "searchScore" } } }
])

B. Atlas Vector Search

For AI/ML similarity search:
javascript
db.products.aggregate([
  {
    $vectorSearch: {
      index: "vector_index",
      path: "embedding",
      queryVector: [0.123, 0.456, ...],  // 1536 dimensions for OpenAI
      numCandidates: 100,
      limit: 10
    }
  },
  {
    $project: {
      name: 1,
      description: 1,
      score: { $meta: "vectorSearchScore" }
    }
  }
])
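To make the idea of similarity ranking concrete, the sketch below computes cosine similarity in plain JavaScript and ranks a few in-memory documents against a query vector. It is only an illustration of the scoring idea: `$vectorSearch` runs an approximate nearest-neighbor search on the server and reports a normalized score, so the raw values here are not expected to match `vectorSearchScore`. The function names and three-dimensional sample vectors are assumptions made for the example.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank hypothetical in-memory documents against a query vector (exact search, not ANN).
function rankBySimilarity(queryVector, docs, limit) {
  return docs
    .map((doc) => ({ name: doc.name, score: cosineSimilarity(queryVector, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

// Tiny 3-dimensional sample vectors; real embeddings have many more dimensions.
const sampleDocs = [
  { name: "laptop", embedding: [1, 0, 0] },
  { name: "tablet", embedding: [0.8, 0.6, 0] },
  { name: "banana", embedding: [0, 0, 1] },
];

const ranked = rankBySimilarity([1, 0, 0], sampleDocs, 2);
console.log(ranked);
```

In this sample, "laptop" ranks first and "tablet" second, while "banana" is orthogonal to the query and falls outside the limit.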

C. Change Streams (Real-Time)

javascript
const changeStream = collection.watch(
  [{ $match: { "fullDocument.status": "active" } }],
  { fullDocument: "updateLookup" }  // also attach the current document to update events
);

changeStream.on("change", (change) => {
  console.log("Change detected:", change);
  // change.operationType: "insert", "update", "delete", "replace"
  // change.fullDocument: entire document (on updates only with the updateLookup option above)
});

// Resume from specific point
const resumeToken = changeStream.resumeToken;
const newStream = collection.watch([], { resumeAfter: resumeToken });
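A common use of change events is keeping a local view in sync with a collection. The following is a minimal sketch of that idea, assuming each event carries `documentKey._id` and, for insert, update, and replace events, a `fullDocument`. The `applyChange` function and the sample events are hypothetical and run without a database.

```javascript
// Applies one change event to an in-memory Map keyed by the document _id.
// Assumes events carry documentKey._id and, for insert/update/replace, fullDocument.
function applyChange(cache, change) {
  const id = String(change.documentKey._id);
  switch (change.operationType) {
    case "insert":
    case "replace":
    case "update":
      if (change.fullDocument) {
        cache.set(id, change.fullDocument);
      }
      break;
    case "delete":
      cache.delete(id);
      break;
    default:
      // Other event types (drop, invalidate, ...) are ignored in this sketch.
      break;
  }
  return cache;
}

// Hand-written sample events standing in for what a change stream would emit.
const cache = new Map();
const sampleEvents = [
  { operationType: "insert", documentKey: { _id: 1 }, fullDocument: { _id: 1, status: "active" } },
  { operationType: "update", documentKey: { _id: 1 }, fullDocument: { _id: 1, status: "paused" } },
  { operationType: "insert", documentKey: { _id: 2 }, fullDocument: { _id: 2, status: "active" } },
  { operationType: "delete", documentKey: { _id: 2 } },
];

for (const event of sampleEvents) {
  applyChange(cache, event);
}

console.log([...cache.entries()]);
```

With a live stream, the same function could be wired in as `changeStream.on("change", (change) => applyChange(cache, change))`.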

D. Bulk Operations

javascript
const bulkOps = [
  { insertOne: { document: { name: "Alice", age: 30 } } },
  { updateOne: {
    filter: { name: "Bob" },
    update: { $set: { age: 25 } },
    upsert: true
  }},
  { deleteOne: { filter: { name: "Charlie" } } }
];

const result = await collection.bulkWrite(bulkOps, { ordered: false });
console.log(`Inserted: ${result.insertedCount}, Updated: ${result.modifiedCount}`);
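For large inputs it can help to split the operations into fixed-size batches before calling `bulkWrite`. The sketch below shows only the batching step; `chunk`, `toInsertOps`, and the batch size of 2 are illustrative assumptions rather than driver features.

```javascript
// Splits an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Wraps plain documents in the insertOne operation shape used by bulkWrite.
function toInsertOps(docs) {
  return docs.map((document) => ({ insertOne: { document } }));
}

// Five sample documents split into batches of two.
const sampleDocuments = Array.from({ length: 5 }, (_, i) => ({ name: `user${i}` }));
const opBatches = chunk(toInsertOps(sampleDocuments), 2);

console.log(opBatches.map((batch) => batch.length));
```

Each batch could then be sent with `await collection.bulkWrite(batch, { ordered: false })` inside a loop.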

X. PERFORMANCE OPTIMIZATION

Best Practices

  1. Index Critical Fields
    • Index fields used in queries, sorts, joins
    • Monitor slow queries (>100ms)
    • Use compound indexes for multi-field queries
  2. Use Projection
    javascript
    // Good: Only return needed fields
    db.users.find({ status: "active" }, { name: 1, email: 1 })
    
    // Bad: Return entire document
    db.users.find({ status: "active" })
  3. Limit Result Sets
    javascript
    db.users.find().limit(100)
  4. Use Aggregation Pipeline
    • Process data server-side instead of client-side
    • Use $match early to filter
    • Use $project to reduce document size
  5. Connection Pooling
    javascript
    const client = new MongoClient(uri, {
      maxPoolSize: 50,
      minPoolSize: 10
    });
  6. Batch Writes
    javascript
    // Good: Batch insert
    await collection.insertMany(documents);
    
    // Bad: Individual inserts
    for (const doc of documents) {
      await collection.insertOne(doc);
    }
  7. Write Concern Tuning
    • Use w: 1 for non-critical writes (faster)
    • Use w: "majority" for critical data (safer)
  8. Read Preference
    • Use secondary for read-heavy analytics
    • Use primary for strong consistency
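The pipeline advice in item 4 (filter first, then trim fields, then cap the result size) can be sketched as a small builder. `buildReportPipeline` and its parameters are hypothetical; the stages it emits are the standard `$match`, `$project`, and `$limit`.

```javascript
// Builds a pipeline that filters first, then trims fields, then caps the result size.
function buildReportPipeline({ filter, fields, limit }) {
  const projection = {};
  for (const field of fields) {
    projection[field] = 1;
  }
  return [
    { $match: filter },       // filter early so later stages see fewer documents
    { $project: projection }, // keep only the fields the caller needs
    { $limit: limit },        // bound the number of returned documents
  ];
}

const reportPipeline = buildReportPipeline({
  filter: { status: "active" },
  fields: ["name", "email"],
  limit: 100,
});

console.log(JSON.stringify(reportPipeline));
```

The returned array can be passed directly to `collection.aggregate(...)`.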

Monitoring

javascript
// Check slow queries
db.setProfilingLevel(1, { slowms: 100 })
db.system.profile.find().sort({ ts: -1 }).limit(10)

// Current operations
db.currentOp()

// Server status
db.serverStatus()

// Collection stats
db.collection.stats()

XI. TROUBLESHOOTING

Common Errors

| Error | Cause | Solution |
| --- | --- | --- |
| MongoNetworkError | Connection failed | Check network, IP access list, credentials |
| E11000 duplicate key | Duplicate unique field | Check unique indexes, handle duplicates |
| ValidationError | Schema validation failed | Check document structure, field types |
| OperationTimeout | Query too slow | Add indexes, optimize query, increase timeout |
| AggregationResultTooLarge | Result > 16MB | Use $limit, $project, or $out |
| InvalidShardKey | Bad shard key | Choose high-cardinality, even-distribution key |
| ChunkTooBig | Jumbo chunk | Use refineShardKey or re-shard |
| OplogTailFailed | Replication lag | Check network, increase oplog size |
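In application code, the first two rows of the table are often handled differently: duplicate keys are reported to the caller, while network failures are retried. The sketch below assumes errors expose a numeric `code` (11000 for duplicate keys) and a `name` of "MongoNetworkError". The category labels, `classifyMongoError`, and `withRetry` are hypothetical helpers, and the retry settings are arbitrary.

```javascript
// Maps an error to a coarse category. Only code 11000 and the MongoNetworkError
// name come from the table above; the category labels are hypothetical.
function classifyMongoError(err) {
  if (err && err.code === 11000) return "duplicate-key";
  if (err && err.name === "MongoNetworkError") return "network";
  return "other";
}

// Retries an async operation on network errors with exponential backoff.
// Makes at most 1 + retries attempts; any other error is rethrown immediately.
async function withRetry(operation, { retries = 3, baseDelayMs = 100 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const retryable = classifyMongoError(err) === "network";
      if (!retryable || attempt >= retries) throw err;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}

// Hand-made error objects standing in for real driver errors.
const duplicateCategory = classifyMongoError({ code: 11000, message: "E11000 duplicate key" });
const networkCategory = classifyMongoError({ name: "MongoNetworkError" });
const otherCategory = classifyMongoError(new Error("something else"));

console.log(duplicateCategory, networkCategory, otherCategory);
```

A call such as `withRetry(() => collection.findOne({ name: "Alice" }))` would then retry only when the failure is classified as a network error.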

Debugging Tools

javascript
// Explain query plan
db.collection.find({ field: value }).explain("executionStats")

// Check index usage
db.collection.aggregate([{ $indexStats: {} }])

// Analyze slow queries
db.setProfilingLevel(2)  // Profile all queries
db.system.profile.find({ millis: { $gt: 100 } })

// Check replication lag
rs.printReplicationInfo()
rs.printSecondaryReplicationInfo()
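Reading `explain("executionStats")` output usually comes down to two questions: did the plan scan the whole collection, and how many documents were examined per document returned? The sketch below summarizes a hand-written sample object. It assumes the classic output shape with `queryPlanner.winningPlan` nested through `inputStage` or `inputStages`; newer server versions may nest the plan differently, so treat `summarizeExplain` as an illustration rather than a complete parser.

```javascript
// Recursively collects stage names from a winning plan
// (assumes stages nest via inputStage / inputStages).
function collectStages(plan, out = []) {
  if (!plan) return out;
  if (plan.stage) out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) {
    collectStages(child, out);
  }
  return out;
}

// Condenses explain output into a few fields that are quick to eyeball.
function summarizeExplain(explain) {
  const stats = explain.executionStats;
  const stages = collectStages(explain.queryPlanner.winningPlan);
  return {
    stages,
    collectionScan: stages.includes("COLLSCAN"),
    docsExaminedPerReturned:
      stats.nReturned === 0 ? null : stats.totalDocsExamined / stats.nReturned,
    executionTimeMillis: stats.executionTimeMillis,
  };
}

// Hand-written sample resembling an indexed query's explain output.
const sampleExplain = {
  queryPlanner: { winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } } },
  executionStats: { nReturned: 10, totalDocsExamined: 10, totalKeysExamined: 10, executionTimeMillis: 2 },
};

const explainSummary = summarizeExplain(sampleExplain);
console.log(explainSummary);
```

A ratio close to 1 with no `COLLSCAN` stage suggests the index is doing the work; a large ratio is a hint that the query examines far more documents than it returns.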

XII. QUICK REFERENCE

Top 20 Operations (by Frequency)

  1. find() - Query documents
  2. updateOne() / updateMany() - Modify documents
  3. insertOne() / insertMany() - Add documents
  4. deleteOne() / deleteMany() - Remove documents
  5. aggregate() - Complex queries
  6. createIndex() - Performance optimization
  7. explain() - Query analysis
  8. findOne() - Get single document
  9. countDocuments() - Count matches
  10. replaceOne() - Replace document
  11. distinct() - Get unique values
  12. bulkWrite() - Batch operations
  13. findAndModify() - Atomic update
  14. watch() - Monitor changes
  15. sort() / limit() / skip() - Result manipulation
  16. $lookup - Join collections
  17. $group - Aggregate data
  18. $match - Filter pipeline
  19. $project - Shape output
  20. hint() - Force index

Common Patterns

Pagination:
javascript
const page = 2;
const pageSize = 20;
db.collection.find()
  .skip((page - 1) * pageSize)
  .limit(pageSize)
Cursor-based Pagination (Better):
javascript
const lastId = ObjectId("...");
db.collection.find({ _id: { $gt: lastId } })
  .sort({ _id: 1 })  // stable order so each page continues right after lastId
  .limit(20)
Atomic Counter:
javascript
db.counters.findAndModify({
  query: { _id: "sequence" },
  update: { $inc: { value: 1 } },
  new: true,
  upsert: true
})
Soft Delete:
javascript
// Mark as deleted
db.users.updateOne({ _id: userId }, { $set: { deleted: true, deletedAt: new Date() } })

// Query active only
db.users.find({ deleted: { $ne: true } })
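The two pagination patterns above differ in what the client has to remember: a page number for offset paging, or the last `_id` seen for cursor paging. The sketch below works through both on an in-memory array. `skipFor`, `nextPageFilter`, and `fetchPage` are hypothetical helpers, and `fetchPage` only imitates `find(filter).sort({ _id: 1 }).limit(n)` for illustration, using numeric `_id` values.

```javascript
// Offset pagination: how many documents to skip for a 1-based page number.
function skipFor(page, pageSize) {
  return (page - 1) * pageSize;
}

// Cursor pagination: filter for the page that follows lastId (null for the first page).
function nextPageFilter(lastId) {
  return lastId == null ? {} : { _id: { $gt: lastId } };
}

// In-memory stand-in for find(filter).sort({ _id: 1 }).limit(pageSize).
function fetchPage(docs, lastId, pageSize) {
  return docs
    .filter((doc) => lastId == null || doc._id > lastId)
    .sort((a, b) => a._id - b._id)
    .slice(0, pageSize);
}

// Unordered sample data with numeric ids.
const allDocs = [5, 3, 1, 4, 2].map((n) => ({ _id: n }));

const firstPage = fetchPage(allDocs, null, 2);
const secondPage = fetchPage(allDocs, firstPage[firstPage.length - 1]._id, 2);

console.log(skipFor(2, 20), nextPageFilter(2), firstPage, secondPage);
```

Cursor paging avoids re-reading the skipped documents on every request, which is why it tends to scale better for deep pages.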

XIII. RESOURCES

Official Documentation

Tools

  • MongoDB Compass - GUI for MongoDB
  • MongoDB Shell (mongosh) - Modern shell
  • Atlas CLI - Automate Atlas operations
  • Database Tools - mongodump, mongorestore, mongoimport

Best Practices Summary

  1. Always use indexes for queried fields
  2. Embedded vs. Referenced: Embed for 1-to-few, reference for 1-to-many
  3. Shard key: High cardinality + even distribution + query-aligned
  4. Security: Enable auth, use TLS, encrypt at rest for production
  5. Replication: Minimum 3 nodes for high availability
  6. Write concern: w: "majority" for critical data
  7. Monitor: Track slow queries, replication lag, disk usage
  8. Test: Use explain() to verify query performance
  9. Connection pooling: Configure appropriate pool size
  10. Schema validation: Define schema for data integrity

XIV. VERSION-SPECIFIC FEATURES

MongoDB 8.0 (Current)

  • Config shard (combined config + shard role)
  • Improved aggregation performance
  • Enhanced security features

MongoDB 7.0

  • Auto-merging chunks
  • Time series improvements
  • Queryable encryption GA

MongoDB 6.0

  • Resharding support
  • Clustered collections
  • Time series collections improvements

MongoDB 5.0

  • Time series collections
  • Live resharding
  • Versioned API

Common Use Cases

E-Commerce

  • Product catalog (embedded attributes)
  • Orders (transactions for consistency)
  • User sessions (TTL indexes for cleanup)
  • Search (Atlas Search for products)

IoT/Time Series

  • Sensor data (time series collections)
  • Real-time analytics (change streams)
  • Retention policies (TTL indexes)

Social Network

  • User profiles (embedded or referenced)
  • Posts & comments (embedded for small, referenced for large)
  • Real-time feeds (change streams)
  • Search (Atlas Search for content)

Analytics

  • Event tracking (high write throughput)
  • Aggregation pipelines (complex analytics)
  • Data federation (query across sources)

When NOT to Use MongoDB

  • Strong consistency prioritized over availability (use a traditional RDBMS)
  • Complex multi-table joins (SQL databases excel here)
  • Extremely small dataset (<1GB) with simple queries
  • ACID transactions that span separate clusters or other database systems (not supported)

This skill provides comprehensive MongoDB knowledge for implementing database solutions, from basic CRUD operations to advanced distributed systems with sharding, replication, and security. Always refer to official documentation for the latest features and version-specific details.