masterdata-storage-strategy

Master Data storage strategy

When this skill applies

Use this skill before creating any new Master Data entity or when auditing existing usage. It helps you answer:
  • Is Master Data the right storage for this data, or would Catalog, OMS, VBase, or an external database serve better?
  • How should I design the JSON Schema for performance and security?
  • Which fields should I index (`v-indexed`), and which should I not?
  • Should I enable or disable caching (`v-cache`)?
  • Do I need triggers (`v-triggers`), or is an event-driven IO approach better?
  • How do I plan for capacity and lifecycle of schemas and documents?
Do not use this skill for:
  • VTEX IO app integration patterns (MasterDataClient, `masterdata` builder, CRUD in code): use `vtex-io-masterdata`
  • Performance patterns for IO services (LRU, VBase caching layers): use `vtex-io-application-performance`

Decision rules

When to use Master Data

Master Data is a good fit when all of the following are true:
  1. Document-oriented access: Your data is naturally key-value or document-shaped (JSON documents with variable schemas). You query by indexed fields and retrieve full or partial documents.
  2. Platform-integrated: You benefit from VTEX-native features: `v-triggers` for automated workflows, `v-security` for per-field public access, `v-indexed` for search/filter, and the `masterdata` builder for schema-as-code.
  3. Moderate volume: Your entity will hold thousands to low millions of documents. MD handles this well with proper indexing.
  4. Not on the purchase critical path: MD is not optimized for sub-10ms latency. Synchronous MD reads in checkout/cart/payment flows put conversion at risk whenever MD is slow.
  5. No better native fit: The data doesn't belong in Catalog (product/SKU attributes), OMS (order data), CL/AD (customer profiles/addresses), or VBase (app-specific cache/state).
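The five rules above can be turned into a quick pre-design check. The following is a minimal sketch, not a VTEX API: the `StorageProfile` type, the `fit_blockers` function, and the `MAX_DOCUMENTS` ceiling are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StorageProfile:
    """Hypothetical answers to the five fit questions for one dataset."""
    document_shaped: bool          # rule 1: document-oriented access
    needs_platform_features: bool  # rule 2: benefits from v-* features
    expected_documents: int        # rule 3: moderate volume
    on_purchase_path: bool         # rule 4: read synchronously in checkout/cart/payment
    has_native_home: bool          # rule 5: belongs in Catalog, OMS, CL/AD, or VBase


# Assumed ceiling for "low millions"; adjust to your own capacity planning.
MAX_DOCUMENTS = 5_000_000


def fit_blockers(profile: StorageProfile) -> list[str]:
    """Return the rules that fail; an empty list means Master Data is a fit."""
    blockers = []
    if not profile.document_shaped:
        blockers.append("not document-oriented")
    if not profile.needs_platform_features:
        blockers.append("no benefit from platform features")
    if profile.expected_documents > MAX_DOCUMENTS:
        blockers.append("volume too high")
    if profile.on_purchase_path:
        blockers.append("on the purchase critical path")
    if profile.has_native_home:
        blockers.append("a native store fits better")
    return blockers
```

Because all five rules must hold, any non-empty result is a reason to look at the alternatives before designing a schema.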

When NOT to use Master Data

| Data type | Better storage | Why |
|---|---|---|
| Product attributes, specifications | Catalog (specifications, unstructured specs) | Native indexing, search integration, catalog APIs |
| Order data, order history | OMS (via OMS APIs + BFF cache) | Single source of truth; duplicating to MD creates drift |
| Customer profiles, addresses | CL/AD native entities | Platform-managed, already indexed and cached |
| App-specific cache or temp state | VBase | Designed for per-app ephemeral storage, no schema overhead |
| Application logs, debug traces | `ctx.vtex.logger` | Structured logging infrastructure, not a database |
| High-throughput time-series data | External database (SQL, NoSQL, time-series DB) | MD is not designed for millions of writes/day |
| Relational data with joins | External SQL database | MD has no join support; denormalize or use a relational DB |
| Data requiring strong consistency | External database | MD is eventually consistent for indexed fields |

Schema design principles

  • One entity per concept: Don't mix unrelated data in a single entity. Each entity should represent a clear business concept (e.g. `reviews`, `wishlists`, `legacyOrders`).
  • Index what you query: Only fields in `v-indexed` can be used in `where` clauses. But don't over-index: each indexed field increases write latency and storage because the index is updated on every document change.
  • Minimal `v-default-fields`: Return only the fields most consumers need by default. Large default payloads waste bandwidth.
  • `v-cache` matches the workload: Leave `true` (default) for read-heavy entities. Set to `false` for entities with high write frequency where consumers need immediate consistency after writes.
  • `v-security` is explicit: Set `allowGetAll: false` unless unauthenticated list access is intentional. Use `publicRead`, `publicWrite`, and `publicFilter` only for fields that must be accessible without authentication.

VTEX schema extensions (`v-*` fields): reference

Master Data v2 extends standard JSON Schema with `v-*` properties that control indexing, caching, security, defaults, triggers, and schema inheritance. These are VTEX-specific; standard JSON Schema validators ignore them.

v-indexed

Array of field names that Master Data will create secondary indexes for.
  • Only indexed fields can appear in `where` clauses for `searchDocuments` and `scrollDocuments`. Queries on non-indexed fields trigger full document scans that time out on large datasets.
  • Each index is updated on every document write. Over-indexing increases write latency and storage cost proportionally.
  • When to index: fields used in `where` filters, sort expressions, or `publicFilter`. When not to index: large text fields (`description`, `notes`), fields never queried, or fields only read by document ID (indexing adds no benefit for `getDocument`).

```json
{ "v-indexed": ["email", "status", "createdAt"] }
```

v-cache

Boolean (default `true`). Controls whether Master Data caches GET responses for individual documents.
  • `true` (default): Read-heavy entities benefit from caching. Most entities should leave this as default.
  • `false`: Use for entities with high write frequency where consumers need fresh reads immediately after writes (e.g. real-time counters, configuration flags, session-like state).

```json
{ "v-cache": false }
```

v-default-fields

Array of field names returned when the caller does not specify a `fields` parameter in the API request.
  • Keep this minimal: include only the fields most consumers need by default.
  • Reduces payload size for common queries.

```json
{ "v-default-fields": ["email", "status", "score", "createdAt"] }
```

v-security

Object controlling unauthenticated (public) access to fields. By default, all fields require authentication.

| Property | Type | Description |
|---|---|---|
| `allowGetAll` | boolean | If `true`, unauthenticated users can list all documents. Default `false`; keep it off unless intentional. |
| `publicRead` | string[] | Fields readable without authentication |
| `publicWrite` | string[] | Fields writable without authentication |
| `publicFilter` | string[] | Fields usable in `where` clauses without authentication (must also be in `v-indexed`) |

```json
{
  "v-security": {
    "allowGetAll": false,
    "publicRead": ["status", "displayName", "rating"],
    "publicWrite": [],
    "publicFilter": ["status"]
  }
}
```

Never include PII (email, phone, addresses), internal IDs, or business-sensitive data in `publicRead` or `publicFilter`.

v-triggers

Array of trigger objects that define automated actions executed when documents are created or updated and meet specified conditions.

| Property | Type | Description |
|---|---|---|
| `name` | string | Unique trigger name |
| `active` | boolean | Enable/disable the trigger |
| `condition` | string | `where`-style filter (e.g. `"approved=false"`, `"status=pending AND priority>3"`) |
| `action.type` | string | `"email"`, `"http"` (webhook), or `"action"` |
| `action.provider` | string | Email provider name (for email type) |
| `action.uri` | string | Webhook URL (for http type) |
| `action.method` | string | HTTP method for webhook (for http type) |
| `retry.times` | number | Retry count on failure |
| `retry.delay` | object | Delay between retries (e.g. `{ "addMinutes": 5 }`) |

```json
{
  "v-triggers": [
    {
      "name": "notify-on-creation",
      "active": true,
      "condition": "status=new",
      "action": {
        "type": "email",
        "provider": "default",
        "subject": "New record: {{title}}",
        "to": ["admin@mystore.com"],
        "body": "Record {{id}} created by {{author}}"
      },
      "retry": { "times": 3, "delay": { "addMinutes": 5 } }
    },
    {
      "name": "webhook-on-approval",
      "active": true,
      "condition": "approved=true",
      "action": {
        "type": "http",
        "uri": "https://my-integration.example.com/webhook",
        "method": "POST",
        "headers": { "X-Custom-Header": "value" }
      },
      "retry": { "times": 2, "delay": { "addMinutes": 10 } }
    }
  ]
}
```

v-canonicalto

URL pointing to another schema in the same entity for schema inheritance. The current schema inherits properties and constraints from the target.

```json
{
  "v-canonicalto": "https://{host}/api/dataentities/{entity}/schemas/{base-schema}"
}
```

additionalProperties

Standard JSON Schema property, but worth noting: set to `false` to reject fields not declared in `properties`. By default Master Data preserves extra fields without validation.
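Before switching `additionalProperties` to `false` on an existing entity, it can help to see which fields current documents carry that the schema does not declare. The following is a minimal client-side sketch; `undeclared_fields` is a hypothetical helper, it only looks at top-level fields, and it does not call Master Data.

```python
def undeclared_fields(schema: dict, document: dict) -> list[str]:
    """List top-level document fields that the schema's `properties` does not declare."""
    declared = set(schema.get("properties", {}))
    return sorted(field for field in document if field not in declared)


schema = {
    "properties": {"email": {"type": "string"}, "status": {"type": "string"}},
    "additionalProperties": False,
}
document = {"email": "a@example.com", "status": "new", "nickname": "al"}
extra = undeclared_fields(schema, document)  # fields that strict validation would reject
```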

Hard constraints

Constraint: Index only fields used in where clauses or sort expressions

Every field in `v-indexed` creates a secondary index that is updated on every document write. Indexing fields that are never queried wastes write throughput and storage.
Why this matters: Over-indexing a high-write entity (e.g. indexing 15 fields when only 3 are queried) can double or triple write latency. On entities with millions of documents, unnecessary indexes also increase storage costs.
Detection: Compare `v-indexed` fields with actual `where` clauses in the codebase. Any indexed field not referenced in a `where` or sort is likely unnecessary.
Correct: Index only the fields you filter or sort on.

```json
{
  "properties": {
    "email": { "type": "string" },
    "status": { "type": "string" },
    "score": { "type": "integer" },
    "notes": { "type": "string" },
    "createdAt": { "type": "string", "format": "date-time" }
  },
  "v-indexed": ["email", "status", "createdAt"]
}
```

Wrong: Indexing every field "just in case."

```json
{
  "properties": {
    "email": { "type": "string" },
    "status": { "type": "string" },
    "score": { "type": "integer" },
    "notes": { "type": "string" },
    "createdAt": { "type": "string", "format": "date-time" }
  },
  "v-indexed": ["email", "status", "score", "notes", "createdAt"]
}
```
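The detection step can be partly automated. The sketch below is one possible approach, not an official tool: `fields_in_where` and `audit_indexes` are hypothetical helpers, and the regular expression only understands simple `field<operator>value` comparisons, so quoted values containing operators or other filter syntax would need a real parser.

```python
import re
from collections.abc import Iterable

# Captures the field name on the left of a comparison operator,
# e.g. "status=pending AND priority>3" yields "status" and "priority".
_FIELD_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*(?:<>|>=|<=|=|>|<)")


def fields_in_where(where: str) -> set[str]:
    """Extract field names from a simple where-style filter string."""
    return set(_FIELD_RE.findall(where))


def audit_indexes(
    schema: dict,
    where_clauses: Iterable[str],
    sort_fields: Iterable[str] = (),
) -> dict[str, list[str]]:
    """Compare v-indexed with the fields actually filtered or sorted on."""
    indexed = set(schema.get("v-indexed", []))
    used = set(sort_fields)
    for clause in where_clauses:
        used |= fields_in_where(clause)
    return {
        "unused_indexes": sorted(indexed - used),   # candidates to drop
        "missing_indexes": sorted(used - indexed),  # queries that would scan
    }
```

Feeding it the `where` strings collected from the codebase gives a first list of index candidates to drop or add; treat the result as a prompt for review, since fields used only by `publicFilter` or by external consumers will not show up in your own code.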

Constraint: Do not expose sensitive fields via v-security publicRead

The `v-security.publicRead` array makes fields accessible without any authentication. Never include PII (email, phone, addresses), internal IDs, or business-sensitive data in this list.
Why this matters: Public fields are accessible to anyone with the entity name and a document ID or search query. Exposing PII violates data protection regulations and creates security vulnerabilities.
Detection: Check `v-security.publicRead` and `publicFilter` for fields containing user data, internal references, or anything that should require authentication.
Correct: Expose only non-sensitive, display-oriented fields.

```json
{
  "v-security": {
    "allowGetAll": false,
    "publicRead": ["status", "displayName", "rating"],
    "publicWrite": [],
    "publicFilter": ["status"]
  }
}
```

Wrong: Exposing PII and internal fields publicly.

```json
{
  "v-security": {
    "allowGetAll": true,
    "publicRead": ["email", "phone", "cpf", "internalScore", "organizationId"],
    "publicWrite": ["email"],
    "publicFilter": ["email", "phone"]
  }
}
```
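A schema review can include a rough automated pass over `v-security`. The sketch below is a heuristic under stated assumptions: `audit_security` and the `SENSITIVE_HINTS` list are hypothetical, and matching on field names will miss sensitive fields with neutral names (for example `organizationId`), so it complements a manual review rather than replacing it.

```python
# Assumed substrings that suggest PII or internal data; extend for your domain.
SENSITIVE_HINTS = ("email", "phone", "cpf", "address", "document", "internal", "token")


def audit_security(schema: dict) -> list[str]:
    """Return human-readable findings for a schema's v-security block."""
    security = schema.get("v-security", {})
    indexed = set(schema.get("v-indexed", []))
    findings = []

    if security.get("allowGetAll"):
        findings.append("allowGetAll is true")

    # Flag public fields whose names look sensitive.
    for key in ("publicRead", "publicWrite", "publicFilter"):
        for field in security.get(key, []):
            if any(hint in field.lower() for hint in SENSITIVE_HINTS):
                findings.append(f"{key} exposes sensitive-looking field '{field}'")

    # publicFilter fields must also be indexed to be usable in where clauses.
    for field in security.get("publicFilter", []):
        if field not in indexed:
            findings.append(f"publicFilter field '{field}' is not in v-indexed")

    return findings
```

An empty list means nothing matched the heuristics, not that the configuration is safe.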

Constraint: Respect the 60-schema-per-entity limit

Master Data v2 entities have a hard limit of 60 schemas. The `masterdata` builder creates a new schema per app version linked or installed. Once the limit is reached, new versions fail to deploy.
Why this matters: During active development with frequent `vtex link` cycles, schemas accumulate quickly. Hitting the limit blocks deployment until old schemas are manually deleted.
Detection: Apps with many link/publish cycles. Check schema count via `GET /api/dataentities/{entity}/schemas`.
Correct: Periodically clean up unused schemas. Automate cleanup in CI/CD.

```bash
# List schemas to identify stale ones
curl "https://{account}.vtexcommercestable.com.br/api/dataentities/{entity}/schemas" \
  -H "X-VTEX-API-AppKey: {key}" -H "X-VTEX-API-AppToken: {token}"

# Delete unused schemas
curl -X DELETE "https://{account}.vtexcommercestable.com.br/api/dataentities/{entity}/schemas/{old-schema}" \
  -H "X-VTEX-API-AppKey: {key}" -H "X-VTEX-API-AppToken: {token}"
```

Wrong: Never cleaning up schemas during development until the limit blocks deployment.
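For automated cleanup, the selection logic can be kept separate from the HTTP calls. The sketch below covers only the selection step and makes several assumptions: `schemas_to_delete` is a hypothetical helper, the caller supplies schema names ordered oldest first, and the caller knows which schemas are still in use. Deleting is left to the `DELETE` schema endpoint.

```python
SCHEMA_LIMIT = 60  # hard limit per entity


def schemas_to_delete(
    schema_names: list[str],
    in_use: set[str],
    headroom: int = 10,
) -> list[str]:
    """Pick unused schemas (oldest first) so that `headroom` slots stay free.

    Assumes `schema_names` is ordered oldest first and `in_use` lists
    every schema that a deployed app version still depends on.
    """
    target = SCHEMA_LIMIT - headroom
    excess = len(schema_names) - target
    if excess <= 0:
        return []
    candidates = [name for name in schema_names if name not in in_use]
    # May return fewer than `excess` names if too many schemas are in use.
    return candidates[:excess]
```

Running this in CI before each deploy keeps the entity below the limit without waiting for a failed deployment to prompt a manual cleanup.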

Preferred pattern

Complete schema example with all VTEX extensions

```json
{
  "$schema": "http://json-schema.org/schema#",
  "title": "product-review-v1",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "author": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "rating": { "type": "integer", "minimum": 1, "maximum": 5 },
    "title": { "type": "string", "maxLength": 200 },
    "text": { "type": "string", "maxLength": 5000 },
    "approved": { "type": "boolean" },
    "createdAt": { "type": "string", "format": "date-time" }
  },
  "required": ["productId", "rating", "title", "text"],
  "v-indexed": ["productId", "approved", "rating", "createdAt"],
  "v-default-fields": [
    "productId",
    "author",
    "rating",
    "title",
    "approved",
    "createdAt"
  ],
  "v-cache": true,
  "v-security": {
    "allowGetAll": false,
    "publicRead": [
      "productId",
      "author",
      "rating",
      "title",
      "text",
      "approved"
    ],
    "publicWrite": [],
    "publicFilter": ["productId", "approved", "rating"]
  },
  "v-triggers": [
    {
      "name": "notify-moderator",
      "active": true,
      "condition": "approved=false",
      "action": {
        "type": "email",
        "provider": "default",
        "subject": "New review pending moderation",
        "to": ["moderator@mystore.com"],
        "body": "Review for product {{productId}} by {{author}}: {{title}}"
      },
      "retry": {
        "times": 3,
        "delay": { "addMinutes": 5 }
      }
    }
  ]
}
```
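A schema with this many `v-*` lists is easy to get out of sync with its `properties`. The following sketch shows one way to lint for that before deploying. `undeclared_references` is a hypothetical helper, and the default `builtin` set assumes `id` is the only platform-provided field you reference; add any other system fields your schema relies on.

```python
def undeclared_references(
    schema: dict,
    builtin: frozenset[str] = frozenset({"id"}),
) -> dict[str, list[str]]:
    """Map each field list to the names it references that are not declared."""
    declared = set(schema.get("properties", {})) | builtin
    security = schema.get("v-security", {})
    groups = {
        "required": schema.get("required", []),
        "v-indexed": schema.get("v-indexed", []),
        "v-default-fields": schema.get("v-default-fields", []),
        "publicRead": security.get("publicRead", []),
        "publicWrite": security.get("publicWrite", []),
        "publicFilter": security.get("publicFilter", []),
    }
    problems = {}
    for name, fields in groups.items():
        missing = [field for field in fields if field not in declared]
        if missing:
            problems[name] = missing
    return problems
```

An empty dictionary means every field list refers only to declared properties or to the assumed built-ins.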

Triggers: when to use and when not to

Use triggers when:
  • You need email notifications on document changes (e.g. moderation alerts)
  • You need to call an external webhook when a document meets a condition
  • The action is simple, fire-and-forget, and doesn't need complex error handling
Do NOT use triggers when:
  • You need complex orchestration, retries with backoff, or error recovery: use IO events instead
  • You need sub-second response to changes: triggers have built-in delay
  • The action modifies other MD entities in a chain: risk of cascading trigger loops
  • You need conditional logic more complex than a `where`-style filter
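The cascading-loop risk can be checked on paper by modelling which entities each entity's triggers end up writing to. The sketch below runs a depth-first search for a cycle in that graph. The `writes` mapping is something you would maintain by hand, since Master Data does not expose it, and `find_trigger_cycle` is a hypothetical helper.

```python
from typing import Optional


def find_trigger_cycle(writes: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a list of entities (first == last), or None.

    `writes[entity]` lists the entities that triggers on `entity`
    modify, directly or through the webhook they call.
    """
    path: list[str] = []   # entities on the current DFS path
    done: set[str] = set() # entities fully explored with no cycle found

    def visit(entity: str) -> Optional[list[str]]:
        if entity in path:
            # Found a back edge: slice the path from the first occurrence.
            return path[path.index(entity):] + [entity]
        if entity in done:
            return None
        path.append(entity)
        for target in writes.get(entity, []):
            cycle = visit(target)
            if cycle:
                return cycle
        path.pop()
        done.add(entity)
        return None

    for entity in writes:
        cycle = visit(entity)
        if cycle:
            return cycle
    return None
```

If the function returns a cycle, break it by moving one of the steps to an IO event handler that can decide whether the write is really needed.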

Document counting without full fetch

Use the `REST-Content-Range` header to get document counts efficiently:

```bash
# Count documents without fetching them
# (-i makes curl print the response headers)
curl -i "https://{account}.vtexcommercestable.com.br/api/dataentities/{entity}/search?_fields=id" \
  -H "REST-Range: resources=0-0" \
  -H "X-VTEX-API-AppKey: {key}" -H "X-VTEX-API-AppToken: {token}"

# Response header: REST-Content-Range: resources 0-0/12345
# The number after "/" is the total document count
```
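When the count is needed in code rather than on the command line, the header value has to be parsed. A minimal sketch, assuming the value always has the `resources <start>-<end>/<total>` shape shown in the example; `parse_content_range` is a hypothetical helper name.

```python
import re

# Matches values such as "resources 0-0/12345".
_CONTENT_RANGE_RE = re.compile(r"^resources\s+(\d+)-(\d+)/(\d+)$")


def parse_content_range(header_value: str) -> tuple[int, int, int]:
    """Split a REST-Content-Range value into (start, end, total)."""
    match = _CONTENT_RANGE_RE.match(header_value.strip())
    if match is None:
        raise ValueError(f"unexpected REST-Content-Range value: {header_value!r}")
    start, end, total = (int(group) for group in match.groups())
    return start, end, total


start, end, total = parse_content_range("resources 0-0/12345")
```

Raising on an unexpected shape is deliberate: silently returning zero would hide a changed or missing header.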

Search vs Scroll

| Use | When | Max page size |
|---|---|---|
| `searchDocuments` | Bounded result sets, UI pagination, known small size | 100 per page |
| `scrollDocuments` | Large exports, bulk operations, unbounded iteration | Configurable batch |
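For bounded result sets read through search, the 100-per-page cap means the caller has to walk the result in windows. The sketch below only generates the `REST-Range` header values. `rest_range_windows` is a hypothetical helper, and it assumes the range end is inclusive, following the `resources=0-0` example used for counting; verify that against the API before relying on it.

```python
from collections.abc import Iterator

MAX_SEARCH_PAGE = 100  # search page size cap


def rest_range_windows(total: int, page_size: int = MAX_SEARCH_PAGE) -> Iterator[str]:
    """Yield REST-Range header values covering `total` documents.

    Assumes an inclusive end index (0-99 is the first 100 documents).
    """
    if not 1 <= page_size <= MAX_SEARCH_PAGE:
        raise ValueError(f"page_size must be between 1 and {MAX_SEARCH_PAGE}")
    for start in range(0, total, page_size):
        end = min(start + page_size, total) - 1
        yield f"resources={start}-{end}"


windows = list(rest_range_windows(250))
```

If `total` is large or unknown, this loop is the wrong tool; that is the case `scrollDocuments` is meant for.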

Common failure modes

  • Over-indexing: Indexing 10+ fields on a high-write entity. Every write updates all indexes, increasing latency and storage.
  • Missing indexes: Querying on non-indexed fields triggers full scans. Works in dev with 100 docs, times out in production with 100k.
  • `v-cache: false` by default: Disabling cache on read-heavy entities forces every GET to hit the database. Only disable for high-write entities.
  • `allowGetAll: true` with PII: Unauthenticated users can list all documents including sensitive data.
  • Schema accumulation: Development cycles pile up schemas until the entity reaches the 60-schema limit, which blocks production deployments.
  • Trigger chains: Trigger A modifies entity B, which has a trigger that modifies entity A, creating an infinite loop.
  • MD as a log store: Entities growing unboundedly with traffic volume. Use `ctx.vtex.logger` instead.
  • MD on critical path: Synchronous MD read in checkout with no timeout or fallback.

Review checklist

  • Has a storage fit review been done? (MD vs Catalog vs OMS vs VBase vs external DB)
  • Are only queried fields in `v-indexed`? No unnecessary indexes?
  • Is `v-cache` set appropriately for the entity's read/write ratio?
  • Does `v-security` restrict public access to non-sensitive fields only?
  • Is `allowGetAll` set to `false` unless explicitly needed?
  • Are triggers simple and non-chaining? No risk of trigger loops?
  • Is there a schema cleanup strategy for the 60-schema limit?
  • Is the entity off the purchase critical path (checkout, cart, payment)?
  • For large datasets (100k+ docs), is `scrollDocuments` used instead of paginated search?
  • Are `v-default-fields` minimal (not returning everything by default)?

Related skills

  • vtex-io-masterdata: IO app integration (MasterDataClient, `masterdata` builder, CRUD patterns)
  • vtex-io-application-performance: Caching layers and BFF patterns when exposing MD data
  • architecture-well-architected-commerce: Cross-cutting storage and architecture principles
