masterdata-storage-strategy
Master Data storage strategy
When this skill applies
Use this skill before creating any new Master Data entity or when auditing existing usage. It helps you answer:
- Is Master Data the right storage for this data, or would Catalog, OMS, VBase, or an external database serve better?
- How should I design the JSON Schema for performance and security?
- Which fields should I index (`v-indexed`), and which should I not?
- Should I enable or disable caching (`v-cache`)?
- Do I need triggers (`v-triggers`), or is an event-driven IO approach better?
- How do I plan for capacity and lifecycle of schemas and documents?
Do not use this skill for:
- VTEX IO app integration patterns (MasterDataClient, `masterdata` builder, CRUD in code): use `vtex-io-masterdata`
- Performance patterns for IO services (LRU, VBase caching layers): use `vtex-io-application-performance`
Decision rules
When to use Master Data
Master Data is a good fit when all of the following are true:
- **Document-oriented access**: Your data is naturally key-value or document-shaped (JSON documents with variable schemas). You query by indexed fields and retrieve full or partial documents.
- **Platform-integrated**: You benefit from VTEX-native features such as `v-triggers` for automated workflows, `v-security` for per-field public access, `v-indexed` for search/filter, and the `masterdata` builder for schema-as-code.
- **Moderate volume**: Your entity will hold thousands to low millions of documents. MD handles this well with proper indexing.
- **Not on the purchase critical path**: MD is not optimized for sub-10ms latency. Synchronous MD reads in checkout/cart/payment flows put conversion at risk if MD is slow.
- **No better native fit**: The data doesn't belong in Catalog (product/SKU attributes), OMS (order data), CL/AD (customer profiles/addresses), or VBase (app-specific cache/state).
When NOT to use Master Data
| Data type | Better storage | Why |
|---|---|---|
| Product attributes, specifications | Catalog (specifications, unstructured specs) | Native indexing, search integration, catalog APIs |
| Order data, order history | OMS (via OMS APIs + BFF cache) | Single source of truth; duplicating to MD creates drift |
| Customer profiles, addresses | CL/AD native entities | Platform-managed, already indexed and cached |
| App-specific cache or temp state | VBase | Designed for per-app ephemeral storage, no schema overhead |
| Application logs, debug traces | `ctx.vtex.logger` | Structured logging infrastructure, not a database |
| High-throughput time-series data | External database (SQL, NoSQL, time-series DB) | MD is not designed for millions of writes/day |
| Relational data with joins | External SQL database | MD has no join support; denormalize or use a relational DB |
| Data requiring strong consistency | External database | MD is eventually consistent for indexed fields |
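The table can be read as an ordered series of checks. The sketch below is one hypothetical way to encode it for a design review script; `DataProfile`, `NATIVE_HOMES`, and `recommend_storage` are illustrative names rather than part of any VTEX API, and the order in which the checks run is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DataProfile:
    """Hypothetical description of a dataset being evaluated for storage."""
    kind: str  # e.g. "product_attributes", "orders", "reviews"
    needs_joins: bool = False
    needs_strong_consistency: bool = False
    writes_per_day: int = 0


# Data kinds that already have a native home, following the table rows.
NATIVE_HOMES = {
    "product_attributes": "Catalog",
    "orders": "OMS",
    "customer_profiles": "CL/AD native entities",
    "app_cache": "VBase",
    "logs": "ctx.vtex.logger",
}


def recommend_storage(profile: DataProfile) -> str:
    """Return a storage suggestion by walking the table top to bottom."""
    if profile.kind in NATIVE_HOMES:
        return NATIVE_HOMES[profile.kind]
    if profile.needs_joins:
        return "External SQL database"
    if profile.needs_strong_consistency:
        return "External database"
    # "Millions of writes/day" is the threshold named in the table.
    if profile.writes_per_day >= 1_000_000:
        return "External database"
    return "Master Data"


print(recommend_storage(DataProfile(kind="reviews", writes_per_day=5_000)))
```

A result of "Master Data" only means none of the exclusions matched; the purchase critical path and volume criteria from the previous section still need a manual check.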
Schema design principles
- **One entity per concept**: Don't mix unrelated data in a single entity. Each entity should represent a clear business concept (e.g. `reviews`, `wishlists`, `legacyOrders`).
- **Index what you query**: Only fields in `v-indexed` can be used in `where` clauses. But don't over-index: each indexed field increases write latency and storage because the index is updated on every document change.
- **Minimal `v-default-fields`**: Return only the fields most consumers need by default. Large default payloads waste bandwidth.
- **`v-cache` matches the workload**: Leave `true` (default) for read-heavy entities. Set to `false` for entities with high write frequency where consumers need immediate consistency after writes.
- **`v-security` is explicit**: Set `allowGetAll: false` unless unauthenticated list access is intentional. Use `publicRead`, `publicWrite`, `publicFilter` only for fields that must be accessible without authentication.
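Several of these principles can be checked mechanically against a schema file. The following is a minimal sketch, assuming the set of queried fields has already been collected from the codebase by some other means; `lint_schema` and `SENSITIVE_FIELDS` are hypothetical names, and the list of sensitive field names is only an example.

```python
import json

# Field names treated as sensitive here are an illustrative assumption;
# a real audit would use the project's own data classification.
SENSITIVE_FIELDS = {"email", "phone", "cpf", "address"}


def lint_schema(schema, queried_fields):
    """Return human-readable warnings for a Master Data v2 JSON Schema."""
    warnings = []
    indexed = set(schema.get("v-indexed", []))

    # Index what you query: flag indexes nobody filters or sorts on...
    for field in sorted(indexed - queried_fields):
        warnings.append(f"'{field}' is indexed but never queried")
    # ...and queried fields that are missing from v-indexed.
    for field in sorted(queried_fields - indexed):
        warnings.append(f"'{field}' is queried but not indexed")

    security = schema.get("v-security", {})
    # Only an explicit true is flagged; a missing key is not inspected.
    if security.get("allowGetAll", False):
        warnings.append("allowGetAll is true: unauthenticated listing is enabled")
    for key in ("publicRead", "publicWrite", "publicFilter"):
        for field in sorted(SENSITIVE_FIELDS & set(security.get(key, []))):
            warnings.append(f"'{field}' is exposed via {key}")
    return warnings


schema = json.loads("""
{
  "v-indexed": ["email", "status", "notes"],
  "v-security": {"allowGetAll": true, "publicRead": ["email", "status"]}
}
""")
for warning in lint_schema(schema, queried_fields={"email", "status", "createdAt"}):
    print(warning)
```

A check like this fits in a pre-publish step, so that index and security drift is caught before a new schema version is created.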
VTEX schema extensions (`v-*` fields): reference
Master Data v2 extends standard JSON Schema with `v-*` properties that control indexing, caching, security, defaults, triggers, and schema inheritance. These are VTEX-specific; standard JSON Schema validators ignore them.
`v-indexed`
Array of field names that Master Data will create secondary indexes for.
- Only indexed fields can appear in `where` clauses for `searchDocuments` and `scrollDocuments`. Queries on non-indexed fields trigger full document scans that time out on large datasets.
- Each index is updated on every document write. Over-indexing increases write latency and storage cost proportionally.
- When to index: fields used in `where` filters, sort expressions, or `publicFilter`. When not to index: large text fields (`description`, `notes`), fields never queried, or fields only read by document ID (indexing adds no benefit for `getDocument`).
```json
{ "v-indexed": ["email", "status", "createdAt"] }
```
`v-cache`
Boolean (default `true`). Controls whether Master Data caches GET responses for individual documents.
- `true` (default): Read-heavy entities benefit from caching. Most entities should leave this as default.
- `false`: Use for entities with high write frequency where consumers need fresh reads immediately after writes (e.g. real-time counters, configuration flags, session-like state).
```json
{ "v-cache": false }
```
`v-default-fields`
Array of field names returned when the caller does not specify a `fields` parameter in the API request.
- Keep this minimal: only the fields most consumers need by default.
- Reduces payload size for common queries.
```json
{ "v-default-fields": ["email", "status", "score", "createdAt"] }
```
`v-security`
Object controlling unauthenticated (public) access to fields. By default, all fields require authentication.
| Property | Type | Description |
|---|---|---|
| `allowGetAll` | boolean | If `true`, unauthenticated users can list all documents |
| `publicRead` | array of strings | Fields readable without authentication |
| `publicWrite` | array of strings | Fields writable without authentication |
| `publicFilter` | array of strings | Fields usable in `where` filters without authentication |
```json
{
  "v-security": {
    "allowGetAll": false,
    "publicRead": ["status", "displayName", "rating"],
    "publicWrite": [],
    "publicFilter": ["status"]
  }
}
```
Never include PII (email, phone, addresses), internal IDs, or business-sensitive data in `publicRead` or `publicFilter`.
`v-triggers`
Array of trigger objects that define automated actions executed when documents are created or updated and meet specified conditions.
| Property | Type | Description |
|---|---|---|
| `name` | string | Unique trigger name |
| `active` | boolean | Enable/disable the trigger |
| `condition` | string | `where`-style filter a document must match for the trigger to fire |
| `action.type` | string | `email` or `http` |
| `action.provider` | string | Email provider name (for email type) |
| `action.uri` | string | Webhook URL (for http type) |
| `action.method` | string | HTTP method for webhook (for http type) |
| `retry.times` | integer | Retry count on failure |
| `retry.delay` | object | Delay between retries (e.g. `{ "addMinutes": 5 }`) |
```json
{
  "v-triggers": [
    {
      "name": "notify-on-creation",
      "active": true,
      "condition": "status=new",
      "action": {
        "type": "email",
        "provider": "default",
        "subject": "New record: {{title}}",
        "to": ["admin@mystore.com"],
        "body": "Record {{id}} created by {{author}}"
      },
      "retry": { "times": 3, "delay": { "addMinutes": 5 } }
    },
    {
      "name": "webhook-on-approval",
      "active": true,
      "condition": "approved=true",
      "action": {
        "type": "http",
        "uri": "https://my-integration.example.com/webhook",
        "method": "POST",
        "headers": { "X-Custom-Header": "value" }
      },
      "retry": { "times": 2, "delay": { "addMinutes": 10 } }
    }
  ]
}
```
`v-canonicalto`
URL pointing to another schema in the same entity for schema inheritance. The current schema inherits properties and constraints from the target.
```json
{
  "v-canonicalto": "https://{host}/api/dataentities/{entity}/schemas/{base-schema}"
}
```
`additionalProperties`
Standard JSON Schema property, but worth noting: set to `false` to reject fields not declared in `properties`. By default Master Data preserves extra fields without validation.
Hard constraints
Constraint: Index only fields used in where clauses or sort expressions
Every field in `v-indexed` creates a secondary index that is updated on every document write. Indexing fields that are never queried wastes write throughput and storage.
**Why this matters**: Over-indexing a high-write entity (e.g. indexing 15 fields when only 3 are queried) can double or triple write latency. On entities with millions of documents, unnecessary indexes also increase storage costs.
**Detection**: Compare `v-indexed` fields with actual `where` clauses in the codebase. Any indexed field not referenced in a `where` or sort is likely unnecessary.
**Correct**: Index only the fields you filter or sort on.
```json
{
  "properties": {
    "email": { "type": "string" },
    "status": { "type": "string" },
    "score": { "type": "integer" },
    "notes": { "type": "string" },
    "createdAt": { "type": "string", "format": "date-time" }
  },
  "v-indexed": ["email", "status", "createdAt"]
}
```
**Wrong**: Indexing every field "just in case."
```json
{
  "properties": {
    "email": { "type": "string" },
    "status": { "type": "string" },
    "score": { "type": "integer" },
    "notes": { "type": "string" },
    "createdAt": { "type": "string", "format": "date-time" }
  },
  "v-indexed": ["email", "status", "score", "notes", "createdAt"]
}
```
Constraint: Do not expose sensitive fields via v-security publicRead
The `v-security.publicRead` array makes fields accessible without any authentication. Never include PII (email, phone, addresses), internal IDs, or business-sensitive data in this list.
**Why this matters**: Public fields are accessible to anyone with the entity name and a document ID or search query. Exposing PII violates data protection regulations and creates security vulnerabilities.
**Detection**: Check `v-security.publicRead` and `publicFilter` for fields containing user data, internal references, or anything that should require authentication.
**Correct**: Expose only non-sensitive, display-oriented fields.
```json
{
  "v-security": {
    "allowGetAll": false,
    "publicRead": ["status", "displayName", "rating"],
    "publicWrite": [],
    "publicFilter": ["status"]
  }
}
```
**Wrong**: Exposing PII and internal fields publicly.
```json
{
  "v-security": {
    "allowGetAll": true,
    "publicRead": ["email", "phone", "cpf", "internalScore", "organizationId"],
    "publicWrite": ["email"],
    "publicFilter": ["email", "phone"]
  }
}
```
Constraint: Respect the 60-schema-per-entity limit
Master Data v2 entities have a hard limit of 60 schemas. The `masterdata` builder creates a new schema per app version linked or installed. Once the limit is reached, new versions fail to deploy.
**Why this matters**: During active development with frequent `vtex link` cycles, schemas accumulate quickly. Hitting the limit blocks deployment until old schemas are manually deleted.
**Detection**: Apps with many link/publish cycles. Check schema count via `GET /api/dataentities/{entity}/schemas`.
**Correct**: Periodically clean up unused schemas. Automate cleanup in CI/CD.
```bash
# List schemas to identify stale ones
curl "https://{account}.vtexcommercestable.com.br/api/dataentities/{entity}/schemas" \
  -H "X-VTEX-API-AppKey: {key}" -H "X-VTEX-API-AppToken: {token}"

# Delete unused schemas
curl -X DELETE "https://{account}.vtexcommercestable.com.br/api/dataentities/{entity}/schemas/{old-schema}" \
  -H "X-VTEX-API-AppKey: {key}" -H "X-VTEX-API-AppToken: {token}"
```
**Wrong**: Never cleaning up schemas during development until the limit blocks deployment.
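For automated cleanup, the selection step can be separated from the HTTP calls. This is a sketch under stated assumptions: `schemas_to_delete` is a hypothetical helper, the schema names are made up, and the input list is assumed to be ordered oldest first, which the listing endpoint is not known to guarantee.

```python
SCHEMA_LIMIT = 60  # hard limit per entity, as stated above


def schemas_to_delete(existing, in_use, headroom=10):
    """Pick unused schema names to delete so `headroom` slots stay free.

    `existing` is assumed to be ordered oldest first; sort it yourself
    if your source does not provide that ordering.
    """
    target = SCHEMA_LIMIT - headroom
    excess = len(existing) - target
    if excess <= 0:
        return []
    # Never propose a schema that a deployed app version still references.
    candidates = [name for name in existing if name not in in_use]
    return candidates[:excess]


existing = [f"0.1.{n}" for n in range(58)]  # hypothetical schema names
print(schemas_to_delete(existing, in_use={"0.1.57"}))
```

Each returned name would then be passed to the `DELETE` call shown above. Keeping some headroom below the limit leaves room for link cycles between cleanup runs.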
Preferred pattern
Complete schema example with all VTEX extensions
```json
{
  "$schema": "http://json-schema.org/schema#",
  "title": "product-review-v1",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "author": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "rating": { "type": "integer", "minimum": 1, "maximum": 5 },
    "title": { "type": "string", "maxLength": 200 },
    "text": { "type": "string", "maxLength": 5000 },
    "approved": { "type": "boolean" },
    "createdAt": { "type": "string", "format": "date-time" }
  },
  "required": ["productId", "rating", "title", "text"],
  "v-indexed": ["productId", "approved", "rating", "createdAt"],
  "v-default-fields": [
    "productId",
    "author",
    "rating",
    "title",
    "approved",
    "createdAt"
  ],
  "v-cache": true,
  "v-security": {
    "allowGetAll": false,
    "publicRead": [
      "productId",
      "author",
      "rating",
      "title",
      "text",
      "approved"
    ],
    "publicWrite": [],
    "publicFilter": ["productId", "approved", "rating"]
  },
  "v-triggers": [
    {
      "name": "notify-moderator",
      "active": true,
      "condition": "approved=false",
      "action": {
        "type": "email",
        "provider": "default",
        "subject": "New review pending moderation",
        "to": ["moderator@mystore.com"],
        "body": "Review for product {{productId}} by {{author}}: {{title}}"
      },
      "retry": {
        "times": 3,
        "delay": { "addMinutes": 5 }
      }
    }
  ]
}
```
Triggers: when to use and when not to
Use triggers when:
- You need email notifications on document changes (e.g. moderation alerts)
- You need to call an external webhook when a document meets a condition
- The action is simple, fire-and-forget, and doesn't need complex error handling
Do NOT use triggers when:
- You need complex orchestration, retries with backoff, or error recovery — use IO events instead
- You need sub-second response to changes — triggers have built-in delay
- The action modifies other MD entities in a chain — risk of cascading trigger loops
- You need conditional logic more complex than a `where`-style filter
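The cascading-loop risk can be screened for before deployment. The sketch below assumes you maintain, by hand or by parsing trigger webhook targets, a map from each entity to the entities its triggers end up modifying; `find_trigger_cycle` and the entity names are hypothetical. It runs a depth-first search and reports the first cycle it finds.

```python
def find_trigger_cycle(writes_to):
    """Return one cycle of entity names as a list, or None if there is none.

    `writes_to` maps an entity to the entities its triggers modify.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(entity):
        color[entity] = GREY
        path.append(entity)
        for target in writes_to.get(entity, ()):
            state = color.get(target, WHITE)
            if state == GREY:
                # Back edge: the cycle is the path from target to here.
                return path[path.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        color[entity] = BLACK
        return None

    for entity in list(writes_to):
        if color.get(entity, WHITE) == WHITE:
            cycle = visit(entity)
            if cycle:
                return cycle
    return None


graph = {"reviews": ["productStats"], "productStats": ["reviews"]}
print(find_trigger_cycle(graph))
```

If a cycle is reported, moving one leg of it to an IO event handler with its own loop guard is the alternative suggested above.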
Document counting without full fetch
Use the `REST-Content-Range` header to get document counts efficiently:
```bash
# Count documents without fetching them (-i prints the response headers)
curl -i "https://{account}.vtexcommercestable.com.br/api/dataentities/{entity}/search?_fields=id" \
  -H "REST-Range: resources=0-0" \
  -H "X-VTEX-API-AppKey: {key}" -H "X-VTEX-API-AppToken: {token}"

# Response header: REST-Content-Range: resources 0-0/12345
# The number after "/" is the total document count
```
Search vs Scroll
| Use | When | Max page size |
|---|---|---|
| `searchDocuments` | Bounded result sets, UI pagination, known small size | 100 per page |
| `scrollDocuments` | Large exports, bulk operations, unbounded iteration | Configurable batch |
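For the search path, the total from `REST-Content-Range` and the 100-per-page limit together determine the page windows to request. The sketch below parses a header value in the format shown earlier (`resources 0-0/12345`) and yields `REST-Range` values. Both function names are hypothetical, and the windows assume inclusive start and end bounds, which should be verified against the API before relying on it.

```python
import re

MAX_SEARCH_PAGE = 100  # search page-size limit from the table above


def total_from_content_range(header_value):
    """Extract the total from a value like 'resources 0-0/12345'."""
    match = re.fullmatch(r"resources\s+(\d+)-(\d+)/(\d+)", header_value.strip())
    if match is None:
        raise ValueError(f"unexpected REST-Content-Range value: {header_value!r}")
    return int(match.group(3))


def search_ranges(total, page_size=MAX_SEARCH_PAGE):
    """Yield REST-Range header values that cover `total` documents.

    Assumes inclusive bounds, so 0-99 means the first 100 documents.
    """
    if not 1 <= page_size <= MAX_SEARCH_PAGE:
        raise ValueError("page_size must be between 1 and 100")
    for start in range(0, total, page_size):
        end = min(start + page_size, total) - 1
        yield f"resources={start}-{end}"


total = total_from_content_range("resources 0-0/250")
print(list(search_ranges(total)))
```

Once the total is large (the checklist uses 100k+ documents as the threshold), this kind of windowed search is the wrong tool and `scrollDocuments` is the better choice.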
Common failure modes
- **Over-indexing**: Indexing 10+ fields on a high-write entity. Every write updates all indexes, increasing latency and storage.
- **Missing indexes**: Querying on non-indexed fields triggers full scans. Works in dev with 100 docs, times out in production with 100k.
- **`v-cache: false` by default**: Disabling cache on read-heavy entities forces every GET to hit the database. Only disable for high-write entities.
- **`allowGetAll: true` with PII**: Unauthenticated users can list all documents including sensitive data.
- **Schema accumulation**: Reaching the 60-schema limit through development cycles blocks production deployments.
- **Trigger chains**: Trigger A modifies entity B, which has a trigger that modifies entity A, creating an infinite loop.
- **MD as a log store**: Entities growing unboundedly with traffic volume. Use `ctx.vtex.logger` instead.
- **MD on critical path**: Synchronous MD read in checkout with no timeout or fallback.
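For the last failure mode, the usual remedy is a bounded wait with a safe default. The sketch below shows the general pattern only; `read_with_fallback` and `slow_lookup` are hypothetical, the lookup is simulated with a sleep instead of a real Master Data call, and the timeout value is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def read_with_fallback(fetch, timeout_s, fallback):
    """Run `fetch()` but give up after `timeout_s` seconds and return `fallback`."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fetch)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The read was too slow for the purchase flow; degrade gracefully.
        return fallback
    except Exception:
        # Any upstream error also degrades to the fallback value.
        return fallback
    finally:
        # Do not block the caller; the worker thread finishes in the background.
        executor.shutdown(wait=False)


def slow_lookup():
    time.sleep(0.3)  # stands in for a slow Master Data read
    return {"loyaltyTier": "gold"}


print(read_with_fallback(slow_lookup, timeout_s=0.05, fallback={"loyaltyTier": None}))
```

The fallback should be a value the purchase flow can proceed with, so that a slow read degrades the experience without blocking the order.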
Review checklist
- Has a storage fit review been done? (MD vs Catalog vs OMS vs VBase vs external DB)
- Are only queried fields in `v-indexed`? No unnecessary indexes?
- Is `v-cache` set appropriately for the entity's read/write ratio?
- Does `v-security` restrict public access to non-sensitive fields only?
- Is `allowGetAll` set to `false` unless explicitly needed?
- Are triggers simple and non-chaining? No risk of trigger loops?
- Is there a schema cleanup strategy for the 60-schema limit?
- Is the entity off the purchase critical path (checkout, cart, payment)?
- For large datasets (100k+ docs), is `scrollDocuments` used instead of paginated search?
- Are `v-default-fields` minimal (not returning everything by default)?
Related skills
- vtex-io-masterdata: IO app integration (MasterDataClient, `masterdata` builder, CRUD patterns)
- vtex-io-application-performance: Caching layers and BFF patterns when exposing MD data
- architecture-well-architected-commerce: Cross-cutting storage and architecture principles
Reference
- Working with JSON Schemas in Master Data v2 — v-indexed, v-cache, v-security, v-triggers configuration
- Master Data v2 Basics — Core concepts and data model
- Master Data Schema Lifecycle — Schema versioning and the 60-schema limit
- Setting Up Triggers on Master Data v2 — Trigger configuration and patterns
- Master Data v2 API Reference — Complete API specification
- Master Data v2 Document Saving Flow — Validation, indexing, and trigger execution order