vtex-io-masterdata-strategy
Master Data Strategy
When this skill applies
Use this skill when deciding whether Master Data v2 is the right mechanism for custom data in a VTEX IO app.
- Modeling reviews, wishlists, forms, or custom app records
- Choosing entity boundaries
- Planning schema indexing and lifecycle
- Reviewing long-term Master Data design
Do not use this skill for:
- low-level client usage details
- runtime or route structure
- app settings schemas
- frontend UI behavior
Decision rules
- Use this skill once Master Data is a serious candidate storage mechanism. For the broader choice between Master Data, VBase, VTEX core APIs, and external stores, use `vtex-io-data-access-patterns`.
- Use Master Data for structured custom data that needs validation, indexing, and query support.
- Use the `masterdata` builder when this app introduces a new business entity, owns the data model, and wants the schema to be created and versioned as part of the app contract.
- Prefer using only the Master Data client when the entity and schema already exist and are shared or centrally managed, and this app only needs to read or write records without redefining the schema itself.
- For stable schemas that the app owns but should not be recreated or updated on every app version, keep the schema definition in code and use the Master Data client in a controlled setup path to create or update the schema only when needed.
- Remember that Master Data entities are account-scoped. Changing a shared entity or schema affects every app in that account that depends on it, so prefer client-only consumption when the schema is centrally managed.
- Keep entity boundaries intentional and aligned with the business concept being stored.
- Index fields that are actually used for filtering and search.
- Plan schema lifecycle explicitly to avoid schema sprawl.
- Consider data volume and retention from the start. If the dataset will grow unbounded and there is no retention or archival strategy, Master Data is likely not the right storage mechanism.
- Do not treat Master Data as an unbounded dumping ground for arbitrary payloads.
- Do not use Master Data as an unbounded log or event store for high-volume append-only data. Prefer dedicated logging or storage mechanisms when the main need is raw history rather than structured queries.
- Do not store secrets, credentials, or global app configuration in Master Data. Use app settings or configuration apps instead.
- Do not generate one entity or schema per account, workspace, or feature flag. Keep a stable entity name and distinguish tenants or environments through record fields when necessary.
- Be careful when tying schema evolution directly to app versioning through the `masterdata` builder. Frequent schema changes coupled to app releases can generate excessive schema updates, indexing changes, and long-term schema sprawl.
Choosing between the `masterdata` builder and the Master Data client
There are three main ways for a VTEX IO app to work with Master Data:
- Owning the schema via the `masterdata` builder:
  - The app declares entities and schemas under `masterdata/` in the repository.
  - Schema fields, validation, and indexing evolve together with the app code.
  - Use this when the app is the primary owner of the data model, schema changes are relatively infrequent, and the schema should be rolled out as part of the app contract.
- Consuming an existing schema via the Master Data client only:
  - The app uses a Master Data client, but does not declare entities or schemas through the `masterdata` builder.
  - The app assumes a stable schema managed elsewhere and only reads or writes records that follow that contract.
  - Use this when the entity is shared across multiple apps or managed centrally, and this app should not redefine or fragment the schema across environments.
- Owning a stable schema definition in code and applying it through the client:
  - The app keeps a stable schema definition in code instead of `masterdata/` builder files.
  - A controlled setup path checks whether the schema exists and creates or updates it only when needed.
  - Use this when the app truly owns the schema, but should not couple schema rollout to every app version or every release pipeline step.
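The choice between the three options can be written down as a small decision helper. This is only a sketch of the rules above: `SchemaStrategy`, `SchemaContext`, and `chooseSchemaStrategy` are hypothetical names, and real reviews will likely weigh more factors than two booleans.

```typescript
// Hypothetical names throughout: this only encodes the three options described above.
type SchemaStrategy = 'masterdata-builder' | 'client-only' | 'schema-in-code'

interface SchemaContext {
  // True when this app is the primary owner of the data model.
  appOwnsSchema: boolean
  // True when the schema should be rolled out as part of the app contract.
  rolloutWithAppVersion: boolean
}

function chooseSchemaStrategy(input: SchemaContext): SchemaStrategy {
  // Shared or centrally managed schema: only read and write records.
  if (!input.appOwnsSchema) {
    return 'client-only'
  }
  // Owned schema that should ship together with app releases.
  if (input.rolloutWithAppVersion) {
    return 'masterdata-builder'
  }
  // Owned, stable schema applied through a controlled setup path.
  return 'schema-in-code'
}
```

Ownership is checked first on purpose: an app that does not own the schema should not redefine it, whatever its release cadence.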
Hard constraints
Constraint: Master Data entities must have explicit schema boundaries
Each entity MUST represent a clear business concept and have a schema that matches its intended usage.
Why this matters
Weak entity boundaries create confusing queries, poor indexing choices, and schema drift.
Detection
If one entity mixes unrelated concepts or stores many unrelated record shapes, STOP and split the design.
Correct
```json
{
  "title": "review-schema-v1",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "userId": { "type": "string" },
    "rating": { "type": "number" },
    "approved": { "type": "boolean" }
  },
  "required": ["productId", "userId", "rating"],
  "v-indexed": ["productId", "userId", "approved"]
}
```
Wrong
```json
{
  "title": "everything-schema",
  "type": "object"
}
```
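On the application side, an explicit entity boundary can be mirrored by a record type and a runtime guard. The sketch below assumes the review schema shown in the correct example; `ReviewRecord` and `isReviewRecord` are hypothetical names, not part of any VTEX library.

```typescript
// Hypothetical record type mirroring the review schema's properties.
interface ReviewRecord {
  productId: string
  userId: string
  rating: number
  approved?: boolean
}

// Runtime guard: true only when the value has the required review fields
// with the expected types, and `approved` is either absent or a boolean.
function isReviewRecord(value: unknown): value is ReviewRecord {
  if (typeof value !== 'object' || value === null) {
    return false
  }
  const record = value as Record<string, unknown>
  return (
    typeof record['productId'] === 'string' &&
    typeof record['userId'] === 'string' &&
    typeof record['rating'] === 'number' &&
    (record['approved'] === undefined || typeof record['approved'] === 'boolean')
  )
}
```

A guard like this rejects unrelated record shapes before they reach the entity, which is one way to notice early that an entity is starting to mix concepts.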
Constraint: Indexed fields must match real query behavior
Fields used in filters or lookups MUST be indexed intentionally.
Why this matters
Missing indexes lead to poor query behavior and unnecessary operational risk.
Detection
If queries depend on fields that are not represented in indexing strategy, STOP and align schema and access patterns.
Correct
```json
{
  "v-indexed": ["productId", "approved"]
}
```
Wrong
```json
{
  "v-indexed": []
}
```
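The detection rule can be approximated in a review script by comparing the fields a query filters on with the schema's `v-indexed` list. This is a minimal sketch; `IndexedSchema` and `findUnindexedFields` are hypothetical names, and the list of queried fields is assumed to be collected by hand or by other tooling.

```typescript
// Only the part of the schema this check needs.
interface IndexedSchema {
  'v-indexed': string[]
}

// Returns the queried fields that the schema does not index.
function findUnindexedFields(schema: IndexedSchema, queriedFields: string[]): string[] {
  const indexed = schema['v-indexed']
  return queriedFields.filter((field) => !indexed.includes(field))
}
```

An empty result means every filtered field is indexed. A non-empty result is the signal to stop and align the schema with the access pattern.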
Constraint: Schema lifecycle must be managed explicitly
Master Data schema evolution MUST be planned with cleanup and versioning in mind.
Why this matters
Unmanaged schema growth creates long-term operational pain and can run into platform limits.
Detection
If schema versions are added with no lifecycle or cleanup plan, STOP and define that plan.
Correct
```text
review-schema-v1 -> review-schema-v2 with cleanup plan
```
Wrong
```text
review-schema-v1, v2, v3, v4, v5 with no cleanup strategy
```
Remember that changing indexed fields or field types can affect how existing documents are indexed and queried. When schema evolution is coupled to frequent app version changes, this risk increases.
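One way to make a cleanup plan concrete is to compute which schema versions fall outside a retention window. The sketch below assumes names follow the `<base>-v<number>` pattern used in the examples; `parseSchemaVersion`, `schemasToRetire`, and the `keepLatest` parameter are hypothetical, and how the list of existing schema names is obtained is left out.

```typescript
interface VersionedSchema {
  name: string
  version: number
}

// Parse names such as "review-schema-v2" into a base name and a numeric version.
function parseSchemaVersion(name: string): { base: string; version: number } | null {
  const match = /^(.+)-v(\d+)$/.exec(name)
  if (match === null) {
    return null
  }
  return { base: match[1], version: Number(match[2]) }
}

// Returns the schema names for `base` that are older than the newest `keepLatest` versions.
function schemasToRetire(names: string[], base: string, keepLatest: number): string[] {
  const matching: VersionedSchema[] = []
  for (const name of names) {
    const parsed = parseSchemaVersion(name)
    if (parsed !== null && parsed.base === base) {
      matching.push({ name, version: parsed.version })
    }
  }
  // Newest first, so everything after the first `keepLatest` entries is a cleanup candidate.
  matching.sort((a, b) => b.version - a.version)
  return matching.slice(keepLatest).map((schema) => schema.name)
}
```

The result is only a candidate list. Whether a version can actually be removed still depends on which documents and apps reference it.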
Constraint: Entity and schema names must remain stable across environments
Entity names and schema identifiers MUST remain stable across accounts, workspaces, and environments. Do not encode account names, workspaces, or rollout flags into the entity or schema name itself.
Why this matters
Per-account or per-workspace schema naming leads to schema sprawl, harder lifecycle management, and operational limits that are difficult to clean up later.
Detection
If the design proposes one entity or schema per workspace, per account, or per environment, STOP and redesign around stable names with scoped fields or records instead.
Correct
```text
review-schema-v1
RV
```
Wrong
```text
review-schema-brazil-master
RV_US_MASTER
```
Using one clearly managed schema for development and one for production can be acceptable when there is a deliberate plan to keep them synchronized. Avoid generating schema names per workspace, per account, or per feature flag.
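Scoping through record fields instead of names could look like the sketch below. The `environment` field, `buildScopedWrite`, and `buildScopedWhere` are hypothetical, and the `field=value AND field=value` filter format is an assumption about the search filter syntax that should be checked against the Master Data documentation. If `environment` is used for filtering, it would also need to be indexed.

```typescript
// The entity name stays the same in every account, workspace, and environment.
const REVIEW_ENTITY = 'RV'

interface ScopedReview {
  productId: string
  userId: string
  rating: number
  // Hypothetical scoping field: the environment lives in the record, not in the entity name.
  environment: string
}

// Builds the payload for a write against the stable entity.
function buildScopedWrite(review: Omit<ScopedReview, 'environment'>, environment: string) {
  const fields: ScopedReview = { ...review, environment }
  return { dataEntity: REVIEW_ENTITY, fields }
}

// Builds a filter that narrows a query to one environment (assumed filter syntax).
function buildScopedWhere(productId: string, environment: string): string {
  return `productId=${productId} AND environment=${environment}`
}
```

With this shape, adding a new environment means writing records with a new field value rather than creating another entity or schema.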
Preferred pattern
Use Master Data for structured custom records, index only what you query, and plan schema evolution deliberately.
Example: app owning a schema through the `masterdata` builder
- `masterdata/review-schema-v1.json` declares the schema and indexes for the `RV` entity.
- The app then uses a dedicated Master Data client to create and query `RV` documents.
```json
{
  "title": "review-schema-v1",
  "v-entity": "RV",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "userId": { "type": "string" },
    "rating": { "type": "number" },
    "approved": { "type": "boolean" }
  },
  "required": ["productId", "userId", "rating"],
  "v-indexed": ["productId", "userId", "approved"]
}
```
Example: app consuming an existing schema through the client only
- This app declares no `masterdata` builder files.
- It uses the Master Data client against an existing, stable `RV` entity managed elsewhere.
```typescript
// productId, userId, and rating come from the surrounding handler.
await ctx.clients.masterdata.createDocument({
  dataEntity: 'RV',
  fields: {
    productId,
    userId,
    rating,
    approved: false,
  },
})
```
Example: app owning a stable schema in code and ensuring it exists through the client
- The app keeps a stable schema definition in code.
- A controlled setup path ensures the schema exists instead of relying on the `masterdata` builder for every rollout.
```typescript
// Assumes the object-style inputs of the @vtex/api MasterData client.
const schema = {
  title: 'review-schema-v1',
  'v-entity': 'RV',
}
const existing = await ctx.clients.masterdata.getSchema({
  dataEntity: 'RV',
  schema: 'review-schema-v1',
})
if (!existing) {
  await ctx.clients.masterdata.createOrUpdateSchema({
    dataEntity: 'RV',
    schemaName: 'review-schema-v1',
    schemaBody: schema,
  })
}
```
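The "only when needed" part of a controlled setup path can be isolated as a pure decision, separate from any client call. The sketch below is one possible approach, not the platform's own mechanism: `planSchemaAction` and `deepEqual` are hypothetical helpers, and the comparison assumes the stored schema body comes back in the same shape it was submitted, which should be verified before relying on it.

```typescript
type SchemaAction = 'create' | 'update' | 'none'

// Structural comparison that ignores object key order (array order still matters).
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) {
    return true
  }
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) {
    return false
  }
  if (Array.isArray(a) !== Array.isArray(b)) {
    return false
  }
  const left = a as Record<string, unknown>
  const right = b as Record<string, unknown>
  const leftKeys = Object.keys(left)
  if (leftKeys.length !== Object.keys(right).length) {
    return false
  }
  return leftKeys.every(
    (key) => Object.prototype.hasOwnProperty.call(right, key) && deepEqual(left[key], right[key])
  )
}

// Decides what the setup path should do, given the stored schema (if any) and the desired one.
function planSchemaAction(existing: object | undefined, desired: object): SchemaAction {
  if (existing === undefined) {
    return 'create'
  }
  return deepEqual(existing, desired) ? 'none' : 'update'
}
```

A setup routine would fetch the stored schema, call `planSchemaAction`, and write only when the result is `'create'` or `'update'`, so repeated runs with an unchanged definition cause no schema writes or reindexing.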
Common failure modes
- Creating entities that are too broad.
- Querying on fields that are not indexed.
- Accumulating schema versions with no lifecycle plan.
- Using Master Data as a high-volume log or event sink without retention or archival strategy.
- Storing configuration, secrets, or cross-app shared settings in Master Data instead of using configuration-specific mechanisms.
- Generating per-account or per-workspace entities such as `RV_storeA_master` instead of using a stable entity like `RV` with scoped record fields.
- Relying on the `masterdata` builder for frequent schema changes tied to every app version, causing excessive schema updates and indexing side effects over time.
Review checklist
- Is Master Data the right storage mechanism for this use case?
- Should this app own the schema through the `masterdata` builder, or just consume an existing stable schema through the client?
- Would a stable schema in code plus a controlled setup path be safer than coupling schema rollout to every app version?
- Does each entity represent a clear business concept?
- Are entity and schema names stable across workspaces and accounts?
- Are filtered fields indexed intentionally?
- Is there a schema lifecycle plan?
- If different schemas are used for development and production, is there a clear plan to keep them synchronized without creating schema sprawl?
Related skills
- `vtex-io-data-access-patterns` - Use when deciding between Master Data, VBase, VTEX core APIs, or external stores for a given dataset
Reference
- Master Data - Platform data storage context