vtex-io-masterdata-strategy
Master Data Strategy
When this skill applies
Use this skill when deciding whether Master Data v2 is the right mechanism for custom data in a VTEX IO app.
- Modeling reviews, wishlists, forms, or custom app records
- Choosing entity boundaries
- Planning schema indexing and lifecycle
- Reviewing long-term Master Data design
Do not use this skill for:
- low-level client usage details
- runtime or route structure
- app settings schemas
- frontend UI behavior
Decision rules
- Use this skill once Master Data is a serious candidate storage mechanism. For the broader choice between Master Data, VBase, VTEX core APIs, and external stores, use `vtex-io-data-access-patterns`.
- Use Master Data for structured custom data that needs validation, indexing, and query support.
- Use the `masterdata` builder when this app introduces a new business entity, owns the data model, and wants the schema to be created and versioned as part of the app contract.
- Prefer using only the Master Data client when the entity and schema already exist and are shared or centrally managed, and this app only needs to read or write records without redefining the schema itself.
- For stable schemas that the app owns but should not be recreated or updated on every app version, keep the schema definition in code and use the Master Data client in a controlled setup path to create or update the schema only when needed.
- Remember that Master Data entities are account-scoped. Changing a shared entity or schema affects every app in that account that depends on it, so prefer client-only consumption when the schema is centrally managed.
- Keep entity boundaries intentional and aligned with the business concept being stored.
- Index fields that are actually used for filtering and search.
- Plan schema lifecycle explicitly to avoid schema sprawl.
- Consider data volume and retention from the start. If the dataset will grow unbounded and there is no retention or archival strategy, Master Data is likely not the right storage mechanism.
- Do not treat Master Data as an unbounded dumping ground for arbitrary payloads.
- Do not use Master Data as an unbounded log or event store for high-volume append-only data. Prefer dedicated logging or storage mechanisms when the main need is raw history rather than structured queries.
- Do not store secrets, credentials, or global app configuration in Master Data. Use app settings or configuration apps instead.
- Do not generate one entity or schema per account, workspace, or feature flag. Keep a stable entity name and distinguish tenants or environments through record fields when necessary.
- Be careful when tying schema evolution directly to app versioning through the `masterdata` builder. Frequent schema changes coupled to app releases can generate excessive schema updates, indexing changes, and long-term schema sprawl.
Choosing between the `masterdata` builder and the Master Data client
There are three main ways for a VTEX IO app to work with Master Data:
- Owning the schema via the `masterdata` builder:
  - The app declares entities and schemas under `masterdata/` in the repository.
  - Schema fields, validation, and indexing evolve together with the app code.
  - Use this when the app is the primary owner of the data model, schema changes are relatively infrequent, and the schema should be rolled out as part of the app contract.
- Consuming an existing schema via the Master Data client only:
  - The app uses a Master Data client, but does not declare entities or schemas through the `masterdata` builder.
  - The app assumes a stable schema managed elsewhere and only reads or writes records that follow that contract.
  - Use this when the entity is shared across multiple apps or managed centrally, and this app should not redefine or fragment the schema across environments.
- Owning a stable schema definition in code and applying it through the client:
  - The app keeps a stable schema definition in code instead of `masterdata/` builder files.
  - A controlled setup path checks whether the schema exists and creates or updates it only when needed.
  - Use this when the app truly owns the schema, but should not couple schema rollout to every app version or every release pipeline step.
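The choice between these three approaches can be sketched as a small decision function. This is only an illustration: `SchemaContext`, its field names, and `chooseSchemaStrategy` are hypothetical names introduced here, and a real review would weigh more factors than three booleans.

```typescript
// Hypothetical description of how an app relates to a Master Data schema.
interface SchemaContext {
  appOwnsSchema: boolean // this app defines and owns the data model
  schemaSharedAcrossApps: boolean // other apps depend on the same entity
  rolloutTiedToAppVersion: boolean // schema should ship as part of the app contract
}

type SchemaStrategy = 'builder' | 'client-only' | 'schema-in-code'

function chooseSchemaStrategy(ctx: SchemaContext): SchemaStrategy {
  // Shared or centrally managed schemas are consumed, never redefined.
  if (!ctx.appOwnsSchema || ctx.schemaSharedAcrossApps) {
    return 'client-only'
  }
  // Owned schema that should be rolled out together with the app.
  if (ctx.rolloutTiedToAppVersion) {
    return 'builder'
  }
  // Owned, stable schema applied through a controlled setup path.
  return 'schema-in-code'
}
```

Checking the shared case first reflects the account-scoped nature of entities: when other apps depend on a schema, ownership questions come second.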
Hard constraints
Constraint: Master Data entities must have explicit schema boundaries
Each entity MUST represent a clear business concept and have a schema that matches its intended usage.
Why this matters
Weak entity boundaries create confusing queries, poor indexing choices, and schema drift.
Detection
If one entity mixes unrelated concepts or stores many unrelated record shapes, STOP and split the design.
Correct
json
{
  "title": "review-schema-v1",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "userId": { "type": "string" },
    "rating": { "type": "number" },
    "approved": { "type": "boolean" }
  },
  "required": ["productId", "userId", "rating"],
  "v-indexed": ["productId", "userId", "approved"]
}
Wrong
json
{
  "title": "everything-schema",
  "type": "object"
}
Constraint: Indexed fields must match real query behavior
Fields used in filters or lookups MUST be indexed intentionally.
Why this matters
Missing indexes lead to poor query behavior and unnecessary operational risk.
Detection
If queries depend on fields that are not covered by the indexing strategy, STOP and align the schema with the access patterns.
Correct
json
{
  "v-indexed": ["productId", "approved"]
}
Wrong
json
{
  "v-indexed": []
}
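The detection rule above can be approximated with a small check that compares the fields a query filters on against the schema's `v-indexed` list. This is a minimal sketch: `IndexedSchema` and `findUnindexedFields` are hypothetical helpers, not part of any VTEX library.

```typescript
// Only the part of a schema that matters for this check.
interface IndexedSchema {
  'v-indexed'?: string[]
}

// Returns the queried fields that the schema does not index.
function findUnindexedFields(
  schema: IndexedSchema,
  queriedFields: string[]
): string[] {
  const indexed = new Set(schema['v-indexed'] ?? [])
  return queriedFields.filter((field) => !indexed.has(field))
}

const reviewSchema: IndexedSchema = { 'v-indexed': ['productId', 'approved'] }

// A query filtering on productId and userId would rely on an unindexed field.
const missing = findUnindexedFields(reviewSchema, ['productId', 'userId'])
// missing → ['userId']
```

A check like this could run in a unit test next to the code that builds queries, so that a new filter field cannot be introduced without a matching schema change.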
Constraint: Schema lifecycle must be managed explicitly
Master Data schema evolution MUST be planned with cleanup and versioning in mind.
Why this matters
Unmanaged schema growth creates long-term operational pain and can run into platform limits.
Detection
If schema versions are added with no lifecycle or cleanup plan, STOP and define that plan.
Correct
text
review-schema-v1 -> review-schema-v2 with cleanup plan
Wrong
text
review-schema-v1, v2, v3, v4, v5 with no cleanup strategy
Remember that changing indexed fields or field types can affect how existing documents are indexed and queried. When schema evolution is coupled to frequent app version changes, this risk increases.
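One way to make a cleanup plan concrete is to compute which schema versions fall outside a retention window. The sketch below assumes schema names end in a `-vN` suffix, as in `review-schema-v1`; `parseVersion` and `schemasToRetire` are hypothetical helpers, and actually removing a schema would still require a deliberate migration step.

```typescript
// Extracts the numeric version from names like "review-schema-v3".
// Returns null when the name has no "-vN" suffix.
function parseVersion(name: string): number | null {
  const match = /-v(\d+)$/.exec(name)
  return match ? Number(match[1]) : null
}

// Returns the versioned schema names that are older than the newest `keep` versions.
// Names without a version suffix are ignored and never proposed for retirement.
function schemasToRetire(names: string[], keep: number): string[] {
  const versioned = names
    .map((name) => ({ name, version: parseVersion(name) }))
    .filter(
      (s): s is { name: string; version: number } => s.version !== null
    )
    .sort((a, b) => b.version - a.version)
  return versioned.slice(Math.max(0, keep)).map((s) => s.name)
}

const retire = schemasToRetire(
  ['review-schema-v1', 'review-schema-v2', 'review-schema-v3'],
  2
)
// retire → ['review-schema-v1']
```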
Constraint: Entity and schema names must remain stable across environments
Entity names and schema identifiers MUST remain stable across accounts, workspaces, and environments. Do not encode account names, workspaces, or rollout flags into the entity or schema name itself.
Why this matters
Per-account or per-workspace schema naming leads to schema sprawl, harder lifecycle management, and operational limits that are difficult to clean up later.
Detection
If the design proposes one entity or schema per workspace, per account, or per environment, STOP and redesign around stable names with scoped fields or records instead.
Correct
text
review-schema-v1
RV
Wrong
text
review-schema-brazil-master
RV_US_MASTER
Using one clearly managed schema for development and one for production can be acceptable when there is a deliberate plan to keep them synchronized. Avoid generating schema names per workspace, per account, or per feature flag.
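Keeping the name stable and moving the scope into the record can look like the sketch below. The `account` and `workspace` field names are assumptions chosen for illustration, and `embedsScope` is a deliberately naive review heuristic, not a platform rule. If scoped fields are used as filters, they would also need to be indexed.

```typescript
interface Review {
  productId: string
  userId: string
  rating: number
}

// Hypothetical scope fields stored on each record instead of in the entity name.
interface Scope {
  account: string
  workspace: string
}

type ScopedReview = Review & Scope

// Builds the record that would be written to the single, stable entity.
function toScopedReview(review: Review, scope: Scope): ScopedReview {
  return { ...review, account: scope.account, workspace: scope.workspace }
}

// Naive heuristic: flags entity or schema names that embed the account or workspace.
function embedsScope(name: string, scope: Scope): boolean {
  const lower = name.toLowerCase()
  return [scope.account, scope.workspace].some(
    (part) => part !== '' && lower.includes(part.toLowerCase())
  )
}

const scope: Scope = { account: 'us', workspace: 'master' }
const record = toScopedReview({ productId: '1', userId: 'u1', rating: 5 }, scope)
// embedsScope('RV_US_MASTER', scope) → true
// embedsScope('RV', scope) → false
```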
Preferred pattern
Use Master Data for structured custom records, index only what you query, and plan schema evolution deliberately.
Example: app owning a schema through the `masterdata` builder
- `masterdata/review-schema-v1.json` declares the schema and indexes for the `RV` entity.
- The app then uses a dedicated Master Data client to create and query `RV` documents.
json
{
  "title": "review-schema-v1",
  "v-entity": "RV",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "userId": { "type": "string" },
    "rating": { "type": "number" },
    "approved": { "type": "boolean" }
  },
  "required": ["productId", "userId", "rating"],
  "v-indexed": ["productId", "userId", "approved"]
}
Example: app consuming an existing schema through the client only
- This app declares no `masterdata` builder files.
- It uses the Master Data client against an existing, stable `RV` entity managed elsewhere.
typescript
await ctx.clients.masterdata.createDocument({
  dataEntity: 'RV',
  fields: {
    productId,
    userId,
    rating,
    approved: false,
  },
})
Example: app owning a stable schema in code and ensuring it exists through the client
- The app keeps a stable schema definition in code.
- A controlled setup path ensures the schema exists instead of relying on the `masterdata` builder for every rollout.
typescript
const schema = {
  title: 'review-schema-v1',
  'v-entity': 'RV',
}
// Assumes the @vtex/api MasterData client, whose schema methods take one options object.
const existing = await ctx.clients.masterdata.getSchema({
  dataEntity: 'RV',
  schema: 'review-schema-v1',
})
if (!existing) {
  await ctx.clients.masterdata.createOrUpdateSchema({
    dataEntity: 'RV',
    schemaName: 'review-schema-v1',
    schemaBody: schema,
  })
}
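The controlled setup path can be wrapped in a reusable helper so that it is safe to call more than once. The sketch below is written against a minimal `SchemaClient` interface declared here for illustration; its method names and argument shapes are assumed to mirror the Master Data client in `@vtex/api`, and the 404 handling is an assumption about how a missing schema may surface. Verify both against the client version the app actually uses.

```typescript
// Minimal structural stand-in for the schema methods used below.
// This is an assumption for illustration, not the real client type.
interface SchemaClient {
  getSchema(args: { dataEntity: string; schema: string }): Promise<unknown>
  createOrUpdateSchema(args: {
    dataEntity: string
    schemaName: string
    schemaBody: object
  }): Promise<unknown>
}

// Creates the schema only when it does not exist yet.
async function ensureSchema(
  client: SchemaClient,
  dataEntity: string,
  schemaName: string,
  schemaBody: object
): Promise<'created' | 'unchanged'> {
  let existing: unknown = null
  try {
    existing = await client.getSchema({ dataEntity, schema: schemaName })
  } catch (err) {
    // Assumption: a missing schema may be reported as an HTTP 404 error.
    const status = (err as { response?: { status?: number } })?.response?.status
    if (status !== 404) {
      throw err
    }
  }
  if (existing) {
    return 'unchanged'
  }
  await client.createOrUpdateSchema({ dataEntity, schemaName, schemaBody })
  return 'created'
}
```

Calling `ensureSchema` from an explicit setup route or installation hook, rather than from every request handler, keeps schema writes rare and easy to audit.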
Common failure modes
- Creating entities that are too broad.
- Querying on fields that are not indexed.
- Accumulating schema versions with no lifecycle plan.
- Using Master Data as a high-volume log or event sink without a retention or archival strategy.
- Storing configuration, secrets, or cross-app shared settings in Master Data instead of using configuration-specific mechanisms.
- Generating per-account or per-workspace entities such as `RV_storeA_master` instead of using a stable entity like `RV` with scoped record fields.
- Relying on the `masterdata` builder for frequent schema changes tied to every app version, causing excessive schema updates and indexing side effects over time.
Review checklist
- Is Master Data the right storage mechanism for this use case?
- Should this app own the schema through the `masterdata` builder, or just consume an existing stable schema through the client?
- Would a stable schema in code plus a controlled setup path be safer than coupling schema rollout to every app version?
- Does each entity represent a clear business concept?
- Are entity and schema names stable across workspaces and accounts?
- Are filtered fields indexed intentionally?
- Is there a schema lifecycle plan?
- If different schemas are used for development and production, is there a clear plan to keep them synchronized without creating schema sprawl?
Related skills
- `vtex-io-data-access-patterns` - Use when deciding between Master Data, VBase, VTEX core APIs, or external stores for a given dataset
Reference
- Master Data - Platform data storage context