crm-data-quality
Read `bulk-operations/SKILL.md` first: JSONL piping, batch read, pagination, and dry-run/digest/confirm gating apply to every command below.

## Property discovery
Don't guess property names. List them:
```bash
hubspot properties list --type contacts --format table
hubspot properties list --type contacts | jq -c 'select(.type=="enumeration") | {name, label}'
```

Same for `--type companies`, `deals`, or any custom type (`hubspot objects types`).
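One way to enforce this in a script is to check a name against the listing before it goes into a filter. The sketch below assumes `properties list` emits one JSON object per line with a `name` field, as the `jq` filter above implies; `prop_exists` is a hypothetical helper, not a CLI command.

```bash
# prop_exists NAME
# Reads property JSONL on stdin and exits 0 only if a property called NAME is listed.
# Hypothetical helper; assumes every input line is an object with a "name" key.
prop_exists() {
  jq -e -s --arg n "$1" 'any(.[]; .name == $n)' >/dev/null
}
```

For example, `hubspot properties list --type contacts | prop_exists mobilephone || echo "unknown property" >&2` would warn before a search is built on a misspelled name.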
## 1. Find incomplete records
`!name` matches records where the property `name` has no value. Combine several conditions in one `--filter` with `AND`.
```bash
hubspot objects search --type contacts --filter "!email" --properties firstname,lastname,company
hubspot objects search --type contacts --filter "!phone AND !mobilephone" --properties email
hubspot objects search --type contacts --filter "!hubspot_owner_id" --properties email,lifecyclestage
```

For >100 results, use the pagination loop from `bulk-operations`.
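Once a full result set is saved to disk, the same emptiness check can be tallied per property locally. This is a sketch only: it assumes each record looks like `{id, properties:{...}}` (the shape the later `jq` filters rely on) and that a missing value shows up as `null`, an absent key, or an empty string. `missing_counts` is a hypothetical helper, not a CLI command.

```bash
# missing_counts FILE PROP...
# Prints one {property, missing} line per PROP for the record JSONL in FILE.
# Hypothetical helper; treats null, absent, and "" as missing.
missing_counts() {
  local file=$1; shift
  local p
  for p in "$@"; do
    jq -c -s --arg p "$p" \
      '{property: $p, missing: (map(select((.properties[$p] // "") == "")) | length)}' \
      "$file"
  done
}
```

Running `missing_counts /tmp/contacts.jsonl email phone` against a paginated export gives a quick completeness report without issuing one search per property.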
## 2. Normalize field values

Search → reshape with `jq` → pipe into `update`. Always `--dry-run` first; `bulk-operations` covers digest/confirm escalation for >100 rows. Reshape patterns: `bulk-operations/resources/json-patterns.md`.

```bash
# Collapse spellings into one canonical value
hubspot objects search --type contacts --filter "company~acme" \
  | jq -c '{id, properties:{company:"Acme Corporation"}}' \
  | hubspot objects update --type contacts --dry-run

# Lowercase emails (read, reshape, write)
hubspot objects search --type contacts --filter "email" --properties email \
  | jq -c '{id, properties:{email: (.properties.email | ascii_downcase)}}' \
  | hubspot objects update --type contacts --dry-run
```
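The lowercase pipeline rewrites every matching contact, including those whose email is already lowercase. A possible refinement is a reshape step that only emits rows that would actually change. The sketch assumes records shaped like `{id, properties:{email}}`; `lowercase_email_patches` is a hypothetical helper name.

```bash
# lowercase_email_patches
# Reads record JSONL on stdin and emits an update row only when lowercasing
# changes the email. Rows with a null or non-string email are skipped.
lowercase_email_patches() {
  jq -c '
    select(.properties.email | type == "string")
    | (.properties.email | ascii_downcase) as $lower
    | select($lower != .properties.email)
    | {id, properties: {email: $lower}}
  '
}
```

It slots in where the `jq -c` stage sits in the pipeline, so the dry-run preview then lists only real changes.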
## 3. Dedupe with `hubspot objects merge`

Secondary is folded into primary and deleted. Irreversible. Dry-run/digest/confirm gating applies.

```bash
# Single pair
hubspot objects merge --type contacts --primary 149 --secondary 425 --dry-run
hubspot objects merge --type contacts --primary 149 --secondary 425   # execute (≤100 pairs)
```

Bulk: pipe JSONL `{"primary":"...","secondary":"..."}` on stdin (omit `--primary`/`--secondary`).
**Pagination required.** `objects search` caps at 100 rows per call, and `jq -s` slurps only a single stream into memory, so running the snippet below against a raw `search` will silently miss every duplicate that crosses a page boundary. Collect the full set first with the pagination loop from `bulk-operations/SKILL.md` (write to `/tmp/contacts.jsonl`), then dedupe from the file:
```bash
# /tmp/contacts.jsonl produced by the pagination loop (bulk-operations/SKILL.md)
jq -s -c '
  map(select((.properties.email // "") != ""))   # skip contacts with no email so they are not grouped together
  | group_by(.properties.email)[]
  | select(length > 1)
  | sort_by(.id | tonumber)
  | .[0].id as $p | .[1:][] | {primary: $p, secondary: .id}
' /tmp/contacts.jsonl \
  | hubspot objects merge --type contacts --dry-run | tee /tmp/merge-preview.jsonl
```

For >100 pairs, lift `digest` and `impact.records_affected` from the `BulkData` line and re-pipe the same producer with `--digest`/`--confirm` (see `bulk-operations`).
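Because a merge cannot be undone, it may be worth summarizing the pair list before it reaches `merge`. The sketch below counts the pairs, flags whether the count exceeds the 100-pair threshold mentioned above, and lists any id that appears as both a primary and a secondary (which would make the outcome depend on merge order). `pair_summary` is a hypothetical helper; the only input it assumes is the `{"primary","secondary"}` JSONL described above.

```bash
# pair_summary
# Reads {"primary":..., "secondary":...} JSONL on stdin and prints one summary object:
#   pairs          number of pairs
#   needs_confirm  true when there are more than 100 pairs
#   conflicts      ids used as a primary in one pair and a secondary in another
pair_summary() {
  jq -c -s '
    (map(.primary) | unique) as $prim
    | {pairs: length,
       needs_confirm: (length > 100),
       conflicts: (map(.secondary) | unique
                   | map(select(. as $s | any($prim[]; . == $s))))}
  '
}
```

Saving the producer output to a file and running `pair_summary < pairs.jsonl` before the dry-run gives a cheap check that `conflicts` is empty.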
## 4. Audit properties
Property records from `hubspot properties list`, `get`, and `batch-read` carry `{name, label, type, fieldType, groupName}`. To see the values an enumeration actually holds on records, use `hubspot objects search ... --properties <enum>`.
```bash
# Count properties per group (HubSpot groups standard fields; custom groups stand out)
hubspot properties list --type contacts | jq -rs 'group_by(.groupName) | map({group: .[0].groupName, count: length}) | .[]'

# All enumeration properties
hubspot properties list --type contacts | jq -c 'select(.type=="enumeration") | {name, label, fieldType}'

# Create a DQ flag property, then set it via the normalize pattern in section 2
hubspot properties create --type contacts --name dq_missing_phone --label "DQ: Missing Phone" --prop-type string --field-type text
```
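To audit how a property is actually populated, the values returned by a search can be tallied locally. This is a sketch under the assumption that records are JSONL shaped like `{id, properties:{...}}`; `value_counts` is a hypothetical helper, and `lifecyclestage` is used only because it appears in section 1.

```bash
# value_counts PROP
# Reads record JSONL on stdin and prints {value, count} lines for PROP,
# most frequent first. Null or absent values are reported as "(empty)".
value_counts() {
  jq -c -s --arg p "$1" '
    map(.properties[$p] // "(empty)")
    | group_by(.)
    | map({value: .[0], count: length})
    | sort_by(.count) | reverse | .[]
  '
}
```

Piping a paginated export through `value_counts lifecyclestage` surfaces rare or misspelled values that are candidates for the normalize pattern in section 2.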
## Recovery

Merge is irreversible. After any merge, `hubspot history --since 1h` captures the audit trail. If the merge went in the wrong direction, restore the secondary from the UI's recycle bin.