molt-verify

molt verify

Compares source and target databases for schema (DDL) and row (data) consistency. Run after molt fetch to confirm migration integrity.

Basic Structure


bash
molt verify \
  --source "<source-conn>" \
  --target "<crdb-conn>" \
  [options]

Verification Phases


Phase 1 — Schema: Compares table presence, columns, types, NOT NULL constraints, and primary key structure.
Phase 2 — Rows (default, --rows=true): Iterates source rows in PK order and compares against target. Reports missing, extraneous, and mismatched rows per shard.

Modes


Mode             Command                                      Use When
Full (default)   molt verify --source "..." --target "..."   Post-migration integrity check
Schema-only      molt verify ... --rows=false                 Fast DDL check; no data I/O
Compile-only     molt verify ... --compile-only               Validate flag syntax without connecting

Concurrency & Sharding


Default: CPU-count tables in parallel, 1 shard/table, 20k rows/batch

bash
molt verify --source "..." --target "..."

Large tables: parallelize within a single table

bash
molt verify --source "..." --target "..." \
  --concurrency 1 --concurrency-per-table 8 --row-batch-size 50000

Rate-limited (minimize production impact)

bash
molt verify --source "..." --target "..." \
  --rows-per-second 1000 --concurrency 2

Sharding splits a table's PK range across workers. Supported PK types: INT, FLOAT, DECIMAL, UUID. Falls back to a single full-scan for unsupported types.
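For an integer PK, shard boundaries amount to cutting the key range into contiguous sub-ranges, one per worker. A minimal sketch of that idea (illustrative only, not molt's actual algorithm):

```python
# Split an integer PK range [lo, hi] into `shards` contiguous sub-ranges,
# one per worker. Illustrative sketch, not molt's actual implementation.
def shard_pk_range(lo: int, hi: int, shards: int) -> list[tuple[int, int]]:
    if shards < 1 or hi < lo:
        raise ValueError("need shards >= 1 and hi >= lo")
    width, rem = divmod(hi - lo + 1, shards)
    ranges, start = [], lo
    for i in range(shards):
        # Spread any remainder over the first `rem` shards.
        end = start + width + (1 if i < rem else 0) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

print(shard_pk_range(1, 100, 4))
# [(1, 25), (26, 50), (51, 75), (76, 100)]
```

Each (start, end) sub-range can then be scanned and compared independently, which is why the report prints counts per shard.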

Common Workflows


1. Post-migration sanity check


bash
molt verify \
  --source "postgresql://user:pass@pg:5432/db" \
  --target "postgresql://root@crdb:26257/db"

2. Schema-only (CI gate)


bash
molt verify \
  --source "..." --target "..." \
  --rows=false --non-interactive --log-file stdout

3. Filtered verification (subset of tables)


bash
molt verify \
  --source "..." --target "..." \
  --table-filter "customers|orders"

4. Verify with column exclusions


transformations.json: {"tables":[{"name":"users","excludedColumns":["temp_col"]}]}

bash
molt verify \
  --source "..." --target "..." \
  --transformations-file transformations.json

5. Validate flags without connecting


bash
molt verify --source "..." --target "..." --compile-only

Returns: {"status":"ok","message":"arguments parsed successfully"}

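In a CI script, this one-line JSON result is easy to gate on. A minimal sketch of parsing it (the payload shape is taken from the output above; capturing molt's stdout is left to the caller):

```python
import json

# One-line JSON emitted by `molt verify ... --compile-only` (shape shown above);
# in a real pipeline you would capture this from the command's stdout.
raw = '{"status":"ok","message":"arguments parsed successfully"}'

result = json.loads(raw)
if result["status"] != "ok":
    raise SystemExit(f"flag validation failed: {result.get('message')}")
print("flags OK:", result["message"])
```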

Source-Specific Prerequisites


PostgreSQL: No special requirements. Partition tables (child partitions) are not supported — remove them before verifying.
MySQL: Queries the current database only. ONLY_FULL_GROUP_BY may affect queries; disable it if issues arise.
Oracle: Binary must be built with CGO_ENABLED=1 -tags="cgo source_all", with the Oracle Instant Client in LD_LIBRARY_PATH. Use --source-cdb for multi-tenant (CDB) setups. Selective data verification (--filter-path) is not supported.

Output & Reporting

输出与报告

Each table prints a summary per shard:
truth rows seen: 10000, success: 9950, missing: 5, mismatch: 45, extraneous: 0
  • missing: rows present on source but absent on target
  • extraneous: rows on target with no match on source
  • mismatch: rows present on both but values differ
Schema issues (missing/extra tables or columns, type mismatches, PK differences) are logged as warnings and do not stop row verification.
Prometheus metrics available at --metrics-listen-addr (default localhost:8888).
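The three row categories amount to a keyed diff between the two sides. A minimal sketch of the classification (illustrative only; the real tool streams both sides in PK order rather than materializing rows in memory):

```python
# Classify rows the way the verify report counts them, given rows keyed by PK.
# Illustrative sketch: molt streams both sides in PK order, not via dicts.
def classify(source: dict, target: dict) -> dict:
    missing = [pk for pk in source if pk not in target]        # on source only
    extraneous = [pk for pk in target if pk not in source]     # on target only
    mismatch = [pk for pk in source
                if pk in target and source[pk] != target[pk]]  # values differ
    success = len(source) - len(missing) - len(mismatch)
    return {"truth rows seen": len(source), "success": success,
            "missing": len(missing), "mismatch": len(mismatch),
            "extraneous": len(extraneous)}

source = {1: "a", 2: "b", 3: "c"}
target = {1: "a", 2: "B", 4: "d"}
print(classify(source, target))
# {'truth rows seen': 3, 'success': 1, 'missing': 1, 'mismatch': 1, 'extraneous': 1}
```

With the sample data, the counts line up with the report fields: one missing, one mismatched, and one extraneous row.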

Error Recovery


Error                          Cause                                       Fix
missing table X on target      Table not migrated                          Rerun fetch or check filters
extraneous table X on target   Unexpected table                            Clean up or adjust --table-filter
column type mismatch           Type conversion issue                       Check type mappings or use --transformations-file
PRIMARY KEY does not match     PK structure differs                        Inspect schema conversion output
partition table X              Source has partition tables                 Drop/move partitions before verifying
missing a PRIMARY KEY          No PK on source table                       Add PK or use --rows=false
TLSModeDisableError            Insecure connection rejected                Add --allow-tls-mode-disable
Statement timeout              Query exceeds --verify-statement-timeout   Increase timeout or reduce --row-batch-size

Gotchas


  • Schema changes between source and target after migration are not automatically reconciled — fix schema first, then re-run
  • --concurrency values exceeding 4× CPU count trigger a warning and may degrade performance
  • Row verification requires primary keys on both source and target tables; tables without PKs are skipped for row comparison
  • --filter-path (selective row filters) is not supported for Oracle sources
  • Log files contain sensitive query data; avoid --show-connection-logging in production logs
  • After fetch, always run verify before cutover to confirm data integrity
See flags reference for the full flag list.