cobol-migration-analyzer

COBOL Migration Analyzer

Analyze legacy COBOL programs and JCL scripts for migration to Java. Extract business logic, data structures, and dependencies to generate actionable migration strategies.

Core Capabilities

1. COBOL Program Analysis

Extract COBOL divisions (IDENTIFICATION, ENVIRONMENT, DATA, PROCEDURE), Working-Storage variables, file definitions (FD), business logic paragraphs, PERFORM statements, CALL hierarchies, embedded SQL, and error handling patterns.

2. JCL Job Analysis

Parse JCL job steps, program invocations, data dependencies (DD statements), conditional logic (COND, IF/THEN/ELSE), return codes, and resource requirements.
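The `COND` parameter is easy to misread because a true test means the step is skipped, not run. The sketch below models that rule for the form `COND=(code,operator)` with no step name, where the test is applied to the return code of every preceding step. The class and method names are hypothetical and are not part of the skill's scripts; Java 17 is assumed.

```java
import java.util.List;

/** Hypothetical model of the JCL COND parameter; not part of the skill's scripts. */
final class JclCond {
    enum Op { GT, GE, EQ, NE, LT, LE }

    /** Evaluates "code op returnCode", the comparison JCL makes for COND=(code,op). */
    static boolean test(int code, Op op, int returnCode) {
        return switch (op) {
            case GT -> code > returnCode;
            case GE -> code >= returnCode;
            case EQ -> code == returnCode;
            case NE -> code != returnCode;
            case LT -> code < returnCode;
            case LE -> code <= returnCode;
        };
    }

    /** A step is bypassed when the test is true for any preceding step's return code. */
    static boolean isBypassed(int code, Op op, List<Integer> previousReturnCodes) {
        return previousReturnCodes.stream().anyMatch(rc -> test(code, op, rc));
    }
}
```

For example, `COND=(4,LT)` bypasses the step when 4 is less than some earlier return code, so `JclCond.isBypassed(4, JclCond.Op.LT, List.of(0, 8))` is true while the same test against `List.of(0, 4)` is false.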
解析JCL作业步骤、程序调用、数据依赖(DD语句)、条件逻辑(COND、IF/THEN/ELSE)、返回码以及资源需求。

3. Copybook Processing

Extract record layouts with level numbers, REDEFINES clauses, group items, OCCURS clauses, and picture clauses. Generate Java POJOs from copybook structures.
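As a rough illustration of what extracting a record layout involves, the sketch below parses a single elementary copybook line into its level number, data name, picture string, and any trailing usage text. It is not the logic of the skill's scripts: `CopybookLine` and `Field` are hypothetical names, and the regex assumes one clause per line with sequence columns already stripped. Java 17 is assumed for the record syntax.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses one elementary copybook line such as "05  EMP-ID  PIC 9(6)." */
final class CopybookLine {
    // Groups: 1 = level number, 2 = data name, 3 = picture string, 4 = optional trailing clauses
    private static final Pattern FIELD = Pattern.compile(
        "^\\s*(\\d{2})\\s+([A-Za-z0-9-]+)\\s+PIC(?:TURE)?\\s+(?:IS\\s+)?(\\S+?)(?:\\s+(.*?))?\\.\\s*$",
        Pattern.CASE_INSENSITIVE);

    record Field(int level, String name, String picture, String usage) {}

    /** Returns the parsed field, or empty for group items and lines without a PIC clause. */
    static Optional<Field> parse(String line) {
        Matcher m = FIELD.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        String usage = m.group(4) == null ? "" : m.group(4).trim();
        return Optional.of(new Field(Integer.parseInt(m.group(1)), m.group(2), m.group(3), usage));
    }
}
```

A real copybook parser also has to handle continuation lines, `REDEFINES`, `OCCURS`, and nested groups, which this sketch deliberately leaves out.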

4. Dependency Mapping

Build complete dependency graphs showing CALL hierarchies, copybook usage, file dependencies, database table access, and shared utility references across the codebase.
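One building block of such a graph is the list of programs each source file calls. The following sketch collects static `CALL 'literal'` targets from fixed-format source lines. It is only an assumption about how this could be done, with the hypothetical class name `CallScanner`; dynamic calls through a data name (for example `CALL WS-PROG-NAME`) cannot be resolved this way and need separate analysis.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical scanner for static CALL targets; not part of the skill's scripts. */
final class CallScanner {
    // Matches CALL 'PROG' or CALL "PROG"; group 1 is the program name literal
    private static final Pattern STATIC_CALL =
        Pattern.compile("\\bCALL\\s+['\"]([A-Za-z0-9$#@-]+)['\"]", Pattern.CASE_INSENSITIVE);

    static Set<String> staticCallTargets(List<String> sourceLines) {
        Set<String> targets = new TreeSet<>();
        for (String line : sourceLines) {
            // Fixed-format COBOL: '*' or '/' in column 7 marks a comment line
            if (line.length() > 6 && (line.charAt(6) == '*' || line.charAt(6) == '/')) {
                continue;
            }
            Matcher m = STATIC_CALL.matcher(line);
            while (m.find()) {
                targets.add(m.group(1).toUpperCase(Locale.ROOT));
            }
        }
        return targets;
    }
}
```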

Workflow

Step 1: Discover COBOL Assets

Find COBOL programs, JCL jobs, and copybooks:

```bash
find . -name "*.cbl" -o -name "*.CBL" -o -name "*.cob"
find . -name "*.jcl" -o -name "*.JCL"
find . -name "*.cpy" -o -name "*.CPY"
```

Use `scripts/analyze-dependencies.sh` or `scripts/analyze-dependencies.ps1` to generate the dependency graph.

Step 2: Extract Structure

Use `scripts/extract-structure.py` to parse COBOL programs and extract divisions, variables, paragraphs, and dependencies in JSON format.

Step 3: Generate Java Code

Use `scripts/generate-java-classes.py` to convert copybooks to Java POJOs with appropriate data types and Bean Validation annotations.

Step 4: Estimate Complexity

Use `scripts/estimate-complexity.py` to calculate migration complexity based on LOC, external calls, file operations, SQL statements, and control flow.
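The source does not state how the script combines these inputs, so the following is only an illustrative sketch of a weighted score. Every weight and threshold here is a made-up assumption, and the names `ComplexityEstimate`, `Metrics`, `score`, and `band` are hypothetical.

```java
/** Illustrative only: weights and thresholds are assumptions, not the script's actual formula. */
final class ComplexityEstimate {
    record Metrics(int loc, int externalCalls, int fileOperations, int sqlStatements, int branchPoints) {}

    /** Combines the five inputs into a single score using hypothetical weights. */
    static int score(Metrics m) {
        return m.loc() / 100            // one point per 100 lines of code
             + 5 * m.externalCalls()    // CALLs couple the program to others
             + 3 * m.fileOperations()   // OPEN/READ/WRITE/REWRITE statements
             + 4 * m.sqlStatements()    // embedded SQL needs a data-access layer
             + 2 * m.branchPoints();    // IF/EVALUATE/GO TO style control flow
    }

    /** Maps a score to a coarse band; the cut-offs are arbitrary placeholders. */
    static String band(int score) {
        if (score < 50) return "LOW";
        if (score < 150) return "MEDIUM";
        return "HIGH";
    }
}
```

Under these assumed weights, a 1,200-line program with 3 external calls, 4 file operations, 2 SQL statements, and 10 branch points scores 12 + 15 + 12 + 8 + 20 = 67, which falls in the `MEDIUM` band. Any real weighting should be calibrated against completed migrations.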

Step 5: Create Migration Strategy

Document program overview, dependencies, data structures, business logic patterns, proposed Java design, migration estimate, and action items.

Quick Reference

COBOL to Java Type Mapping

| COBOL Picture | Java Type | Notes |
|---|---|---|
| `PIC 9(n)` | `int`, `long`, `BigInteger` | Unsigned numeric |
| `PIC S9(n)` | `int`, `long`, `BigInteger` | Signed numeric |
| `PIC 9(n)V9(m)` | `BigDecimal` | Unsigned decimal |
| `PIC S9(n)V9(m)` | `BigDecimal` | Signed decimal |
| `PIC S9(n)V9(m) COMP-3` | `BigDecimal` | Packed decimal - critical precision! |
| `PIC S9(n) COMP` / `BINARY` | `int`, `long` | Binary storage |
| `COMP-1` (declared without a PIC clause) | `float` | Single precision (avoid for financial) |
| `COMP-2` (declared without a PIC clause) | `double` | Double precision (avoid for financial) |
| `PIC X(n)` | `String` | Alphanumeric/character |
| `PIC A(n)` | `String` | Alphabetic only |
| `PIC N(n)` | `String` | National/Unicode |
| `OCCURS n` | `List<T>` or `T[]` | Fixed arrays/tables |
| `OCCURS n DEPENDING ON` | `List<T>` | Variable-length arrays |
| 88 level | `enum` or constants | Condition names |
| `INDEX` | `int` | Table index (1-based in COBOL) |
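The table lists several candidate Java types for integer pictures without saying when to pick each. One reasonable rule, sketched below, chooses by digit count: up to 9 digits always fits in an `int` (maximum 2,147,483,647) and up to 18 digits always fits in a `long` (maximum 9,223,372,036,854,775,807). The class and method names are hypothetical, and the skill's generator may use different cut-offs.

```java
/** Hypothetical rule of thumb for picking a Java type from a numeric PICTURE's digit counts. */
final class NumericTypeChooser {
    /**
     * @param integerDigits  digits before the implied decimal point (the n in 9(n))
     * @param fractionDigits digits after the V (the m in V9(m)); 0 for integers
     */
    static String javaTypeFor(int integerDigits, int fractionDigits) {
        if (integerDigits < 0 || fractionDigits < 0 || integerDigits + fractionDigits == 0) {
            throw new IllegalArgumentException("picture must declare at least one digit");
        }
        if (fractionDigits > 0) {
            return "BigDecimal";   // any implied decimal point needs exact decimal arithmetic
        }
        if (integerDigits <= 9) {
            return "int";          // 999,999,999 < Integer.MAX_VALUE
        }
        if (integerDigits <= 18) {
            return "long";         // eighteen 9s < Long.MAX_VALUE
        }
        return "BigInteger";
    }
}
```

With this rule, `PIC 9(6)` maps to `int`, `PIC S9(10)` to `long`, and `PIC S9(7)V99` to `BigDecimal`.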

Common Pattern Conversions

  • File I/O: `READ...AT END` → `BufferedReader` with try-with-resources or NIO streams
  • File updates: `REWRITE` → Update operations in DB or file systems
  • Table lookup: `SEARCH` → Linear search with streams
  • Binary search: `SEARCH ALL` → `Collections.binarySearch()` on a sorted list, or `stream().filter().findFirst()` (a linear scan, acceptable for small tables)
  • String operations: `STRING/UNSTRING` → `StringBuilder` or `String.split()`
  • Inspection: `INSPECT` → `String.replace()`, `replaceAll()`, or regex
  • CALL statements: → Method calls or service invocations
  • EVALUATE: → `switch` statement or expression (enhanced `switch` in Java 14+)
  • Date arithmetic: `FUNCTION INTEGER-OF-DATE` → `LocalDate` operations
  • ACCEPT DATE/TIME: → `LocalDate.now()`, `LocalTime.now()`
  • Condition names (Level 88): → `enum` or typed constants
  • Computed GO TO: → Strategy pattern or switch statement
  • REDEFINES: → Union-style types (Java has no native unions), ByteBuffer views, or separate accessor classes
  • COPY statements: → Package imports or shared entity classes
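Three of the conversions above can be sketched together: level 88 condition names as an `enum`, `EVALUATE` as a switch expression, and `SEARCH ALL` as `Collections.binarySearch()`. The status field, its codes, and all class and method names are hypothetical; Java 14 or later is assumed for the switch expression.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical examples of three pattern conversions; names and codes are illustrative. */
final class PatternExamples {
    /** Stands in for 88-level condition names on an assumed one-character status field. */
    enum AccountStatus {
        ACTIVE('A'), CLOSED('C'), SUSPENDED('S');

        private final char code;

        AccountStatus(char code) { this.code = code; }

        static AccountStatus fromCode(char code) {
            for (AccountStatus s : values()) {
                if (s.code == code) return s;
            }
            throw new IllegalArgumentException("Unknown status code: " + code);
        }
    }

    /** EVALUATE ... WHEN ... rewritten as a switch expression; the enum makes it exhaustive. */
    static String describe(AccountStatus status) {
        return switch (status) {
            case ACTIVE -> "Account is open";
            case CLOSED -> "Account is closed";
            case SUSPENDED -> "Account is on hold";
        };
    }

    /**
     * SEARCH ALL equivalent. The list must already be sorted ascending, just as the
     * COBOL table must be ordered by its ASCENDING KEY.
     * Returns the 1-based position of the key, or 0 when it is absent.
     */
    static int searchAll(List<String> sortedKeys, String key) {
        int idx = Collections.binarySearch(sortedKeys, key);
        return idx >= 0 ? idx + 1 : 0;
    }
}
```

Returning a 1-based position keeps the result comparable with the COBOL index during parallel testing; callers that only need Java semantics can use the raw `binarySearch` result instead.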

Example: Copybook to Java POJO

COBOL Copybook:

```cobol
01  EMPLOYEE-RECORD.
    05  EMP-ID        PIC 9(6).
    05  EMP-NAME      PIC X(30).
    05  EMP-SALARY    PIC S9(7)V99 COMP-3.
```

Generated Java:

```java
import java.math.BigDecimal;

public class EmployeeRecord {
    private int empId;             // PIC 9(6): six digits fit in an int
    private String empName;        // PIC X(30)
    private BigDecimal empSalary;  // PIC S9(7)V99 COMP-3: scale 2
    // getters/setters
}
```
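The POJO only holds the salary once it has been decoded. When records are read as raw bytes, a COMP-3 field stores two decimal digits per byte, with the final half-byte carrying the sign (`C`, `A`, `E`, or `F` for positive or unsigned, `D` or `B` for negative). `PIC S9(7)V99 COMP-3` therefore occupies 5 bytes: 9 digits plus the sign nibble. The decoder below is a minimal sketch under those assumptions; `PackedDecimal.unpack` is a hypothetical helper, not something the generator is known to emit.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical decoder for COMP-3 (packed decimal) fields read as raw bytes. */
final class PackedDecimal {
    /**
     * @param packed the field's bytes, two digits per byte, sign in the last nibble
     * @param scale  number of digits after the implied decimal point (2 for V99)
     */
    static BigDecimal unpack(byte[] packed, int scale) {
        if (packed.length == 0) {
            throw new IllegalArgumentException("packed field is empty");
        }
        StringBuilder digits = new StringBuilder(packed.length * 2);
        for (int i = 0; i < packed.length; i++) {
            int high = (packed[i] >> 4) & 0x0F;
            int low = packed[i] & 0x0F;
            digits.append(digit(high));
            if (i < packed.length - 1) {
                digits.append(digit(low));   // the last low nibble is the sign, not a digit
            }
        }
        int sign = packed[packed.length - 1] & 0x0F;
        if (sign < 0x0A) {
            throw new IllegalArgumentException("Invalid sign nibble: " + sign);
        }
        boolean negative = sign == 0x0D || sign == 0x0B;
        BigInteger unscaled = new BigInteger(digits.toString());
        return new BigDecimal(negative ? unscaled.negate() : unscaled, scale);
    }

    private static char digit(int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("Invalid digit nibble: " + nibble);
        }
        return (char) ('0' + nibble);
    }
}
```

For instance, the bytes `01 23 45 67 8C` with scale 2 decode to `123456.78`, and `00 05 00 00 0D` decode to `-5000.00`.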

Migration Considerations

Critical Patterns:
  1. ALWAYS use `BigDecimal` for COMP-3 and numeric with decimals (never float/double)
  2. Preserve precision: Use `BigDecimal` with exact scale for financial calculations
  3. 1-based indexing: Document that COBOL arrays start at 1, Java at 0
  4. Implicit conversions: Make COBOL's automatic numeric↔string conversions explicit
  5. REDEFINES: Model as union type, ByteBuffer overlay, or separate view classes
  6. Computed GO TO: Refactor to strategy pattern or switch statement
  7. ALTER statement: Refactor to structured control flow (if/while/switch)
  8. PERFORM THRU: Map to single method containing full paragraph range
  9. BY REFERENCE vs BY CONTENT: Document parameter passing semantics
  10. Test rigorously: Validate with production data samples, especially for COMP-3
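Items 2 and 3 lend themselves to small helpers. The sketch below assumes standard COBOL behaviour for low-order digits: a result stored into a field with fewer decimal places is truncated unless the statement specifies `ROUNDED`, whose default mode rounds half away from zero. The names `MigrationPatterns`, `store`, and `occursGet` are hypothetical, and high-order overflow (`ON SIZE ERROR`) is not modelled.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;

/** Hypothetical helpers for exact-scale storage and 1-based table access. */
final class MigrationPatterns {
    /**
     * Mimics storing a computed value into a field with the given number of decimals.
     * Without ROUNDED, COBOL truncates excess decimals; with ROUNDED it rounds half
     * away from zero, which corresponds to RoundingMode.HALF_UP in Java.
     */
    static BigDecimal store(BigDecimal value, int fractionDigits, boolean rounded) {
        return value.setScale(fractionDigits, rounded ? RoundingMode.HALF_UP : RoundingMode.DOWN);
    }

    /** Reads an OCCURS table entry using the original COBOL subscript (1..n). */
    static <T> T occursGet(List<T> table, int cobolSubscript) {
        if (cobolSubscript < 1 || cobolSubscript > table.size()) {
            throw new IndexOutOfBoundsException("COBOL subscript out of range: " + cobolSubscript);
        }
        return table.get(cobolSubscript - 1);   // shift to Java's 0-based index in one place
    }
}
```

Here `store(new BigDecimal("10.129"), 2, false)` yields `10.12` and the rounded variant yields `10.13`. Keeping the index shift inside one accessor avoids scattering `- 1` adjustments through translated business logic.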
Output Requirements:
  • Program overview and type classification
  • Complete dependency graph (CALL tree, copybooks, files, DB tables)
  • Data structure mapping (copybooks → Java classes)
  • Business logic summary (key paragraphs → methods)
  • Proposed Java architecture (services, repositories, entities)
  • Migration effort estimate (complexity score, LOC, risk factors)
  • Prioritized action items

Advanced Topics

For detailed conversion rules and patterns, see:
  • pseudocode-cobol-rules.md - Comprehensive COBOL to pseudocode conversion rules including data types, statements, file operations, string operations, table operations, program control, translation patterns, and common gotchas
  • pseudocode-common-rules.md - Common pseudocode syntax and conventions applicable to all languages
  • transaction-handling.md - Transaction management and rollback strategies for CICS/IMS to Java
  • messaging-integration.md - Message queue and async patterns (MQ, CICS queues to JMS/Kafka)
  • performance-patterns.md - Batch processing optimization and memory management
  • testing-strategy.md - Comprehensive testing including unit, integration, parallel validation, and data-driven testing

Tools and Scripts

All scripts support cross-platform execution (Windows PowerShell, bash):
  • `analyze-dependencies.sh/ps1` - Generate dependency graph
  • `extract-structure.py` - Parse COBOL structure to JSON
  • `generate-java-classes.py` - Convert copybooks to Java POJOs
  • `estimate-complexity.py` - Calculate migration complexity score
Scripts use standard libraries only and output JSON for easy integration with CI/CD pipelines.