What AI Can and Cannot Do with COBOL Source Code

An honest assessment from someone who has been working with COBOL since 1990 and watching AI tools applied to it since they became capable enough to be worth evaluating.

AI & Mainframe · Carmi Sternberg · May 2026 · 10 min read

There is a lot of noise about AI and COBOL right now. Some of it is vendors overpromising what their tools can do. Some of it is sceptics dismissing AI entirely for legacy code. The truth is more useful than either position.

AI tools can genuinely help with COBOL – in specific, well-defined ways. They also produce confidently wrong answers in specific, well-defined ways. Knowing which is which is the practical skill that matters.

What follows is that assessment: what works, what fails, and how to use these tools safely.

Keep reading to the end. There is a risk hiding at the bottom of this article that almost nobody in the AI-for-COBOL space talks about – and it is the kind of risk that makes everything else discussed here look minor by comparison.

What AI does well with COBOL

Explaining unfamiliar code. If you open a COBOL program you have never seen before, AI assistants are genuinely useful for getting oriented. Paste a paragraph and ask what it does – you will usually get a clear, accurate explanation of the logic. For developers who can read COBOL but are not fluent, this accelerates comprehension significantly.

Identifying dead code. AI tools can help identify code that appears unreachable – conditions that can never be true given the data definitions, paragraphs that no PERFORM or GO TO ever references, statements that follow an unconditional STOP RUN. Important caveat: dead code identification from static analysis of a single program is unreliable. A paragraph with no explicit reference may still execute by fall-through or as part of a PERFORM ... THRU range, and a program that looks unused may be invoked from a different program through a CALL whose target is only known at run time.
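To make the caveat concrete, here is a minimal Python sketch of the kind of static check involved. It assumes source with columns 1-7 already stripped and paragraph headers alone on their line. The function name and the sample program are hypothetical. It ignores SECTIONs, fall-through between paragraphs, and the interior of PERFORM ... THRU ranges, so its output is a list of candidates to investigate, not a list of code that is safe to delete.

```python
import re
from typing import List

# A paragraph header starts in Area A: at most 3 leading spaces once
# columns 1-7 are stripped, then a name, a period, and nothing else.
PARA_DEF = re.compile(r"^ {0,3}([A-Z0-9][A-Z0-9-]*)\.[ \t]*$", re.MULTILINE)
# Any name following PERFORM, GO TO, THRU or THROUGH counts as a reference.
REFERENCE = re.compile(
    r"\b(?:PERFORM|GO\s+TO|THRU|THROUGH)\s+([A-Z0-9][A-Z0-9-]*)"
)


def unreferenced_paragraphs(source: str) -> List[str]:
    """Return paragraphs that no PERFORM, GO TO or THRU names explicitly."""
    # Only the text after the PROCEDURE DIVISION header is relevant.
    _, _, procedure = source.upper().partition("PROCEDURE DIVISION")
    defined = PARA_DEF.findall(procedure)
    referenced = set(REFERENCE.findall(procedure))
    # The first paragraph is entered by fall-through, so skip it.
    return [name for name in defined[1:] if name not in referenced]


# Hypothetical program used only to exercise the sketch.
SAMPLE = """\
IDENTIFICATION DIVISION.
PROGRAM-ID. DEMO.
PROCEDURE DIVISION.
MAIN-PARA.
    PERFORM READ-INPUT
    PERFORM CALC-TOTAL THRU CALC-EXIT
    STOP RUN.
READ-INPUT.
    DISPLAY 'READ'.
CALC-TOTAL.
    DISPLAY 'CALC'.
CALC-EXIT.
    EXIT.
OLD-REPORT.
    DISPLAY 'NEVER PERFORMED'.
"""

print(unreferenced_paragraphs(SAMPLE))
```

On the sample, only OLD-REPORT is flagged. Whether it is truly dead still depends on everything the source does not show.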

Generating documentation stubs. AI is reasonably good at generating initial documentation from COBOL source – describing what a program appears to do, listing its inputs and outputs, summarising its main processing steps. The generated documentation is usually accurate at the code level. It will miss the business context – why the program does what it does, what business rule it implements, what regulatory requirement it satisfies.

Explaining COBOL syntax. For developers learning COBOL, AI assistants are excellent at explaining syntax, verb usage, and language constructs. They have absorbed enough COBOL documentation to give accurate, helpful explanations of most language features.

Refactoring suggestions. AI can suggest refactoring opportunities – consolidating duplicated logic, simplifying nested EVALUATE structures. These suggestions are often technically correct. Whether to apply them in production requires human judgment about risk tolerance and testing coverage.

What AI does poorly with COBOL

Understanding data without copybooks. COBOL programs define their data structures in the DATA DIVISION, but most production COBOL uses COPY statements to include external copybooks – shared data structure definitions that live in separate libraries. A program that says COPY CUSTACCT does not contain the definition of the customer account record – it references it. Any AI analysis of COBOL that does not resolve copybooks is working with incomplete information.

Resolving inter-program dependencies. A COBOL CALL statement invokes another program. In a large mainframe application with hundreds of programs, this dependency graph is complex. A tool that analyses one program at a time cannot tell you what the called program does or whether the data being passed to it is correct.
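A rough Python sketch of the first step in building that dependency graph, under simplifying assumptions: it scans one program's text with regular expressions, does not skip comment lines, and the program names in the sample are invented. It separates CALLs whose target is a literal in the source from CALLs through a data item, where the target only exists at run time.

```python
import re
from typing import Dict, List

# CALL 'NAME' or CALL "NAME": the target is visible in the source.
LITERAL_CALL = re.compile(r"\bCALL\s+['\"]([A-Z0-9#@$-]+)['\"]", re.IGNORECASE)
# CALL data-name: the target is whatever the field holds at run time.
IDENTIFIER_CALL = re.compile(r"\bCALL\s+([A-Z][A-Z0-9-]*)", re.IGNORECASE)


def call_targets(source: str) -> Dict[str, List[str]]:
    """Split CALL statements into literal targets and data-name targets."""
    return {
        "literal": sorted({m.upper() for m in LITERAL_CALL.findall(source)}),
        "by_identifier": sorted(
            {m.upper() for m in IDENTIFIER_CALL.findall(source)}
        ),
    }


# Hypothetical fragment used only to exercise the sketch.
SAMPLE = """\
    CALL 'CUSTVAL' USING WS-CUSTOMER-RECORD
    MOVE 'RATECALC' TO WS-PGM-NAME
    CALL WS-PGM-NAME USING WS-RATE-AREA
    CALL "AUDITLOG" USING WS-AUDIT-REC
"""

print(call_targets(SAMPLE))
```

Every entry under by_identifier is an edge the graph cannot draw from this program alone. Here the value moved into WS-PGM-NAME happens to be visible, but in production code it often arrives from a table, a file, or a parameter.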

Inferring business rules from code alone. The most dangerous failure mode. An AI tool reads a specific calculation in a COBOL program and infers what business rule it implements. The inference may be technically correct – the calculation is accurately described. But the business context – why this specific formula, what regulatory requirement mandated it, what exception applies in specific circumstances – is not in the code.

"When an AI tool tells you what a piece of COBOL does, it is telling you what the code says. It is not telling you what the code means in the business context."

JCL context is invisible to AI. A COBOL program runs inside a JCL job. The JCL defines which datasets are actually allocated to each DD name, which PROC is invoked, what PARM values are passed at execution time. A program that reads from SYSUT1 could be reading any dataset – the COBOL source does not tell you which one. AI analysing the source cannot see the execution environment the program actually runs in. This matters enormously for impact analysis. Knowing what the code does is not the same as knowing what it does to which data.
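As an illustration of what joining the two layers looks like, the Python sketch below pairs each SELECT ... ASSIGN in a program with the DD statement of the same name in a job step. It is a simplification: it assumes one job step, DSN on the same line as the DD name, no continuation lines, no PROC overrides, and no symbolic parameters. It treats the last hyphen-separated part of the assignment name as the DD name. All file and dataset names are invented.

```python
import re
from typing import Dict, Optional

SELECT_ASSIGN = re.compile(
    r"\bSELECT\s+([A-Z0-9-]+)\s+ASSIGN\s+(?:TO\s+)?([A-Z0-9#@$-]+)",
    re.IGNORECASE,
)
DD_DSN = re.compile(
    r"^//([A-Z0-9#@$]{1,8})\s+DD\s+.*?\bDSN(?:AME)?=([^,\s]+)",
    re.IGNORECASE | re.MULTILINE,
)


def datasets_for_files(cobol: str, jcl: str) -> Dict[str, Optional[str]]:
    """Map each COBOL file name to the dataset its DD name points at."""
    dd_to_dsn = {dd.upper(): dsn for dd, dsn in DD_DSN.findall(jcl)}
    result: Dict[str, Optional[str]] = {}
    for file_name, assignment in SELECT_ASSIGN.findall(cobol):
        # Assignment names may carry a prefix such as UT-S-; the DD name
        # is the final component.
        ddname = assignment.upper().rsplit("-", 1)[-1]
        result[file_name.upper()] = dd_to_dsn.get(ddname)
    return result


# Hypothetical program fragment and job step.
COBOL = """\
    SELECT CUSTOMER-FILE ASSIGN TO CUSTIN.
    SELECT REPORT-FILE   ASSIGN TO UT-S-RPTOUT.
    SELECT AUDIT-FILE    ASSIGN TO AUDIT.
"""
JCL = """\
//STEP010  EXEC PGM=CUSTRPT
//CUSTIN   DD DSN=PROD.CUSTOMER.MASTER,DISP=SHR
//RPTOUT   DD DSN=PROD.CUSTRPT.OUTPUT,DISP=(NEW,CATLG)
"""

for name, dsn in datasets_for_files(COBOL, JCL).items():
    print(name, "->", dsn)
```

AUDIT-FILE maps to None because this step has no matching DD statement. The same program run under a different job would produce a different table, which is exactly the information the COBOL source cannot supply.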

CICS transaction context. A CICS COBOL program behaves completely differently depending on the transaction that invokes it, the COMMAREA passed to it, the terminal type, the CICS region configuration, and the resource definitions in the CSD. AI reading the source sees the code. It cannot see the execution context. A program that appears to display a customer record may display different records, behave differently under different transaction IDs, or fail entirely if the COMMAREA length does not match what the calling transaction sends. None of this is visible in the source.

DB2 bind parameters and plan currency. A COBOL program with embedded SQL is bound to a DB2 plan or package. The bind parameters determine isolation level, optimisation strategy, and which version of the static SQL is actually executing. If the plan is out of date relative to the source – which happens in large shops where rebinds are controlled processes – the SQL the AI reads in the source may not be the SQL that runs. AI cannot see bind parameters, plan timestamps, or whether the currently bound plan matches the source it is analysing.

VSAM file definitions. VSAM file attributes – key position, key length, record format, CI size, alternate indexes – are defined in the catalog, not in the COBOL source. A program that opens a VSAM KSDS and reads by key will fail if the key definition in the program does not match the actual file definition. AI reading the source cannot verify this match. In large applications where VSAM files are shared across dozens of programs, this is a real source of errors that static source analysis cannot detect.

Handling system-specific behaviour. COBOL programs interact with z/OS services in ways that are specific to the mainframe environment – EXEC CICS commands, DB2 EXEC SQL statements, z/OS system calls, SMF record writing. AI tools have varying levels of understanding of these interfaces. Some handle EXEC SQL well. Most have limited understanding of EXEC CICS beyond basic transaction management.

A COBOL program is not the whole system. It is one layer of it. The source code layer is the most readable layer – which is why AI tools focus on it. But the JCL layer, the CICS layer, the DB2 layer, and the file definition layer are where production behaviour actually lives. AI tools that analyse only the source are seeing one layer of a multi-layer system and reporting on the whole.

AI CAN

Explain self-contained COBOL paragraphs
Flag potentially unreachable code paths
Generate first-draft documentation stubs
Explain COBOL syntax and language constructs
Suggest refactoring opportunities

AI CANNOT

Analyse programs reliably without their copybooks
Resolve full inter-program call chains reliably
Infer the business reason behind a threshold or formula
Understand EXEC CICS and z/OS services deeply
Tell you what the code means, as opposed to what it says

The copybook problem in detail

Consider a COBOL program with this data definition:

01 WS-CUSTOMER-RECORD. COPY CUSTACCT.

The program uses fields from CUSTACCT throughout its logic. An AI tool that only sees the program source sees references to fields it has no definition for. It will make inferences about what those fields probably contain based on their names and how they are used. Those inferences will often be wrong. A field named WS-CUST-BAL might be a balance in dollars, in cents, in basis points, or in some internal coded representation. The copybook definition, with its PICTURE and USAGE clauses, at least tells you the field's size, scale, and storage format. The name alone does not.
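A small Python illustration of why the definition matters. The three PICTURE clauses below are hypothetical candidates for WS-CUST-BAL, not taken from any real CUSTACCT copybook, and the digit string is invented. The same nine digits in the record become three different numbers depending on where the copybook places the implied decimal point.

```python
from decimal import Decimal


def interpret(digits: str, scale: int) -> Decimal:
    """Read unsigned display digits with `scale` digits after the implied V."""
    return Decimal(int(digits)).scaleb(-scale)


RAW = "001234567"  # nine digits as they might sit in the record

# Hypothetical PICTURE clauses a copybook could declare for the field,
# mapped to the number of digits after the implied decimal point.
CANDIDATES = {
    "PIC 9(9)": 0,
    "PIC 9(7)V99": 2,
    "PIC 9(5)V9(4)": 4,
}

for picture, scale in CANDIDATES.items():
    print(f"{picture:<14} -> {interpret(RAW, scale)}")
```

Signed fields and packed or binary storage add further variations that this sketch leaves out. A tool that has to guess the scale from the field name is guessing among values that differ by factors of a hundred or more.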

Before applying AI analysis to any COBOL program, ensure the tool has access to all copybooks the program uses. Without them, the analysis is built on guesswork dressed up as insight.
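A pre-flight check along these lines can be sketched in a few lines of Python. The assumptions: the set of available copybook names has already been collected from your libraries, the function name is hypothetical, and only plain COPY statements in the program text are considered. Copybooks nested inside other copybooks and members pulled in through EXEC SQL INCLUDE are not followed.

```python
import re
from typing import Iterable, List

COPY_STMT = re.compile(r"\bCOPY\s+([A-Z0-9#@$-]+)", re.IGNORECASE)


def missing_copybooks(source: str, available: Iterable[str]) -> List[str]:
    """Return copybooks the program names that are not in `available`."""
    wanted = {name.upper() for name in COPY_STMT.findall(source)}
    have = {name.upper() for name in available}
    return sorted(wanted - have)


# Hypothetical fragment; ERRCODES is an invented second copybook.
SAMPLE = """\
01  WS-CUSTOMER-RECORD.
    COPY CUSTACCT.
01  WS-ERROR-TABLE.
    COPY ERRCODES.
"""

print(missing_copybooks(SAMPLE, {"CUSTACCT"}))
```

An empty result means only that every name was found somewhere. It says nothing about whether the copy that was found is the version used when the production module was compiled.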

How to use AI on COBOL safely

Verify AI explanations against runtime behaviour. When an AI tool explains what a COBOL program does, treat it as a hypothesis, not a fact. Verify against runtime evidence – what the program actually produces when it runs, what the SMF records show, what the output datasets contain. Discrepancies between what AI says a program does and what it actually does are usually where the important undocumented business logic lives.

Always provide copybooks. When submitting COBOL for AI analysis, provide the full copybook library, not just the program source. Any tool that does not ask for copybooks is either handling them internally (check how) or ignoring them (treat its output with caution).

Use AI for first pass, human for verification. AI is good for getting oriented in unfamiliar code. It is not reliable enough for making production decisions on its own. Use AI to accelerate the human analysis, not to replace it.

Treat business rule inferences as hypotheses. When AI infers a business rule from COBOL code, write it down as a hypothesis and validate it with someone who knows the business context. This is most urgent for programs that have not been touched in years and whose original authors are no longer available.

The most dangerous assumption in COBOL AI analysis

AI reads the source code you give it. It has no way of knowing whether that source code is what was actually compiled into the production module running on your system right now.

This is not a theoretical risk. It happens constantly:

A developer made a hotfix directly to the load library and never updated the source
The source was modified but never recompiled
The wrong version of a copybook was used at compile time
The production module came from a different source library than the one you’re looking at
CSECT timestamps don’t match the source modification date
The compiler version or compile options (OPTIMIZE, RENT, ARCH, CODEPAGE) were different from what you assume – the same source compiled with different options can produce meaningfully different runtime behaviour

AI will analyse whatever you give it – confidently, thoroughly, and completely wrong if the source doesn’t match the object.

Before trusting any AI analysis of COBOL source:

Compare the compile date in the load module (AMBLIST output) against the source modification date
Verify the load module was compiled from THIS source, not a different version
Check copybook versions – the copybooks in your source library may not match what was used at compile time
Identify the compiler version and compile options that produced the module – OPTIMIZE level, RENT/NORENT, ARCH target, and CODEPAGE can all affect runtime behaviour in ways the source alone will never reveal
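The checks above can be written down as a simple comparison once the facts have been collected by hand. The Python sketch below assumes you have already extracted the compile date from the load module and the options and copybook versions from your build records. The record shape, field names, option values, and version labels are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class ModuleFacts:
    """What load module metadata and build records say (hypothetical shape)."""
    compiled_at: datetime
    compiler_options: Dict[str, str]
    copybook_versions: Dict[str, str]


def find_mismatches(source_modified: datetime,
                    assumed_options: Dict[str, str],
                    library_copybooks: Dict[str, str],
                    module: ModuleFacts) -> List[str]:
    """List reasons the source in hand may not match the running module."""
    problems = []
    if source_modified > module.compiled_at:
        problems.append("source was modified after the module was compiled")
    for name, assumed in assumed_options.items():
        actual = module.compiler_options.get(name)
        if actual != assumed:
            problems.append(
                f"option {name}: assumed {assumed}, module has {actual}")
    for name, current in library_copybooks.items():
        used = module.copybook_versions.get(name)
        if used != current:
            problems.append(
                f"copybook {name}: library has {current}, compile used {used}")
    return problems


# Invented values, for illustration only.
module = ModuleFacts(
    compiled_at=datetime(2019, 3, 14),
    compiler_options={"OPTIMIZE": "0", "ARCH": "8"},
    copybook_versions={"CUSTACCT": "v12"},
)
for problem in find_mismatches(
        source_modified=datetime(2021, 6, 1),
        assumed_options={"OPTIMIZE": "2", "ARCH": "8"},
        library_copybooks={"CUSTACCT": "v14"},
        module=module):
    print(problem)
```

An empty list is not proof of a match. A source file older than the module may still be a different version from the one that was compiled, which is why the second item on the list needs evidence from your change management records.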

AI cannot do this for you. No tool can do this automatically without access to your build history and load library metadata.

If the source doesn’t match the object, the analysis is fiction.


Carmi Sternberg

FOUNDER, INFOMANTA LTD · MAINFRAME SINCE 1990
