MCP server that parses COBOL programs — extracts divisions, SQL, CICS commands, and generates AI-powered business logic summaries via any LLM
cobol-parser-mcp
MCP server that parses COBOL mainframe programs and extracts structured data plus AI-powered business logic summaries — works with any LLM provider.
pip install cobol-parser-mcp               # pure parsing, no LLM needed
pip install "cobol-parser-mcp[anthropic]"  # + Claude
pip install "cobol-parser-mcp[openai]"     # + GPT-4o
pip install "cobol-parser-mcp[groq]"       # + Groq (fast + cheap)
pip install "cobol-parser-mcp[all]"        # + all providers (quotes keep zsh from globbing the brackets)
Part of the mainframe modernization pipeline — takes the manifest.json from mainframe-ingest-mcp and produces per-program JSON files ready for code generation.
What it extracts
For every COBOL program:
| What | Details |
|---|---|
| IDENTIFICATION DIVISION | Program ID, author, date written |
| DATA DIVISION | All working storage variables with level numbers and PIC clauses |
| PROCEDURE DIVISION | Every paragraph with line ranges and PERFORM references |
| Embedded SQL | Every SELECT/INSERT/UPDATE/DELETE with tables and columns |
| Embedded CICS | Every SEND/RECEIVE MAP, LINK, XCTL, READ, WRITE |
| AI summary | Plain-English description of what the program does (any LLM) |
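To make the "Embedded SQL" and "Embedded CICS" rows concrete, here is a minimal sketch of how `EXEC SQL ... END-EXEC` and `EXEC CICS ... END-EXEC` blocks can be located in COBOL source with a regular expression. It is an illustration only, not the parser this package ships: the names `EXEC_BLOCK` and `find_exec_blocks` are hypothetical, and the sketch ignores comment lines, sequence-number columns, and continuation rules that a real parser has to handle.

```python
import re

# Matches one EXEC SQL / EXEC CICS block up to its END-EXEC terminator.
# Hypothetical helper for illustration; not part of cobol-parser-mcp's API.
EXEC_BLOCK = re.compile(
    r"EXEC\s+(SQL|CICS)\s+(.*?)\s*END-EXEC",
    re.IGNORECASE | re.DOTALL,
)


def find_exec_blocks(source: str) -> list[dict]:
    """Return each embedded block with its kind, first verb, text, and 1-based start line."""
    blocks = []
    for match in EXEC_BLOCK.finditer(source):
        # Collapse line breaks and runs of spaces into single spaces.
        body = " ".join(match.group(2).split())
        blocks.append({
            "kind": match.group(1).upper(),
            "verb": body.split(" ", 1)[0].upper() if body else "",
            "text": body,
            # Count the newlines before the match to get its line number.
            "line": source.count("\n", 0, match.start()) + 1,
        })
    return blocks


SAMPLE = """\
       PROCEDURE DIVISION.
       2000-PROCESS.
           EXEC CICS RECEIVE MAP('SS6TMAP')
                MAPSET('SS6TMAP')
           END-EXEC.
           EXEC SQL
                SELECT ENTITY_NAME INTO :WS-ENTITY-NAME
                FROM BUSENTIT
                WHERE ENTITY_ID = :WS-ENTITY-ID
           END-EXEC.
"""

for block in find_exec_blocks(SAMPLE):
    print(block["line"], block["kind"], block["verb"])
```

Extracting tables, columns, and map names from each block would need a second pass over `text`, which this sketch leaves out.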
Claude Desktop config
{
"mcpServers": {
"cobol-parser-mcp": {
"command": "cobol-parser-mcp",
"env": {
"ANTHROPIC_API_KEY": "your-key-here"
}
}
}
}
Usage from Python
Parse a single file (no API key needed)
from cobol_parser_mcp.tools.parser import parse_cobol_file
import json
result = parse_cobol_file("/path/to/SS6001XX.cob")
print(json.dumps(result, indent=2))
Parse entire codebase with AI summaries
import asyncio
from cobol_parser_mcp.tools.batch import parse_all_programs
# Using Anthropic (Claude)
result = asyncio.run(parse_all_programs(
manifest_path="./manifest.json", # from mainframe-ingest-mcp
output_dir="./parsed",
ai_summaries=True,
provider="anthropic", # or openai, groq, ollama, custom
api_key="sk-ant-...", # or set ANTHROPIC_API_KEY env var
max_ai_programs=20,
))
print(f"Parsed {result['parsed_ok']} of {result['total_programs']} programs")
Using other LLM providers
# OpenAI
result = asyncio.run(parse_all_programs(
manifest_path="./manifest.json",
output_dir="./parsed",
provider="openai",
api_key="sk-...",
model="gpt-4o",
))
# Groq — fast and cheap
result = asyncio.run(parse_all_programs(
manifest_path="./manifest.json",
output_dir="./parsed",
provider="groq",
api_key="gsk_...",
))
# Ollama — fully local, no API key, no internet
result = asyncio.run(parse_all_programs(
manifest_path="./manifest.json",
output_dir="./parsed",
provider="ollama",
model="llama3",
))
# No AI at all — pure parsing only
result = asyncio.run(parse_all_programs(
manifest_path="./manifest.json",
output_dir="./parsed",
ai_summaries=False,
))
Output format
Each program produces a <PROGRAM_NAME>.json file:
{
"program": "SS6001XX",
"line_count": 1048,
"identification": {
"program_id": "SS6001XX",
"author": "STATE OF MARYLAND SDAT",
"date_written": "1995-06-12"
},
"data_division": {
"working_storage": [
{ "level": "1", "name": "WS-ENTITY-RECORD", "type": "GROUP" },
{ "level": "5", "name": "WS-ENTITY-ID", "type": "PIC X(10)" }
],
"copybooks_expanded": ["BUSENTIT", "SSCC03EQ", "DFHAID"]
},
"procedure_division": {
"paragraph_count": 7,
"paragraphs": [
{
"name": "0000-MAIN",
"lines": "28-45",
"calls_paragraphs": ["1000-INIT", "2000-PROCESS", "9999-EXIT"]
}
]
},
"sql_statements": [
{
"type": "SELECT",
"tables": ["BUSENTIT"],
"columns": ["ENTITY_ID", "ENTITY_NAME", "STATUS_CD"],
"where": "ENTITY_ID = :WS-ENTITY-ID",
"line": 167
}
],
"cics_commands": [
{ "command": "RECEIVE MAP", "map": "SS6TMAP", "mapset": "SS6TMAP", "line": 152 },
{ "command": "LINK PROGRAM", "program": "SS6009XX", "line": 334 }
],
"business_logic_summary": {
"db2_tables_read": ["BUSENTIT"],
"db2_tables_written": ["BUSENTIT", "TRNSACTN"],
"screens_used": ["SS6TMAP", "SS6XMAP"],
"programs_called": ["SS6009XX"],
"estimated_complexity": "HIGH",
"ai_summary": {
"purpose": "Handles online business entity inquiry and status update for CICS terminal users",
"business_domain": "Business Entity Registration",
"user_facing": true,
"key_operations": [
"Receive user input from SS6TMAP screen",
"Query BUSENTIT table by entity ID",
"Update entity status in BUSENTIT",
"Log transaction to TRNSACTN",
"Link to SS6009XX for downstream processing"
],
"modernization_notes": "Maps cleanly to GET /api/entity/{id} and PUT /api/entity/{id}/status REST endpoints"
}
}
}
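One way to use the `paragraphs` array is to build a PERFORM call graph and see which paragraphs are reachable from the entry point. The sketch below assumes only the field names shown in the sample output (`name`, `lines`, `calls_paragraphs`). The helper names and every paragraph other than `0000-MAIN` are invented for illustration.

```python
import json

# Trimmed-down program record using the field names from the sample output.
# Paragraphs other than 0000-MAIN are made up for this example.
program = json.loads("""
{
  "program": "SS6001XX",
  "procedure_division": {
    "paragraphs": [
      {"name": "0000-MAIN", "lines": "28-45",
       "calls_paragraphs": ["1000-INIT", "2000-PROCESS", "9999-EXIT"]},
      {"name": "1000-INIT", "lines": "46-60", "calls_paragraphs": []},
      {"name": "2000-PROCESS", "lines": "61-120",
       "calls_paragraphs": ["2100-READ-ENTITY"]},
      {"name": "2100-READ-ENTITY", "lines": "121-150", "calls_paragraphs": []},
      {"name": "8000-UNUSED", "lines": "151-160", "calls_paragraphs": []},
      {"name": "9999-EXIT", "lines": "161-165", "calls_paragraphs": []}
    ]
  }
}
""")


def build_call_graph(record: dict) -> dict[str, list[str]]:
    """Map each paragraph name to the paragraphs it PERFORMs."""
    paragraphs = record["procedure_division"]["paragraphs"]
    return {p["name"]: list(p.get("calls_paragraphs", [])) for p in paragraphs}


def reachable_from(graph: dict[str, list[str]], start: str) -> set[str]:
    """Depth-first walk over PERFORM edges, starting at `start`."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(graph.get(name, []))
    return seen


graph = build_call_graph(program)
reached = reachable_from(graph, "0000-MAIN")
unreached = sorted(set(graph) - reached)
print("Not reached via PERFORM:", unreached)
```

A paragraph that is not reached this way is a candidate for review, not proof of dead code: COBOL control flow also falls through from one paragraph to the next and can jump with GO TO or PERFORM ... THRU.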
An _index.json summary file is also written, with statistics across all programs.
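Because each program gets its own JSON file, cross-program questions can be answered by scanning the output directory. The sketch below builds a table-to-programs map from the `db2_tables_read` and `db2_tables_written` fields shown in the sample output. The `table_usage` helper is hypothetical, and the second program's record is invented for the demo. The layout of `_index.json` is not documented here, so the sketch skips that file instead of guessing at it.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def table_usage(output_dir: str) -> dict[str, dict[str, list[str]]]:
    """Map each DB2 table to the programs that read it and write it."""
    usage = defaultdict(lambda: {"read_by": [], "written_by": []})
    for path in sorted(Path(output_dir).glob("*.json")):
        if path.name.startswith("_"):
            continue  # skip _index.json; its schema is not shown in this README
        record = json.loads(path.read_text(encoding="utf-8"))
        summary = record.get("business_logic_summary", {})
        for table in summary.get("db2_tables_read", []):
            usage[table]["read_by"].append(record["program"])
        for table in summary.get("db2_tables_written", []):
            usage[table]["written_by"].append(record["program"])
    return dict(usage)


# Demo against a temporary directory holding two made-up program records.
with tempfile.TemporaryDirectory() as tmp:
    records = [
        {"program": "SS6001XX", "business_logic_summary": {
            "db2_tables_read": ["BUSENTIT"],
            "db2_tables_written": ["BUSENTIT", "TRNSACTN"]}},
        {"program": "SS6009XX", "business_logic_summary": {
            "db2_tables_read": ["TRNSACTN"],
            "db2_tables_written": []}},
    ]
    for rec in records:
        Path(tmp, f"{rec['program']}.json").write_text(json.dumps(rec), encoding="utf-8")
    Path(tmp, "_index.json").write_text("{}", encoding="utf-8")

    usage = table_usage(tmp)

for table, info in sorted(usage.items()):
    print(table, "read by", info["read_by"], "written by", info["written_by"])
```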
Supported LLM providers
| Provider | Default model | API key env var | Notes |
|---|---|---|---|
| anthropic | claude-sonnet-4-20250514 | ANTHROPIC_API_KEY | Default |
| openai | gpt-4o | OPENAI_API_KEY | |
| groq | llama3-70b-8192 | GROQ_API_KEY | Fast and cheap |
| ollama | llama3 | none | Fully local, free |
| custom | gpt-4o | OPENAI_API_KEY | Any OpenAI-compatible endpoint, pass base_url |
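The provider table can be mirrored in calling code to fail early when a key is missing. The sketch below is a hypothetical wrapper, not part of the package: `PROVIDERS` and `resolve_provider` are invented names, and the defaults are copied from the table above. It returns `provider`, `model`, and `api_key`, which are the keyword arguments the `parse_all_programs` examples pass.

```python
import os

# Defaults transcribed from the provider table; env_var is None where no key is needed.
PROVIDERS = {
    "anthropic": {"model": "claude-sonnet-4-20250514", "env_var": "ANTHROPIC_API_KEY"},
    "openai": {"model": "gpt-4o", "env_var": "OPENAI_API_KEY"},
    "groq": {"model": "llama3-70b-8192", "env_var": "GROQ_API_KEY"},
    "ollama": {"model": "llama3", "env_var": None},
    "custom": {"model": "gpt-4o", "env_var": "OPENAI_API_KEY"},
}


def resolve_provider(provider="anthropic", api_key=None, model=None, environ=None):
    """Hypothetical helper: pick the model and API key for a provider, or raise early."""
    environ = os.environ if environ is None else environ
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider {provider!r}; expected one of {sorted(PROVIDERS)}")
    defaults = PROVIDERS[provider]
    env_var = defaults["env_var"]
    # An explicit api_key wins; otherwise fall back to the provider's env var.
    key = api_key or (environ.get(env_var) if env_var else None)
    if env_var and not key:
        raise RuntimeError(f"{provider} needs an API key: pass api_key or set {env_var}")
    return {"provider": provider, "model": model or defaults["model"], "api_key": key}


# Ollama runs locally, so no key is required.
print(resolve_provider("ollama", environ={}))
```

The returned dict could then be unpacked into a call such as `parse_all_programs(manifest_path=..., output_dir=..., **settings)`, assuming the function accepts those keywords as the examples suggest.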
Part of the modernization pipeline
mainframe-ingest-mcp → manifest.json
↓
cobol-parser-mcp → per-program JSON files ← you are here
↓
bms-to-angular-mcp → Angular components
db2-schema-mcp → PostgreSQL schema
License
MIT