Call COBOL programs as Python functions — file I/O handled for you
pobol
Call COBOL programs as Python functions. Like maturin for Rust, but for GnuCOBOL.
Motivation: You're migrating COBOL off a mainframe. The programs do real work you want to keep, but they expect VSAM files, DD names, and JCL — none of which exist on Linux/macOS. pobol wraps cobc (GnuCOBOL) so you can:
- Point it at a .cbl file: it parses the source and discovers everything
- Pass Python dicts as input records
- Get Python dicts back from the output files
No copybook transcription, no temp-file juggling, no batch scripts.
Quick Start
uv add pobol # or: pip install pobol
Zero-config mode (auto-discovery)
from pobol import load
# pobol parses the COBOL source, discovers SELECT/FD clauses,
# extracts record layouts, strips mainframe artifacts, compiles, and
# handles all file I/O automatically.
report = load("customer_report.cob")
print(report.file_info)
# COBOL Program: customer_report.cob
#
# INPUTS:
# INPUT-FILE (45 bytes): IN-CUST-ID, IN-CUST-NAME, IN-BALANCE
# OUTPUTS:
# OUTPUT-FILE (54 bytes): OUT-CUST-ID, OUT-CUST-NAME, OUT-BALANCE, OUT-DISCOUNT
result = report(input_file=[
    {"IN-CUST-ID": 1, "IN-CUST-NAME": "Alice", "IN-BALANCE": 2500.00},
    {"IN-CUST-ID": 2, "IN-CUST-NAME": "Bob", "IN-BALANCE": 800.00},
])
for rec in result.output_file:
    print(rec)
# {'out_cust_id': 1, 'out_cust_name': 'Alice', 'out_balance': 2500.0, 'out_discount': 250.0}
# {'out_cust_id': 2, 'out_cust_name': 'Bob', 'out_balance': 800.0, 'out_discount': 0.0}
Mainframe source — works directly
# Point at raw mainframe COBOL with sequence numbers, LABEL RECORDS,
# IBM-370, DD-name assigns — pobol handles it all:
prog = load("check_disbursement.cbl")
print(prog.file_info)
# Discovers all 6 files, 4 record layouts under one FD, 60+ fields...
# Strips sequence numbers, rewrites ASSIGN TO for env-var mapping,
# removes LABEL RECORDS/RECORDING MODE/BLOCK CONTAINS.
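To make the sequence-number stripping concrete, here is a minimal sketch of the column handling for fixed-format source. It is not pobol's implementation: the function name `strip_fixed_format_columns` is hypothetical, and the sketch only covers columns 1-6 and 73-80, not the LABEL RECORDS, RECORDING MODE, or BLOCK CONTAINS clauses.

```python
def strip_fixed_format_columns(source: str) -> str:
    """Blank columns 1-6 and drop columns 73-80 of fixed-format COBOL.

    Hypothetical helper for illustration; pobol's own logic may differ.
    """
    cleaned = []
    for line in source.splitlines():
        body = line[6:72]  # keep columns 7-72 (indicator column plus areas A and B)
        # Re-pad the sequence area with spaces so column positions are unchanged.
        cleaned.append((" " * 6 + body).rstrip())
    return "\n".join(cleaned)


raw_line = "000100 IDENTIFICATION DIVISION.".ljust(72) + "CHKDISB1"
stripped_line = strip_fixed_format_columns(raw_line)
```

Here `stripped_line` keeps the statement in its original columns, with the sequence number and the trailing identification text removed.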
Explicit copybooks (optional override)
from pobol import load, parse_copybook
input_cb = parse_copybook("""
01 INPUT-RECORD.
05 CUST-ID PIC 9(6).
05 CUST-NAME PIC X(30).
05 BALANCE PIC 9(7)V9(2).
""")
output_cb = parse_copybook("""
01 OUTPUT-RECORD.
05 CUST-ID PIC 9(6).
05 CUST-NAME PIC X(30).
05 BALANCE PIC 9(7)V9(2).
05 DISCOUNT PIC 9(7)V9(2).
""")
report = load(
    "customer_report.cob",
    inputs={"INPUT-FILE": input_cb},
    outputs={"OUTPUT-FILE": output_cb},
)
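As a sanity check on a hand-written copybook, the record length can be worked out from the PIC clauses alone. The sketch below is independent of pobol (the helper `pic_width` is hypothetical) and assumes DISPLAY usage, where every X or 9 position takes one byte and S and V take none.

```python
import re

# Matches one picture symbol with an optional repeat count, e.g. "X", "9(7)".
_PIC_TOKEN = re.compile(r"([X9])(?:\((\d+)\))?", re.IGNORECASE)


def pic_width(pic: str) -> int:
    """Bytes used by a DISPLAY-usage PIC string such as '9(7)V9(2)'."""
    width = 0
    for _symbol, count in _PIC_TOKEN.findall(pic):
        width += int(count) if count else 1
    return width


input_layout = {"CUST-ID": "9(6)", "CUST-NAME": "X(30)", "BALANCE": "9(7)V9(2)"}
output_layout = {**input_layout, "DISCOUNT": "9(7)V9(2)"}

input_length = sum(pic_width(p) for p in input_layout.values())    # 45
output_length = sum(pic_width(p) for p in output_layout.values())  # 54
```

The totals agree with the 45-byte and 54-byte record sizes reported by file_info in the Quick Start.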
How It Works
                                                    ┌──────────────────┐
Python dict ──▶ Copybook.encode() ──▶ temp file ──▶ │                  │
                                                    │  cobc-compiled   │
Python dict ◀── Copybook.decode() ◀── temp file ◀── │  COBOL program   │
                                                    │                  │
                                                    └──────────────────┘
- Parse source: discovers SELECT/FD/OPEN clauses, extracts PIC field layouts, detects input vs output files
- Strip mainframe: removes sequence numbers (cols 1-6), identification (cols 73-80), LABEL RECORDS, RECORDING MODE, BLOCK CONTAINS, fixes SOURCE-COMPUTER
- Rewrite assigns: converts ASSIGN TO DATAIN (DD names) to working-storage paths loaded from DD_* environment variables
- Compile: cobc -x compiles to a native executable (cached by content hash)
- Write inputs: your Python dicts are encoded as fixed-width records to temp files
- Run: the executable runs with DD_* env vars pointing to temp files
- Read outputs: output temp files are decoded back into Python dicts
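The caching and file-staging steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not pobol's code: the helper names, the 16-character digest prefix, the .dat suffix, and the use of SHA-256 are all hypothetical choices.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def cached_executable_path(source_text: str, cache_dir: Path) -> Path:
    """Derive a cache location from the source contents (hypothetical naming)."""
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    return cache_dir / digest[:16]


def stage_files(files: dict[str, bytes], workdir: Path) -> dict[str, str]:
    """Write each payload to a temp file and return the DD_* variables for it."""
    env = {}
    for dd_name, payload in files.items():
        path = workdir / f"{dd_name}.dat"
        path.write_bytes(payload)
        env[f"DD_{dd_name}"] = str(path)
    return env


with tempfile.TemporaryDirectory() as tmp:
    dd_env = stage_files({"DATAIN": b"000001Alice"}, Path(tmp))
    staged_bytes = Path(dd_env["DD_DATAIN"]).read_bytes()
    # Environment a real run would pass to the compiled executable.
    run_env = {**os.environ, **dd_env}
```

Keying the cache on the source contents means an unchanged program is not recompiled, while any edit to the source produces a new cache entry.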
Handling Multiple Record Types
Real mainframe COBOL often has multiple 01-level records under one FD (header, detail, trailer). pobol discovers all of them:
prog = load("check_disbursement.cbl")
# See all discovered layouts
for file_name, layouts in prog.record_layouts.items():
    for rec_name, copybook in layouts.items():
        print(f"{file_name}/{rec_name}: {len(copybook.fields)} fields")
# TXN-DATA-FILE/TXN-DATA-RECORD: 2 fields
# TXN-DATA-FILE/TXN-DATA-HDR-RECORD: 5 fields
# TXN-DATA-FILE/TXN-DATA-DETAIL-RECORD: 60 fields
# TXN-DATA-FILE/TXN-DATA-TRLR-RECORD: 5 fields
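# For multi-record files, use raw_files with pre-encoded bytes.
# Pick the copybook for each record type from the discovered layouts:
layouts = prog.record_layouts["TXN-DATA-FILE"]
header_copybook = layouts["TXN-DATA-HDR-RECORD"]
detail_copybook = layouts["TXN-DATA-DETAIL-RECORD"]
trailer_copybook = layouts["TXN-DATA-TRLR-RECORD"]
# header_dict, detail_dicts, and trailer_dict are your own record data.
header_bytes = header_copybook.encode(header_dict)
detail_bytes = detail_copybook.encode_many(detail_dicts)
trailer_bytes = trailer_copybook.encode(trailer_dict)
result = prog(raw_files={
    "TXN-DATA-FILE": header_bytes + detail_bytes + trailer_bytes,
})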
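Going the other way, a multi-record byte stream has to be split and dispatched by record type before it can be decoded. The sketch below shows one common approach and makes assumptions that may not hold for your files: every record has the same length, there are no line separators, and the first byte is a type code (H, D, or T). The helper names and the sample layout are hypothetical.

```python
def split_records(raw: bytes, record_length: int) -> list[bytes]:
    """Cut a byte stream into fixed-length records."""
    if len(raw) % record_length:
        raise ValueError("payload is not a whole number of records")
    return [raw[i:i + record_length] for i in range(0, len(raw), record_length)]


def classify(record: bytes) -> str:
    """Map the leading type byte to a record kind (hypothetical codes)."""
    kinds = {"H": "header", "D": "detail", "T": "trailer"}
    return kinds.get(chr(record[0]), "unknown")


# Four 10-byte records: one header, two details, one trailer.
payload = b"H20240101 " + b"D000001500" + b"D000002750" + b"T000000002"
kinds_seen = [classify(rec) for rec in split_records(payload, 10)]
```

Once each record is classified, it can be decoded with the copybook for that record type.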
API Reference
load(source, **kwargs) → CobolProgram
Compile a COBOL source file and return a callable.
| Parameter | Type | Default | Description |
|---|---|---|---|
| source | str \| Path | required | Path to .cob/.cbl file |
| inputs | dict[str, Copybook] | None | Override auto-discovered input layouts |
| outputs | dict[str, Copybook] | None | Override auto-discovered output layouts |
| dialect | str | None | GnuCOBOL -std= dialect |
| extra_flags | list[str] | None | Extra cobc flags |
| strip_mainframe | bool | True | Strip mainframe format artifacts |
| rewrite_assigns | bool | True | Rewrite ASSIGN TO for env-var mapping |
CobolProgram.__call__(**kwargs) → CobolResult
| Parameter | Type | Description |
|---|---|---|
| stdin | str | Data for ACCEPT/stdin |
| env | dict | Extra environment variables |
| timeout | float | Execution timeout (default 30s) |
| raw_files | dict[str, bytes] | Pre-encoded file data by SELECT name |
| **file_inputs | list[dict] | Input data as list of dicts, keyed by SELECT name |
CobolResult
| Attribute | Type | Description |
|---|---|---|
| .stdout | str | Program stdout (DISPLAY output) |
| .stderr | str | Program stderr |
| .return_code | int | Exit code |
| .outputs | dict[str, list[dict]] | Decoded output files |
| .output_file | list[dict] | Shorthand for .outputs["OUTPUT-FILE"] |
parse_cobol_source(source, **kwargs) → ParsedSource
Parse without compiling. Returns discovered files, record layouts, and cleaned source.
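To give a feel for what file discovery involves, here is a rough regex-based sketch that maps SELECT names to their ASSIGN targets. It does not use or reproduce parse_cobol_source: the function `discover_selects` is hypothetical, and a real parser also has to handle comments, continuation lines, literals, and FD entries.

```python
import re

# SELECT [OPTIONAL] <file-name> ASSIGN [TO] <target>, possibly across lines.
_SELECT = re.compile(
    r"SELECT\s+(?:OPTIONAL\s+)?([A-Z0-9-]+)\s+ASSIGN\s+(?:TO\s+)?([A-Z0-9-]+)",
    re.IGNORECASE,
)


def discover_selects(source: str) -> dict[str, str]:
    """Return {SELECT name: ASSIGN target} for each file-control entry found."""
    return {name.upper(): target.upper() for name, target in _SELECT.findall(source)}


sample = """
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO DATAIN.
           SELECT OUTPUT-FILE
               ASSIGN TO DATAOUT
               ORGANIZATION IS SEQUENTIAL.
"""
selects = discover_selects(sample)
```

For the sample above, `selects` pairs INPUT-FILE with DATAIN and OUTPUT-FILE with DATAOUT.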
Supported PIC Clauses
| PIC | Python type | Notes |
|---|---|---|
| X(n), XX | str | Left-justified, space-padded |
| 9(n), 999 | int | Zero-padded |
| S9(n) | int | Trailing sign overpunch |
| 9(n)V9(m), 9(5)V99 | float | Implied decimal (mixed forms supported) |
| S9(n)V9(m) | float | Signed implied decimal |
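The encodings in the table can be illustrated with a small standalone sketch. It is not pobol's Copybook code, and the helper names are hypothetical. One assumption deserves emphasis: the overpunch table used here ({ and A-I for positive, } and J-R for negative) is the convention derived from EBCDIC zoned decimal; the bytes a given compiler actually writes depend on its settings, so check them against your own data.

```python
from decimal import Decimal

# Assumed overpunch tables: index = value of the last digit.
_POSITIVE = "{ABCDEFGHI"
_NEGATIVE = "}JKLMNOPQR"


def encode_alnum(value: str, length: int) -> str:
    """PIC X(n): left-justified, space-padded, truncated if too long."""
    return value[:length].ljust(length)


def encode_number(value, int_digits: int, frac_digits: int = 0, signed: bool = False) -> str:
    """PIC 9(n)[V9(m)] / S9(n)[V9(m)]: zero-padded digits, implied decimal."""
    scaled = int((Decimal(str(value)) * (10 ** frac_digits)).to_integral_value())
    digits = str(abs(scaled)).zfill(int_digits + frac_digits)
    if len(digits) > int_digits + frac_digits:
        raise ValueError(f"{value!r} does not fit the picture")
    if signed:
        table = _NEGATIVE if scaled < 0 else _POSITIVE
        digits = digits[:-1] + table[int(digits[-1])]
    return digits


def decode_number(text: str, frac_digits: int = 0):
    """Reverse encode_number; returns int, or float when frac_digits > 0."""
    sign, last = 1, text[-1]
    if last in _POSITIVE:
        text = text[:-1] + str(_POSITIVE.index(last))
    elif last in _NEGATIVE:
        text, sign = text[:-1] + str(_NEGATIVE.index(last)), -1
    value = sign * int(text)
    return value / (10 ** frac_digits) if frac_digits else value


balance_field = encode_number(2500.00, 7, 2)      # "000250000"
signed_field = encode_number(-123, 5, signed=True)  # "0012L"
```

Scaling through Decimal keeps values such as 0.1 exact before the implied decimal point is dropped.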
Development
uv sync
uv run pytest -v
uv run python examples/demo.py
Prerequisites
- GnuCOBOL (cobc): brew install gnucobol / apt install gnucobol
- Python 3.11+
License
MIT