
Convert mainframe EBCDIC data into Unicode ASCII delimited text files

Project description

Mainframe EBCDIC Data Converter to ASCII

Description

A Python application that converts mainframe EBCDIC data into Unicode ASCII delimited text files.
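
To illustrate what the conversion involves at the character level, here is a minimal sketch using Python's built-in cp037 codec, which is the default code page for this tool (see the encodingName parameter). The sample bytes and the helper name decode_ebcdic are assumptions made for illustration; they are not part of ebcdic_parser.

```python
# Illustrative only: decode_ebcdic is a hypothetical helper, not part of ebcdic_parser.
def decode_ebcdic(raw: bytes, encoding: str = "cp037") -> str:
    """Decode EBCDIC bytes into a Python (Unicode) string using a standard codec."""
    return raw.decode(encoding)


# The word "HELLO" encoded in EBCDIC code page 037.
sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])
print(decode_ebcdic(sample))  # HELLO
```

The real tool also has to handle record layouts and non-text data types, which this sketch does not attempt.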

Features

  • Supported layouts

    1. Single schema
    2. Multi-schema fixed record length
    3. Multi-schema variable record length
    4. Single schema variable record length
  • Fixing anomalies in EBCDIC files

    1. Skip header
    2. Skip footer
    3. Remove invalid characters
  • Adding relationship keys

    1. parent-child
    2. parent-child-grandchild
  • Applicable encodings

    1. Python
    2. Java
  • Debug mode to troubleshoot layouts

  • Run the tool

    1. Pip package
    2. Command prompt
    3. python -m ebcdic_parser
  • Exclude data from final output for a given field based on conditions.

  • Remove a delimiter, carriage return and newline characters from final output for string fields.

Pip Usage

Installation

pip install ebcdic_parser

Sample

from ebcdic_parser.convert import run 

returnCode=run(r"D:\Projects\ebcdic-parser\tests\test_data\311_calls_for_service_requests_all_strings\311_calls_for_service_requests_sample.dat", 
               r"D:\Projects\test_project",
               r"D:\Projects\ebcdic-parser\tests\layout_repository\311_calls_for_service_requests_all_strings.json",
               outputDelimiter=',',
               logfolder=r'D:\Projects\test_project\log')

print(returnCode)

Run Function

returnCode=run(inputFile, 
               outputFolder,
               layoutFile,
               logfolder='',
               pythonEncoding=True,
               encodingName='cp037',
               outputDelimiter='\t',
               outputFileExtension='.txt',
               ignoreConversionErrors=False,
               groupRecords=False,
               groupRecordsLevel2=False,
               verbose=True,
               debug=False,
               cliMode=False)
  • inputFile - input EBCDIC file path. Mandatory parameter. Absolute and relative paths are acceptable.
  • outputFolder - output folder to store delimited files. Mandatory parameter. Absolute and relative paths are acceptable.
  • layoutFile - layout file path. Mandatory parameter. Absolute and relative paths are acceptable.
  • outputDelimiter - output text file delimiter. Optional parameter. Default value is \t.
  • outputFileExtension - output text file extension. Optional parameter. Default value is .txt.
  • ignoreConversionErrors - ignore any conversion error. Optional parameter. Default value is False.
  • logfolder - output folder to store log file. Optional parameter. Default value is the current folder. Absolute and relative paths are acceptable.
  • pythonEncoding - use Python encoding rather than Java. Optional parameter. Default value is True.
  • encodingName - code page name to encode characters (Python or Java). Optional parameter. Default value is cp037.
  • groupRecords - create relationships between records. Optional parameter. Default value is False.
  • groupRecordsLevel2 - create relationships between records for level 2. Optional parameter. Default value is False.
  • verbose - show extended information on screen. Optional parameter. Default value is True.
  • debug - show debug information. Optional parameter. Default value is False.
  • cliMode - flag indicating whether the tool is run from the command prompt rather than as an installed Pip package. Optional parameter. Default value is False.
  • stripDelimiterValues - remove any delimiter (outputDelimiter) and carriage return/newline characters found in string type field values. Optional parameter. Default value is False.
  • returnCode - exit codes: 0 - successful completion, 1 - completion with any error
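
The effect of stripDelimiterValues can be pictured with the following sketch. It approximates the behavior described above and is not the package's actual code: the function name strip_delimiter_values is hypothetical, and the sketch assumes the unwanted characters are simply deleted.

```python
# Hypothetical helper that mimics the documented idea behind stripDelimiterValues.
def strip_delimiter_values(value: str, delimiter: str = "\t") -> str:
    """Remove the output delimiter, carriage returns and newlines from a string field."""
    for unwanted in (delimiter, "\r", "\n"):
        value = value.replace(unwanted, "")
    return value


# A comma inside a field would otherwise break a comma-delimited output row.
print(strip_delimiter_values("Main St, Apt 4\r\n", delimiter=","))  # Main St Apt 4
```

Cleaning values this way keeps each output record on one line with a predictable number of columns, at the cost of altering the original field content.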

python -m ebcdic_parser Usage

python -m ebcdic_parser --inputfile "D:\Projects\ebcdic-parser\tests\test_data\311_calls_for_service_requests_all_strings\311_calls_for_service_requests_sample.dat" --outputfolder "D:\Projects\test_project" --layoutfile "D:\Projects\ebcdic-parser\tests\layout_repository\311_calls_for_service_requests_all_strings.json"  --outputdelimiter "," --logfolder "D:\Projects\test_project\log"

echo %ERRORLEVEL%
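
When the command-line form needs to be driven from another Python script, one possible approach is to build the argument list and inspect the process return code, mirroring the echo %ERRORLEVEL% step above. This is only a sketch: build_command and run_conversion are hypothetical helpers, the file names are placeholders, and the option names passed in are expected to match the flags listed under Arguments.

```python
import subprocess
import sys


def build_command(input_file, output_folder, layout_file, **options):
    """Build the argument list for `python -m ebcdic_parser` (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "ebcdic_parser",
        "--inputfile", input_file,
        "--outputfolder", output_folder,
        "--layoutfile", layout_file,
    ]
    # Optional flags, e.g. outputdelimiter="," becomes: --outputdelimiter ,
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


def run_conversion(cmd):
    """Run the command and return its exit code (requires ebcdic_parser to be installed)."""
    return subprocess.run(cmd).returncode


# Placeholder file names; nothing is executed here, the command is only printed.
cmd = build_command("sample.dat", "out", "layout.json", outputdelimiter=",")
print(cmd[1:])
```

Calling run_conversion(cmd) would then return 0 on successful completion and 1 on completion with any error, matching the documented exit codes.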

Arguments

[-h]
--inputfile "input file path"
--outputfolder "output folder"
--layoutfile "layout file"
[--outputdelimiter [delimiter]]
[--outputfileextension [extension]]
[--ignoreconversionerrors [yes/no]]
[--logfolder [log folder]]
[--pythonencoding [yes/no]]
[--encodingname [encoding name]]
[--grouprecords [yes/no]]
[--grouprecordslevel2 [yes/no]]
[--verbose [yes/no]]
[--debug [yes/no]]
[--stripdelimitervalues [yes/no]]

Arguments:
  -h, --help            show this help message and exit
  --inputfile "input file path"
                        Input EBCDIC file path
  --outputfolder "output folder"
                        Output folder to store delimited files
  --layoutfile "layout file"
                        Layout file path
  --outputdelimiter [delimiter]
                        output text file delimiter
  --outputfileextension [extension]
                        output text file extension
  --ignoreconversionerrors [yes/no]
                        ignore any conversion error
  --logfolder [log folder]
                        Output folder to store log file
  --pythonencoding [yes/no]
                        use Python encoding rather than Java
  --encodingname [encoding name]
                        Code page name to encode characters (Python or Java)
  --grouprecords [yes/no]
                        create relationships between records
  --grouprecordslevel2 [yes/no]
                        create relationships between records for level 2
  --verbose [yes/no]    show information on screen
  --debug [yes/no]      show debug information
  --stripdelimitervalues [yes/no]
                        strip delimiter characters from field values

Exit codes: 0 - successful completion, 1 - completion with any error

Java Encodings

No special installation is required if you plan to use only Python encodings.

To use Java encodings, the javabridge Python library must be installed. The code contains a constant for including the javabridge library.
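
Before switching to Java encodings, it may help to confirm that javabridge can be imported in the current environment. The sketch below uses only the standard library; is_module_available is a hypothetical helper and is not part of ebcdic_parser.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


if is_module_available("javabridge"):
    print("javabridge found: Java encodings can be considered")
else:
    print("javabridge not found: stay with Python encodings (pythonEncoding=True)")
```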

Release Notes

Release 3.4.0

  • Exclude data from final output for a given field based on conditions.
  • Remove a delimiter, carriage return and newline characters from final output for string fields.

Release 3.3.0

  • Return output to default device after completion.
  • Modified run() to pass returnCode to main() and as the final exit code.

Release 3.2.2

  • Automate version management.
  • Add CHANGES.md documentation.

Release 3.2.1

  • Added dedicated PyPI README file.
  • Refined documentation.

Release 3.2.0

  • Provide a command-line interface for ebcdic_parser package running via python -m ebcdic_parser.

Release 3.1.0

  • Fixed relative path parameters in command prompt mode.
  • Refined documentation.

Release 3.0.0

  • Added to PyPI.

Release 2.4.1

  • Fixed issue with "Multi-schema fixed record length" layout type.

Release 2.4.0

  • Added "Single schema variable record length" layout type.

Release 2.3.0

  • Implemented debug mode.

Release 2.1.1

  • Improved sign parsing in packedDecimal data type.
  • Added sign parsing to decimal data type.
  • Developed functional tests.

Release 2.1.0

  • The first publicly available release.
