Cobol IBM Mainframe syntax lexer for Pygments
Project description
=============================================
Cobol IBM Mainframe syntax lexer for Pygments
=============================================
**This package contains a Pygments Lexer for mainframe cobol.**
**The lexer parses the Enterprise Cobol dialect (V3R4) for z/OS, including utilizing embedded Db2/Sql, Cics and DLi**
.. _onlinePygmentsLexer: http://datim.fr/cobtools/pygments/
.. contents:: Contents
:depth: 5
mainframe cobol coding form
===========================
Many early programming languages, including PL/1, Fortran, Cobol and the various IBM assembler languages,
used only the first 7-72 columns of a 80-column card
+----------+--------------------------------------------------------------------------+
| Columns | |
+==========+==========================================================================+
| 1- 6 |Tags, Remarks or Sequence numbers identifying pages or lines of a program |
+----------+--------------------------------------------------------------------------+
| 7 | - ``*`` (asterisk) designates entire line as comment |
| | - ``/`` (slash) forces page break when printing source listing |
| | - ``-`` (dash) to indicate continuation of nonnumeric literal |
| | - ``D`` to indicate debug line cobol statements |
+----------+--------------------------------------------------------------------------+
| 8 - 72 |COBOL program statements, divided into two areas : |
| | - Area A : columns 8 to 11 |
| | - Area B : columns 12 to 72 |
+----------+--------------------------------------------------------------------------+
| 73 - 80 | Tags, Remarks or Sequence numbers (often garbage...) |
+----------+------------+-------------------------------------------------------------+
Division, section and paragraph-names must all begin in Area A and end with a period.
``CBL/PROCESS`` directives statement can start in columns 1 through 70
Installation
============
The lexer is available as a Pip package:
``$ sudo pip install pygments_ibm_cobol_lexer``
Or using easy_install:
``$ sudo easy_install pygments_ibm_cobol_lexer``
Usage
=====
After installation the ibmcobol Lexer and ibmcobol Style automatically registers itself for files with the ".cbl" extensions.
Therefore, cmdline usage is easy:
+ Ascii input :
``pygmentize -O full,style=ibmcobol,encoding=latin1 -o HORREUR.html HORREUR.ascii.cbl``
+ Ebcdic input (in this case it's necessary to specify outencoding value):
``pygmentize -O full,style=ibmcobol,encoding=cp1147,outencoding=latin1 -o COB001.html COB001.cp1147.cbl``
Or as library usage:
::
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments_ibm_cobol_lexer import IBMCOBOLLexer, IBMCOBOLStyle
my_code = open("cobol_ebcdic.cbl",'rb').read()
highlight(my_code,IBMCOBOLLexer(encoding='cp1140'),
HtmlFormatter(style=IBMCOBOLStyle, full=True),
open('test.html','w'))
Also see the ``pygments_tests/`` directory
You can test this lexer online here: onlinePygmentsLexer_
About cp1147
============
I have files coded IBM1147 (EBCDIC french + euro sign), I was forced to write my own codec ``cp1147``, very close to the ``cp500`` (Canada, Belgium), it diverges on the characters "@\°{}§ùµ£à[€`¨#]~éè¦ç" :
This codec is available as a Pip package:
``$ sudo pip install cp1147``
Changelog
=========
1.2 - (2013-11-25)
- reordering variable,number,operator regex in cics/dli/sql
- Tokenize cobol special-register
- codec module cp1147 removed
- change in DB2 and COBOL keywords
- if encoding is in ('cp037','cp297','cp1140','cp1047','cp1147','cp500'), it's managed as 'latin-1'
1.1 - (2012-11-19)
Minor Fix + EBCDIC enhancements:
- Fix : float regex detection before integer detection
- Add inline-commentaire ``*>`` (not the IBM default)
- Change cics/dli keywords color...
- Extend CICS_KEYWORDS, remove EJECT/SKIP from COBOL_KEYWORDS (treated as comments)
- each ASCII input lines is padded to 80 columns
- Add EBCDIC features:
* add my own french codec cp1147
* if EBCDIC encoding is passed (cp500,cp1140,...) or detected,convert the binary input raw text in 80 columns fixed lines
* ``encoding=chardet`` (slowly) does not detect EBCDIC chart,it's override with ``encoding=guess``
* "guess EBCDIC" is defaulted to ``self.encoding='cp500'``
1.0 - (2012-11-12)
Initial release.
Cobol IBM Mainframe syntax lexer for Pygments
=============================================
**This package contains a Pygments Lexer for mainframe cobol.**
**The lexer parses the Enterprise Cobol dialect (V3R4) for z/OS, including utilizing embedded Db2/Sql, Cics and DLi**
.. _onlinePygmentsLexer: http://datim.fr/cobtools/pygments/
.. contents:: Contents
:depth: 5
mainframe cobol coding form
===========================
Many early programming languages, including PL/1, Fortran, Cobol and the various IBM assembler languages,
used only the first 7-72 columns of a 80-column card
+----------+--------------------------------------------------------------------------+
| Columns | |
+==========+==========================================================================+
| 1- 6 |Tags, Remarks or Sequence numbers identifying pages or lines of a program |
+----------+--------------------------------------------------------------------------+
| 7 | - ``*`` (asterisk) designates entire line as comment |
| | - ``/`` (slash) forces page break when printing source listing |
| | - ``-`` (dash) to indicate continuation of nonnumeric literal |
| | - ``D`` to indicate debug line cobol statements |
+----------+--------------------------------------------------------------------------+
| 8 - 72 |COBOL program statements, divided into two areas : |
| | - Area A : columns 8 to 11 |
| | - Area B : columns 12 to 72 |
+----------+--------------------------------------------------------------------------+
| 73 - 80 | Tags, Remarks or Sequence numbers (often garbage...) |
+----------+------------+-------------------------------------------------------------+
Division, section and paragraph-names must all begin in Area A and end with a period.
``CBL/PROCESS`` directives statement can start in columns 1 through 70
Installation
============
The lexer is available as a Pip package:
``$ sudo pip install pygments_ibm_cobol_lexer``
Or using easy_install:
``$ sudo easy_install pygments_ibm_cobol_lexer``
Usage
=====
After installation the ibmcobol Lexer and ibmcobol Style automatically registers itself for files with the ".cbl" extensions.
Therefore, cmdline usage is easy:
+ Ascii input :
``pygmentize -O full,style=ibmcobol,encoding=latin1 -o HORREUR.html HORREUR.ascii.cbl``
+ Ebcdic input (in this case it's necessary to specify outencoding value):
``pygmentize -O full,style=ibmcobol,encoding=cp1147,outencoding=latin1 -o COB001.html COB001.cp1147.cbl``
Or as library usage:
::
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments_ibm_cobol_lexer import IBMCOBOLLexer, IBMCOBOLStyle
my_code = open("cobol_ebcdic.cbl",'rb').read()
highlight(my_code,IBMCOBOLLexer(encoding='cp1140'),
HtmlFormatter(style=IBMCOBOLStyle, full=True),
open('test.html','w'))
Also see the ``pygments_tests/`` directory
You can test this lexer online here: onlinePygmentsLexer_
About cp1147
============
I have files coded IBM1147 (EBCDIC french + euro sign), I was forced to write my own codec ``cp1147``, very close to the ``cp500`` (Canada, Belgium), it diverges on the characters "@\°{}§ùµ£à[€`¨#]~éè¦ç" :
This codec is available as a Pip package:
``$ sudo pip install cp1147``
Changelog
=========
1.2 - (2013-11-25)
- reordering variable,number,operator regex in cics/dli/sql
- Tokenize cobol special-register
- codec module cp1147 removed
- change in DB2 and COBOL keywords
- if encoding is in ('cp037','cp297','cp1140','cp1047','cp1147','cp500'), it's managed as 'latin-1'
1.1 - (2012-11-19)
Minor Fix + EBCDIC enhancements:
- Fix : float regex detection before integer detection
- Add inline-commentaire ``*>`` (not the IBM default)
- Change cics/dli keywords color...
- Extend CICS_KEYWORDS, remove EJECT/SKIP from COBOL_KEYWORDS (treated as comments)
- each ASCII input lines is padded to 80 columns
- Add EBCDIC features:
* add my own french codec cp1147
* if EBCDIC encoding is passed (cp500,cp1140,...) or detected,convert the binary input raw text in 80 columns fixed lines
* ``encoding=chardet`` (slowly) does not detect EBCDIC chart,it's override with ``encoding=guess``
* "guess EBCDIC" is defaulted to ``self.encoding='cp500'``
1.0 - (2012-11-12)
Initial release.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file pygments_ibm_cobol_lexer-1.2.tar.gz
.
File metadata
- Download URL: pygments_ibm_cobol_lexer-1.2.tar.gz
- Upload date:
- Size: 78.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | f84b236d812126fe15fbb2a6d8a5acda7e4cc0cfbbb267a478ece6588e1385ec |
|
MD5 | d1bdc9e544e4986f3f214e6da4413701 |
|
BLAKE2b-256 | f7b91973103f672c7379caf668dd972ab4ebf40821273ecb69ec02128dd5aac6 |