Parse VBA grammar using ANTLR4 and python
antlr4-vba-parser
Navigate antlr VBA Parse Trees in python.
This python package provides an interface to the antlr4 tooling and allows parsing and lexing of VBA grammar.
>>> from antlr4_vba_parser.vba_parser import Antlr4VbaParser
>>> parsed = Antlr4VbaParser("""
... SUB square(x)
... DIM y: REM Some comment
... y = x * x ' same as x**2
... END SUB
... """) # also accepts a filepath
>>> from pprint import pprint
>>> pprint(parsed)
('(startRule (module (endOfLine \\n) (moduleBody (moduleBodyElement (subStmt '
'SUB (ambiguousIdentifier square) (argList ( (arg (ambiguousIdentifier x)) '
')) (endOfStatement (endOfLine \\n )) (block (blockStmt (variableStmt DIM '
'(variableListStmt (variableSubStmt (ambiguousIdentifier y))))) '
'(endOfStatement : (endOfLine (remComment REM Some comment)) (endOfLine '
'\\n )) (blockStmt (letStmt (implicitCallStmt_InStmt '
'(iCS_S_VariableOrProcedureCall (ambiguousIdentifier y))) = (valueStmt '
'(valueStmt (implicitCallStmt_InStmt (iCS_S_VariableOrProcedureCall '
'(ambiguousIdentifier x)))) * (valueStmt (implicitCallStmt_InStmt '
'(iCS_S_VariableOrProcedureCall (ambiguousIdentifier x))))))) (endOfStatement '
"(endOfLine (comment ' same as x**2)) (endOfLine \\n))) END SUB)) "
'(endOfLine \\n))) <EOF>)')
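The printed tree is a LISP-style string in which each rule node is written as an opening parenthesis immediately followed by the rule name. As a rough sketch of how such text could be inspected, the snippet below counts rule names in a shortened copy of the output above. It assumes the tree text is available as a plain string; the helper name count_rules is hypothetical and not part of this package, and the regular expression is only a heuristic (comment text containing parentheses could confuse it).

```python
import re
from collections import Counter

# Shortened copy of the tree text shown above (the letStmt for "y = x * x").
tree_text = (
    "(blockStmt (letStmt (implicitCallStmt_InStmt "
    "(iCS_S_VariableOrProcedureCall (ambiguousIdentifier y))) = (valueStmt "
    "(valueStmt (implicitCallStmt_InStmt (iCS_S_VariableOrProcedureCall "
    "(ambiguousIdentifier x)))) * (valueStmt (implicitCallStmt_InStmt "
    "(iCS_S_VariableOrProcedureCall (ambiguousIdentifier x)))))))"
)

# Heuristic: a rule node starts with "(" directly followed by its name.
RULE_NAME = re.compile(r"\((\w+)")


def count_rules(text):
    """Return a Counter mapping rule name to number of occurrences."""
    return Counter(RULE_NAME.findall(text))


counts = count_rules(tree_text)
```

Here counts["valueStmt"] gives the number of value expressions in the statement, which is one quick way to get a feel for the shape of a tree before writing a proper listener.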
Installation
antlr4_vba_parser itself is a pure python package, but depends on a java runtime in order to run.
The ANTLR4 jar needed to perform the parsing/lexing is included in the package distribution and is bundled from third-party sources at the time of packaging with setup.py build.
To install, simply try:
pip install antlr4_vba_parser
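Because the package depends on a java runtime, it can be handy to confirm that one is reachable before parsing anything. The following is a minimal sketch using only the python standard library; the helper names find_java and java_version_banner are hypothetical and not part of this package, and the check assumes the runtime is exposed as a java executable on PATH.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


def java_version_banner(java_path):
    """Return the text printed by `java -version` for the given executable."""
    result = subprocess.run(
        [java_path, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # `java -version` writes its banner to stderr rather than stdout.
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("No java executable found on PATH; install a java runtime first.")
    else:
        print(java_version_banner(java_path))
```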
Development
To set up a development environment, first create and activate a new virtual or conda environment, then run the following:
git clone https://github.com/Liam-Deacon/antlr4-vba-parser
cd antlr4-vba-parser
pip install -r requirements-dev.txt -r requirements-test.txt -r requirements.txt
python setup.py build_antlr4 # needed to generate python bindings
pip install -e .
This will install the package in development mode. Note that if you have forked the repo, you should change the URL as appropriate.
Documentation
Documentation can be found within the docs/ directory. This project uses sphinx to autogenerate API documentation by scraping python docstrings.
To generate the HTML documentation, simply do the following:
cd docs
make html
Contribution Guidelines
Contributions are extremely welcome and highly encouraged. To help with consistency, please consider the following areas before submitting a PR for review:
- Use autopep8 -a -a -i -r . to run over any modified files to ensure basic pep8 conformance, allowing the code to be read in a style expected for most python projects.
- New or changed functionality should be tested, and running pytest should pass.
- Try to document any new or changed functionality. Note: this project uses numpydoc for its docstring documentation style.
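For contributors unfamiliar with the numpydoc style mentioned above, a docstring might look roughly like the sketch below. The function count_lines is purely illustrative and does not exist in this package; only the section layout (Parameters, Returns, Examples with underlines) is the point.

```python
def count_lines(source):
    r"""Count the non-blank lines in a piece of VBA source text.

    Parameters
    ----------
    source : str
        VBA source code as a single string.

    Returns
    -------
    int
        Number of lines containing at least one non-whitespace character.

    Examples
    --------
    >>> count_lines("SUB f()\n\nEND SUB")
    2
    """
    # Blank and whitespace-only lines are skipped.
    return sum(1 for line in source.splitlines() if line.strip())
```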
License
Released under the BSD license.
TODO
This package is mostly a proof of concept and as such there are a number of areas to add to, fix and improve.
- Create listener(s) capable of capturing contextual information and creating a JSON-friendly dictionary output.
- Produce a simple script that turns the above into a command line tool.
- Contribute to oletools.vba to hopefully extend capabilities using this package.
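To make the first TODO item more concrete, here is one possible shape for a listener that builds a JSON-friendly dictionary. This is only a sketch under stated assumptions: Node is a hypothetical stand-in for a parse tree rule node, and DictBuilder and walk are illustrative names, not part of this package or of the ANTLR4 runtime. A real implementation would hook into the generated listener classes instead.

```python
import json
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Hypothetical stand-in for a parse tree rule node."""
    rule: str
    children: List[Union["Node", str]] = field(default_factory=list)


class DictBuilder:
    """Listener-style collector that builds nested dicts on enter/exit."""

    def __init__(self):
        self.root = None
        self._stack = []

    def enter_rule(self, node):
        entry = {"rule": node.rule, "children": []}
        if self._stack:
            # Attach to the rule currently being visited.
            self._stack[-1]["children"].append(entry)
        else:
            self.root = entry
        self._stack.append(entry)

    def visit_token(self, text):
        # Plain tokens are stored as strings under the current rule.
        self._stack[-1]["children"].append(text)

    def exit_rule(self, node):
        self._stack.pop()


def walk(listener, node):
    """Depth-first walk that fires the listener callbacks."""
    listener.enter_rule(node)
    for child in node.children:
        if isinstance(child, Node):
            walk(listener, child)
        else:
            listener.visit_token(child)
    listener.exit_rule(node)


# Hand-built tree mirroring the "DIM y" statement from the example output.
tree = Node("variableStmt", [
    "DIM",
    Node("variableListStmt", [
        Node("variableSubStmt", [Node("ambiguousIdentifier", ["y"])]),
    ]),
])

builder = DictBuilder()
walk(builder, tree)
as_json = json.dumps(builder.root)
```

Keeping the output to dicts, lists and strings means json.dumps works without a custom encoder, which would also suit the command line tool mentioned above.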
Acknowledgements
- Andrew Lockhart for the initial idea of combining ANTLR4 and python to handle VBA grammar