Skip to main content

jAST: Analyzing and Modifying Java ASTs with Python

Project description

jAST: Analyzing and Modifying Java ASTs with Python

Python Version GitHub release PyPI Build Status Coverage Status Licence Code style: black

The jast module helps Python applications to process trees of the Java abstract syntax grammar.

An abstract syntax tree can be generated by using the parse() function from this module. The result will be a tree of objects whose classes all inherit from jast.JAST.

A modified abstract syntax tree can be written back to Java source code by using the unparse() function. This function takes a tree of objects and returns a string with the Java source code.

Additionally various helper functions are provided that make working with the trees simpler. The main intention of the helper functions and this module in general is to provide an easy-to-use interface for libraries that work tightly with Java.

Formal Specification

A formal specification of the Java abstract syntax tree can be represented in abstract syntax description language (ASDL) as follows:

-- builtin types are:
-- identifier, int, float, bool, char, string

module Java
{
    mod = CompilationUnit(Package? package, Import* imports, declaration* body)
        | ModularUnit(Import* imports, Module body)

    declaration = EmptyDecl()
        | CompoundDecl(declaration* body)
        | Package(Annotation* annotations, qname name)
        | Import(bool? static, qname name, bool? on_demand)
        | Module(bool? open, qname name, directive* directives)
        | Field(modifier* modifiers, jtype type, declarator+ declarators)
        | Method(modifier* modifiers, typeparams? type_params, Annotation* annotations,
                 jtype return_type, identifier id, params? parameters, dim* dims,
                 qname* throws, Block? body)
        | Constructor(modifier* modifiers, typeparams? type_params, identifier id,
                      params? parameters, Block body)
        | AnnotationMethod(modifier* modifiers, jtype type, identifier id,
                           (elementarrayinit | Annotation | expr)? default_value)
        | Initializer(Block body, bool? static)
        | Class(modifier* modifiers, identifier id, typeparams? type_params,
                jtype? extends, jtype* implements, jtype* permits, declaration* body)
        | Enum(modifier* modifiers, identifier id, jtype* implements,
               enumconstant* constants, declaration* body)
        | Interface(modifier* modifiers, identifier id, typeparams? type_params,
                    jtype? extends, jtype* implements, declaration* body)
        | AnnotationDecl(modifier* modifiers, identifier id, declaration* body)
        | Record(modifier* modifiers, identifier id, typeparams? type_params,
                 recordcomponent* components, jtype* implements, declaration* body)
        attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

    directive = Requires(modifier* modifiers, qname name)
        | Exports(qname name, qname to)
        | Opens(qname name, qname to)
        | Uses(qname name)
        | Provides(qname name, qname with_)
        attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

    stmt = Empty()
        | Block(stmt* body)
        | Compound(stmt* body)

        | LocalType((Class | Interface | Record) decl)
        | LocalVariable(modifier* modifiers, jtype type, declarator+ variables)

        | Labeled(identifier label, stmt body)

        | If(expr test, stmt body, stmt? orelse)
        | Switch(expr value, switchblock body)
        | While(expr test, stmt body)
        | DoWhile(stmt body, expr test)
        | For((expr* | LocalVariable?) init, expr? test, expr* update, stmt body)
        | ForEach(modifier* modifiers, jtype type,
                  identifier id, expr iter, stmt body)
        | Try(Block body, catch* catches, Block? final)
        | TryWithResources((resource | qname)+ resources,
                           Block body, catch* catches, Block? final)
        | Assert(expr test, expr? msg)
        | Throw(expr exc)
        | Expr(expr value)
        | Return(expr? value)
        | Yield(expr value)
        | Break(identifier? label)
        | Continue(identifier? label)
        | Synch(expr lock, Block body)
        attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

    expr = Lambda((identifier | identifier* | params) args, (expr | Block) body)
        | Assign(expr target, operator? op, expr value)
        | IfExp(expr test, expr body, expr orelse)
        | BinOp(expr left, operator op, expr right)
        | InstanceOf(expr value, (jtype | pattern) type)
        | UnaryOp(unaryop op, expr operand)
        | PostOp(expr operand, operator op)
        | Cast(Annotation* annotations, typebound type, expr value)
        | NewObject(typeargs? type_args, jtype type, expr* args, declaration* body)
        | NewArray(jtype type, expr* expr_dims, dim* dims, arrayinit? initializer)
        | SwitchExp(expr value, switchexprule* rules)
        | This()
        | Super(typeargs? type_args, identifier? id)
        | Constant(literal value)
        | Name(identifier id)
        | ClassExpr(jtype type)
        | ExplicitGenericInvocation(typeargs? type_args, expr value)
        | Subscript(expr value, expr index)
        | Member(expr value, expr member)
        | Call(expr func, expr* args)
        | Reference((expr | jtype ) type, typeargs? type_args, identifier? id, bool? new)
        | Match(jtype type, identifier id)
        attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

    operator = Or() | And() | BitOr() | BitXor() | BitAnd() | Eq() | NotEq()
        | Lt() | LtE() | Gt() | GtE() | LShift() | RShift() | URShift()
        | Add() | Sub() | Mult() | Div() | Mod()

    unaryop = PreInc() | PreDec() | UAdd() | USub() | Invert() | Not()

    postop = PostInc() | PostDec()

    enumconstant = (Annotation* annotations, identifier id, expr* args, declaration* body)

    recordcomponent = (jtype type, identifier id)

    switchlabel = Case(expr guard) | DefaultCase()
    switchgroup = (switchlabel* labels, stmt* body)
    switchblock = (switchgroup* groups, switchlabel* labels)

    catch = (modifier* modifiers, qname+ excs, identifier id, Block body)

    resource = (modifier* modifiers, jtype type, declarator variable)

    switchexplabel = ExpCase() | ExpDefault()
    switchexprule = (switchexplabel label, (expr | guardedpattern)* cases,
                     bool arrow, stmt* body)

    arrayinit = ((expr | arrayinit)* values)

    receiver = (jtype type, identifier* identifiers)
    param = (modifier* modifiers, jtype type, variabledeclaratorid id)
    arity = (modifier* modifiers, jtype type, Annotation* annotations, variabledeclaratorid id)
    params = (receiver? receiver_param, (param | arity)* parameters)

    literal = IntLiteral(int value, bool? long) | FloatLiteral(float value, bool? double)
        | BoolLiteral(bool value) | CharLiteral(char value) | StringLiteral(string value)
        | TextBlock(string* value) | NullLiteral()

    modifier = Abstract() | Default() | Final() | Native() | NonSealed() | Private()
        | Protected() | Public() | Sealed() | Static() | Strictfp() | Synchronized()
        | Transient() | Transitive() | Volatile()
        | Annotation(qname name, (elementvaluepair | elementarrayinit | Annotation | expr)* elements)


    elementvaluepair = (identifier id, (elementarrayinit | Annotation | expr) value)
    elementarrayinit = ((elementarrayinit | Annotation | expr)* values)

    jtype = Void() | Var() | Boolean() | Byte() | Short()
        | Int() | Long() | Char() | Float() | Double()
        | Wildcard(Annotation* annotations, wildcardbound? bound)
        | Coit(Annotation* annotations, identifier id, typeargs? type_args)
        | ClassType(Annotation* annotations, Coit+ coits)
        | ArrayType(Annotation* annotations, jtype type, dim* dims)


    wildcardbound = (jtype type, bool? extends, bool? super_)

    typeargs = (jtype+ types)

    dim = (Annotation* annotations)

    variabledeclaratorid = (identifier id, dim* dims)
    declarator = (variabledeclaratorid id, expr? init)

    typebound = (Annotation* annotations, jtype+ types)
    typeparam = (Annotation* annotations, identifier id, typebound? bound)
    typeparams = (typeparam* parameters)

    pattern = (modifier* modifiers, jtype type, Annotation* annotations, identifier id)
    guardedpattern = (pattern value, expr* conditions)

    qname = (identifier+ identifiers)
}

Installation

The jast module can be installed from the Python Package Index (PyPI) by running the following command:

pip install java-ast

Depending on your system configuration you might need to use pip3:

pip3 install java-ast

Usage

jast provides a simple interface to parse Java source code and work with the resulting abstract syntax tree.

Parsing Java Source Code

The following example demonstrates how to parse a Java source file and print the names of all classes:

import jast

# Parse the Java source file
with open("HelloWorld.java") as file:
    tree = jast.parse(file.read())

The tree object is now a tree of objects that represent the Java source as an abstract syntax tree.

Visiting Nodes

The following code snippet demonstrates how to print the names of all classes in the tree:

# Print the names of all classes
class NameVisitor(jast.JNodeVisitor):
    def visit_identifier(self, node):
        print(node)


visitor = NameVisitor()
visitor.visit(tree)

The NameVisitor class is a simple visitor that prints the name of all Identifier nodes in the tree. The visit() method is called with the root node of the tree to start the traversal.

Modifying Nodes

The following code snippet demonstrates how to modify the tree:

# Modify the tree
class NameModifier(jast.JNodeTransformer):
    def visit_identifier(self, node):
        if node == "HelloWorld":
            return jast.identifier("HelloWorld2")
        return node


modifier = NameModifier()
tree = modifier.visit(tree)

The NameModifier class is a simple transformer that changes the name of the class HelloWorld to HelloWorld2. The visit() method is called with the root node of the tree to start the transformation.

Writing Java Source Code

The following code snippet demonstrates how to write the modified tree back to Java source code:

# Write the modified tree back to Java source code
with open("HelloWorld2.java", "w") as file:
    file.write(jast.unparse(tree))

The unparse() function takes the modified tree and returns a string with the Java source code.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

java_ast-1.0.2.tar.gz (139.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

java_ast-1.0.2-py3-none-any.whl (114.1 kB view details)

Uploaded Python 3

File details

Details for the file java_ast-1.0.2.tar.gz.

File metadata

  • Download URL: java_ast-1.0.2.tar.gz
  • Upload date:
  • Size: 139.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.5

File hashes

Hashes for java_ast-1.0.2.tar.gz
Algorithm Hash digest
SHA256 74dc06405d06eb35265cc9621f62d2d3013f59c9e3a82365731e62cd7de3405a
MD5 52c94e05058d381cb8ad5479c661aa1d
BLAKE2b-256 44979bd619db876e104b8b6677bf5545beda9849837dc7fc37e9c0812535e67f

See more details on using hashes here.

File details

Details for the file java_ast-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: java_ast-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 114.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.5

File hashes

Hashes for java_ast-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 8538f177169d9e2bb6d6c122f5221e1046e7aa47887982d7bd929ccdcf297256
MD5 d27528ee22a5b631746a7a5e51c4a1fc
BLAKE2b-256 ba7f109ee06c04be773cb2b9df6a1be67ab3eda2de7fb8a3338aa749c0896aee

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page