Static code analysis tool for Java repositories

Inspect4j

Python | License: MIT | Project Status: Active (the project has reached a stable, usable state and is being actively developed)

Student: Liru Qu
EPCC Supervisor: Steven Carlysle-Davies

Overview

Inspect4j is a static code analysis framework that automatically extracts metadata, documentation, and structural information from Java code repositories. It is built on the Javalang parser and is intended for software analysis, documentation generation, and code understanding.

Features

Given a Java project folder, Inspect4j will:

  • Code Structure Analysis:

    • Extract all classes, interfaces, and enums, together with their Javadoc comments and metadata
    • Analyze methods, constructors, and fields including their signatures and modifiers
    • Support nested and anonymous classes, lambda expressions, and local classes
    • Track inheritance hierarchies and interface implementations
  • Dependency Analysis:

    • Identify all import statements and their types (internal/external)
    • Process wildcard imports and static imports
    • Extract Java project dependencies from Maven (pom.xml) and Gradle (build.gradle) files
    • Classify dependencies by scope and build tool
  • Method Call Analysis:

    • Extract complete method call lists with proper type resolution
    • Handle method overloading and chain calls accurately
    • Support cross-file method resolution through imports
    • Track constructor calls and super method invocations
  • Annotations Support:

    • Extract all annotations at class, method, field, and parameter levels
    • Preserve annotation parameters and values
  • Project Metadata:

    • Extract project directory tree and file hierarchy
    • Detect and analyze software licenses
    • Extract README files and project documentation
    • Retrieve GitHub metadata when the repository contains a .git folder
  • Advanced Features:

    • Generate Abstract Syntax Trees (AST) in JSON format
    • Extract source code snippets for each analyzed element
    • Support both single file and directory-wide analysis
    • Generate HTML reports for better visualization

All metadata is extracted and stored as structured JSON files for easy integration with other tools.
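
To make the Maven part of the dependency analysis above more concrete, here is a minimal sketch that uses only the Python standard library. It is not Inspect4j's implementation: the function name `read_maven_dependencies`, the output keys, and the sample POM are assumptions made for this example, and it only handles POM files that declare the standard Maven namespace.

```python
import xml.etree.ElementTree as ET

# Maven POM files normally declare this default XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def read_maven_dependencies(pom_text):
    """Return the <dependency> entries of a pom.xml as a list of dicts.

    Hypothetical helper for illustration; Inspect4j's own output may differ.
    """
    root = ET.fromstring(pom_text)
    dependencies = []
    # Only <project>/<dependencies> is read, not <dependencyManagement>.
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        dependencies.append({
            "groupId": dep.findtext("m:groupId", default="", namespaces=POM_NS),
            "artifactId": dep.findtext("m:artifactId", default="", namespaces=POM_NS),
            "version": dep.findtext("m:version", default="", namespaces=POM_NS),
            # Maven treats a missing <scope> as "compile".
            "scope": dep.findtext("m:scope", default="compile", namespaces=POM_NS),
            "build_tool": "maven",
        })
    return dependencies


# Made-up POM used only to exercise the helper.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13.2</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>33.0.0-jre</version>
    </dependency>
  </dependencies>
</project>"""

for entry in read_maven_dependencies(SAMPLE_POM):
    print(entry["scope"], entry["groupId"] + ":" + entry["artifactId"])
```

Grouping the returned entries by their `scope` value gives a simple form of the scope classification mentioned in the feature list.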

Background

Inspect4j draws inspiration from Inspect4py, a successful static analysis framework for Python projects. While maintaining similar command-line interfaces and overall architecture for consistency, Inspect4j is built from the ground up to handle Java's unique language features:

  • Parser Differences: Uses Javalang instead of Python's native ast library
  • Language-Specific Features: Handles Java packages, imports, interfaces, annotations, nested structures and method overloading
  • Build System Integration: Supports Maven and Gradle dependency extraction
  • Type System: Accounts for Java's static typing and inheritance model

This design ensures that Inspect4j and Inspect4py can potentially be integrated into a unified multi-language analysis suite while maintaining language-specific accuracy.

Requirements

  • Python: 3.8+ (recommended: Python 3.9+)
  • Java Source Code: Local Java files (.java) on your filesystem
  • Dependencies: See requirements.txt for exact package versions

Key Dependencies

javalang>=0.13.0
click>=8.0.0
pathlib
json2html
requests
gitpython
beautifulsoup4

Installation

From Source

  1. Clone the repository:
git clone https://git.ecdf.ed.ac.uk/msc-24-25/inspect4j.git
cd inspect4j
  2. Install required Python packages:
pip install -r requirements.txt
  3. Install the package:
pip install -e .

Using pip

pip install inspect4j

Usage

Command Line Interface

The tool can analyze individual Java files or entire directory structures:

inspect4j -i <FILE.java | DIRECTORY> [OPTIONS]

Basic Examples

Analyze a single Java file:

inspect4j -i MyClass.java

Analyze an entire Java project:

inspect4j -i /path/to/java/project -o analysis_results

Generate comprehensive analysis with all features:

inspect4j -i ./my-java-project \
    -o ./output \
    -r \
    -html \
    -cl \
    -dt \
    -ast \
    -sc \
    -ld \
    -rm \
    -md
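
If the analysis needs to be started from a Python script rather than a shell, the same command can be assembled as an argument list and passed to `subprocess`. This is a sketch under the assumption that the `inspect4j` executable is on the PATH; the helper names `build_inspect4j_command` and `run_inspect4j` are hypothetical and not part of the package.

```python
import subprocess


def build_inspect4j_command(input_path, output_dir, flags=()):
    """Assemble an inspect4j command line as an argument list.

    Hypothetical helper; it only uses options documented in this README.
    """
    command = ["inspect4j", "-i", str(input_path), "-o", str(output_dir)]
    command.extend(flags)
    return command


def run_inspect4j(input_path, output_dir, flags=()):
    """Run inspect4j; raises CalledProcessError if it exits with a non-zero status."""
    command = build_inspect4j_command(input_path, output_dir, flags)
    return subprocess.run(command, check=True)


# Building the command does not require inspect4j to be installed.
print(build_inspect4j_command("./my-java-project", "./output", ["-r", "-cl", "-dt"]))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.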

Command Line Options

Options:
  -i, --input_path TEXT           Input path of the Java file or directory to
                                  inspect. [required]
  -o, --output_dir TEXT           Output directory path to store results. If
                                  the directory does not exist, the tool will
                                  create it. [default: output_dir]
  -ignore_dir, --ignore_dir_pattern TEXT
                                  Ignore directories starting with a certain
                                  pattern. Can be used multiple times.
                                  [default: .git, target, build, bin]
  -ignore_file, --ignore_file_pattern TEXT
                                  Ignore files starting with a certain pattern.
                                  Can be used multiple times. [default: ., _]
  -r, --requirements              Extract Java project dependencies (Maven/Gradle).
  -html, --html_output            Generate HTML visualization of results.
  -cl, --call_list                Generate method call list analysis.
  -dt, --directory_tree           Extract project directory tree structure.
  -ast, --abstract_syntax_tree    Generate Abstract Syntax Tree in JSON format.
  -sc, --source_code              Include source code in AST nodes.
  -ld, --license_detection        Detect project license automatically.
  -rm, --readme                   Extract all README files in the repository.
  -md, --metadata                 Extract GitHub metadata (requires .git folder).
  --help                          Show help message and exit.
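
The help text describes the ignore options as matching names that start with a pattern. The sketch below shows what such prefix-based filtering could look like with `os.walk`. It assumes plain `str.startswith` matching and reuses the defaults listed above; the function name `iter_java_files` is hypothetical and Inspect4j's actual matching rules may differ.

```python
import os


def iter_java_files(root, ignore_dir_prefixes=(".git", "target", "build", "bin"),
                    ignore_file_prefixes=(".", "_")):
    """Yield .java file paths under root, skipping names with an ignored prefix.

    Assumes plain prefix matching; this is an illustration, not the tool's code.
    """
    dir_prefixes = tuple(ignore_dir_prefixes)
    file_prefixes = tuple(ignore_file_prefixes)
    for current_dir, dir_names, file_names in os.walk(root):
        # Pruning dir_names in place stops os.walk from descending into them.
        dir_names[:] = [d for d in dir_names if not d.startswith(dir_prefixes)]
        for name in sorted(file_names):
            if name.endswith(".java") and not name.startswith(file_prefixes):
                yield os.path.join(current_dir, name)
```

With prefix matching, a directory named `build-tools` would also be skipped by the default `build` pattern, which is worth keeping in mind when choosing patterns.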

Output Structure

The tool generates structured output in the specified directory:

output_dir/
├── directory_info.json          # Repository-level analysis
├── call_graph.json             # Method call relationships
├── call_graph.html             # Interactive call graph visualization
├── src/
│   └── main/
│       └── java/
│           └── com/
│               └── example/
│                   └── json_files/
│                       ├── MyClass.json      # Individual class analysis
│                       └── MyInterface.json  # Interface analysis
└── license_info.json           # License detection results
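
Given this layout, the per-file reports can be collected with a short directory walk. The sketch below assumes that every per-file report sits in a folder named `json_files`, as in the tree above; the function name `find_class_reports` is hypothetical.

```python
from pathlib import Path


def find_class_reports(output_dir):
    """Return the per-file JSON reports found in json_files/ folders, sorted by path.

    Assumes the output layout shown in this README.
    """
    return sorted(
        path for path in Path(output_dir).rglob("*.json")
        if path.parent.name == "json_files"
    )
```

Repository-level files such as `directory_info.json` are left out because they do not live in a `json_files` folder.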

JSON Output Format

Each analyzed Java file produces a detailed JSON structure containing:

{
  "file": {
    "path": "/path/to/MyClass.java",
    "fileNameBase": "MyClass",
    "extension": "java"
  },
  "package": {
    "name": "com.example.myproject"
  },
  "dependencies": [...],           // Import statements
  "classes": {
    "MyClass": {
      "name": "MyClass",
      "modifiers": ["public"],
      "extends": "BaseClass",
      "implements": ["MyInterface"],
      "methods": {...},            // Method details
      "fields": {...},             // Field information
      "calls": [...],              // Method call list
      "doc": {...}                 // JavaDoc information
    }
  },
  "interfaces": {...},             // Interface definitions
  "enums": {...}                   // Enum definitions
}
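
A report in this shape can be consumed with the standard `json` module. The following sketch lists the methods of each class. It assumes that `"methods"` is an object keyed by method name, which the abbreviated sample above suggests but does not confirm; the function name `summarize_report` and the trimmed-down report are made up for this example.

```python
import json


def summarize_report(report):
    """Map each class name to the sorted names of its methods.

    Assumes the "classes" -> class name -> "methods" layout of the sample above.
    """
    summary = {}
    for class_name, class_info in report.get("classes", {}).items():
        methods = class_info.get("methods", {})
        summary[class_name] = sorted(methods)
    return summary


# Hypothetical, trimmed-down report in the shape of the sample above.
sample = json.loads("""
{
  "classes": {
    "MyClass": {
      "name": "MyClass",
      "extends": "BaseClass",
      "methods": {"run": {}, "close": {}}
    }
  }
}
""")
print(summarize_report(sample))  # prints {'MyClass': ['close', 'run']}
```

For a real report, replace the inline sample with `json.load` on one of the generated files.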

Testing

The project includes unit tests and sample Java repositories for testing:

# Run basic functionality unit tests
cd inspect4j
python Test/test_java_inspector.py

# Test directory analysis
python -m inspect4j.main -i Test/test_repos/Mines -o mines_analysis -r -cl -dt

Project Structure

inspect4j/
├── inspect4j/
│   ├── __init__.py
│   ├── main.py                  # CLI entry point
│   ├── java_inspector.py        # Core analysis engine
│   ├── java_utils.py           # Utility functions
│   ├── structure_tree.py       # Directory tree extraction
│   └── licenses/               # License templates
├── Test/
│   ├── test_repos/             # Test Java projects
│   ├── test_files/             # Test files for unit tests
│   └── test_java_inspector.py  # Unit tests
├── requirements.txt
├── setup.py
├── README.md
└── LICENSE

License

This project is licensed under the MIT License - see the LICENSE file for details.


This project is part of an MSc dissertation at the University of Edinburgh's EPCC (Edinburgh Parallel Computing Centre).
