Static analysis of Python source code using Jedi, CodeQL, and Tree-sitter.

A Python Static Analysis Toolkit (and Library)

A comprehensive static analysis tool for Python source code that provides symbol table generation, call graph analysis, and semantic analysis using Jedi, CodeQL, and Tree-sitter.

Installation

pip install codeanalyzer-python

Prerequisites

  • Python 3.12 or higher

System Package Requirements

The tool creates virtual environments internally using Python's built-in venv module.

Ubuntu/Debian systems:

sudo apt update
sudo apt install python3.12-venv python3-dev build-essential

Fedora/RHEL/CentOS systems:

sudo dnf group install "Development Tools"
sudo dnf install python3-pip python3-venv python3-devel

or on older versions:

sudo yum groupinstall "Development Tools"
sudo yum install python3-pip python3-venv python3-devel

macOS systems:

# Install Xcode Command Line Tools (for compilation)
xcode-select --install

# If using Homebrew Python (recommended)
brew install python@3.12

# If using pyenv (popular Python version manager)
# First ensure pyenv is properly installed and configured
pyenv install 3.12.0  # or latest 3.12.x version
pyenv global 3.12.0   # or pyenv local 3.12.0 for project-specific

# If using the python.org installer, you may need to install certificates
/Applications/Python\ 3.12/Install\ Certificates.command

Note: These packages are required as the tool uses Python's built-in venv module to create isolated environments for analysis.
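
Before installing, it can help to confirm that the interpreter meets the requirements above. The following is a minimal sketch, not part of codeanalyzer: the `preflight_problems` helper and its messages are hypothetical, and it only checks the Python version and whether the standard-library `venv` and `ensurepip` modules can be found.

```python
import importlib.util
import sys

MIN_VERSION = (3, 12)  # taken from the Prerequisites section


def preflight_problems(version_info=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration; codeanalyzer does not ship it.
    """
    # Compare only (major, minor) so that any 3.12.x release passes.
    version = tuple((version_info or sys.version_info)[:2])
    problems = []
    if version < MIN_VERSION:
        problems.append(
            f"Python {version[0]}.{version[1]} found, 3.12 or higher required"
        )
    # On Debian/Ubuntu, a missing ensurepip usually means that the
    # python3-venv package has not been installed yet.
    for module in ("venv", "ensurepip"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"standard-library module '{module}' is missing")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print("problem:", problem)
```

If the script prints nothing, the interpreter that ran it should be able to create virtual environments.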

Usage

codeanalyzer provides a command-line interface for performing static analysis on Python projects.

Basic Usage

codeanalyzer --input /path/to/python/project

Command Line Options

To view the available options and commands, run codeanalyzer --help. You should see output similar to the following:

 codeanalyzer --help

 Usage: codeanalyzer [OPTIONS] COMMAND [ARGS]...

 Static Analysis on Python source code using Jedi, CodeQL and Tree sitter.


╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --input           -i                  PATH            Path to the project root directory. [default: None] [required]     │
│    --output          -o                  PATH            Output directory for artifacts. [default: None]                    │
│    --format          -f                  [json|msgpack]  Output format: json or msgpack. [default: json]                    │
│    --codeql              --no-codeql                     Enable CodeQL-based analysis. [default: no-codeql]                 │
│    --eager               --lazy                          Enable eager or lazy analysis. Defaults to lazy. [default: lazy]   │
│    --cache-dir       -c                  PATH            Directory to store analysis cache. [default: None]                 │
│    --clear-cache         --keep-cache                    Clear cache after analysis. [default: clear-cache]                 │
│                      -v                  INTEGER         Increase verbosity: -v, -vv, -vvv [default: 0]                     │
│    --help                                                Show this message and exit.                                        │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
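
When the tool is driven from a script, the options listed above can be assembled programmatically. The sketch below is one possible wrapper, not an API provided by codeanalyzer: `build_command` and `run_analysis` are hypothetical names, and only flags that appear in the help output are used.

```python
import subprocess


def build_command(project, output=None, fmt="json", codeql=False,
                  eager=False, cache_dir=None, keep_cache=False, verbosity=0):
    """Assemble a codeanalyzer argument list from the documented options."""
    if fmt not in ("json", "msgpack"):
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = ["codeanalyzer", "--input", str(project)]
    if output is not None:
        cmd += ["--output", str(output)]
    if fmt != "json":  # json is the documented default, so it can be omitted
        cmd += ["--format", fmt]
    if codeql:
        cmd.append("--codeql")
    if eager:
        cmd.append("--eager")
    if cache_dir is not None:
        cmd += ["--cache-dir", str(cache_dir)]
    if keep_cache:
        cmd.append("--keep-cache")
    if verbosity > 0:
        cmd.append("-" + "v" * verbosity)  # -v, -vv, -vvv
    return cmd


def run_analysis(project, **options):
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(build_command(project, **options),
                            check=True, capture_output=True, text=True)
    return result.stdout
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before anything is run.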

Examples

  1. Basic analysis with symbol table:

    codeanalyzer --input ./my-python-project
    

    This prints the symbol table to standard output in JSON format. To save the output instead, use the --output option.

    codeanalyzer --input ./my-python-project --output /path/to/analysis-results
    

    Now, you can find the analysis results in analysis.json in the specified directory.

  2. Change output format to msgpack:

    codeanalyzer --input ./my-python-project --output /path/to/analysis-results --format msgpack
    

    This will save the analysis results in analysis.msgpack in the specified directory.

  3. Analysis with CodeQL enabled:

    codeanalyzer --input ./my-python-project --codeql
    

    Every run produces a symbol table and a call graph. By default, edges come from Jedi's lexical analysis. Adding --codeql resolves additional edges (including RPC / third-party / dynamically-dispatched targets) and merges them with the Jedi-derived edges. CodeQL also backfills resolved callees on Jedi-emitted call sites where Jedi couldn't resolve them.

    Note: CodeQL integration is experimental. The CodeQL CLI is downloaded into <cache_dir>/codeql/ on first use and reused thereafter.

  4. Eager analysis with custom cache directory:

    codeanalyzer --input ./my-python-project --eager --cache-dir /path/to/custom-cache
    

    With --eager, the analysis cache is rebuilt on every run; here it is stored in /path/to/custom-cache/.codeanalyzer. By default the cache is cleared after analysis; pass --keep-cache to retain it.

    If --cache-dir is not specified, the cache defaults to .codeanalyzer in the current working directory ($PWD).

  5. Quiet mode (minimal output):

    codeanalyzer --input /path/to/my-python-project --quiet
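
The merge described in example 3 (Jedi-derived edges combined with CodeQL-derived edges, with CodeQL backfilling unresolved callees) can be pictured roughly as follows. This is only an illustrative sketch: the `CallSite` record, its field names, and the `merge_call_sites` helper are hypothetical and do not reflect codeanalyzer's actual data model or implementation.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CallSite:
    """Hypothetical record: a call made by `caller` at `line`."""
    caller: str
    line: int
    callee: Optional[str] = None  # None means "could not be resolved"


def merge_call_sites(jedi_sites, codeql_sites):
    """Union both sources; CodeQL fills in callees that Jedi left unresolved."""
    merged = {(site.caller, site.line): site for site in jedi_sites}
    for site in codeql_sites:
        key = (site.caller, site.line)
        existing = merged.get(key)
        if existing is None:
            # A call site that only CodeQL reported.
            merged[key] = site
        elif existing.callee is None and site.callee is not None:
            # Backfill: keep the Jedi call site, take the CodeQL callee.
            merged[key] = replace(existing, callee=site.callee)
    return sorted(merged.values(), key=lambda s: (s.caller, s.line))
```

In this sketch a callee that Jedi already resolved is never overwritten; whether the real tool prefers one source over the other in that case is not stated in this document.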
    

Output

By default, analysis results are printed to stdout in JSON format. When using the --output option, results are saved to analysis.json in the specified directory. If you use the --format=msgpack option, the results will be saved in analysis.msgpack, which is a binary format that can be more efficient for storage and transmission.
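
To read the saved results back into Python, something like the following may be enough. It is a sketch under assumptions: `load_analysis` is a hypothetical helper, the file names come from the paragraph above, and the msgpack branch assumes the file is a plain MessagePack document readable with the third-party `msgpack` package. The structure of the loaded data is not described here, so the example only returns it as-is.

```python
import json
from pathlib import Path


def load_analysis(output_dir):
    """Load analysis.json or analysis.msgpack from an --output directory."""
    directory = Path(output_dir)
    json_path = directory / "analysis.json"
    if json_path.exists():
        with json_path.open(encoding="utf-8") as handle:
            return json.load(handle)
    msgpack_path = directory / "analysis.msgpack"
    if msgpack_path.exists():
        # Imported lazily so that JSON users do not need the extra package.
        import msgpack  # third-party: pip install msgpack
        with msgpack_path.open("rb") as handle:
            return msgpack.unpack(handle)
    raise FileNotFoundError(f"no analysis file found in {directory}")
```

For example, `load_analysis("/path/to/analysis-results")` would return the parsed contents of whichever of the two files is present, preferring JSON.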

Development

This project uses uv for dependency management during development.

Development Setup

  1. Install uv

  2. Clone the repository:

    git clone https://github.com/codellm-devkit/codeanalyzer-python
    cd codeanalyzer-python
    
  3. Install dependencies using uv:

    uv sync --all-groups
    

    This will install all dependencies including development and test dependencies.

Running from Source

When developing, you can run the tool directly from source:

uv run codeanalyzer --input /path/to/python/project

Running Tests

uv run pytest --pspec -s

Development Dependencies

The project includes additional dependency groups for development:

  • test: pytest and related testing tools
  • dev: development tools like ipdb

Install all groups with:

uv sync --all-groups

Download files

Source Distribution

codeanalyzer_python-0.1.14.tar.gz (49.3 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing: Yes
  • Uploaded via: uv/0.11.14
  • SHA256: 70d58ebcf758d373e441dfbb4f5c43dbb40dbdf51b9ab94b082d0660742de7a8
  • MD5: b3bc2c370c199e01f3f9e9ebb3c81b82
  • BLAKE2b-256: 6839d396c51264c6e73e8cdbf62594f748a198c2a1b4ef2f54fa2d8b67b55eb1

Built Distribution

codeanalyzer_python-0.1.14-py3-none-any.whl (48.9 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing: Yes
  • Uploaded via: uv/0.11.14
  • SHA256: 0b8800e6895264b4fd1f249a2f35c0db9b64b9f4a119f42738922c4e333dfdb6
  • MD5: e08311f55f400b7e9f1a81671d9dd9d7
  • BLAKE2b-256: ae49dae2be8f365780adba909b6481c772981a133e03a7919ae7d3735f2f6854
