Skip to main content

PostgreSQL Query Lineage and Performance Analyzer

Project description

PostgreSQL Data Lineage Analyzer

License: MIT Python 3.8+ GitHub release Build Status PyPI version Python Versions

A tool for analyzing query performance and building data lineage graphs from PostgreSQL databases.

Features

  • Identify the most expensive queries in your PostgreSQL database
  • Build data lineage graphs showing how data flows between tables
  • Analyze table usage statistics
  • Get performance insights and optimization recommendations
  • Export results in various formats (CSV, PNG, GraphML)

Requirements

  • Python 3.8+
  • PostgreSQL database with pg_stat_statements extension enabled
  • Required Python packages (installed automatically):
    • Flask
    • psycopg2
    • pandas
    • networkx
    • matplotlib
    • sqlparse
    • waitress

Installation

Option 1: Install from PyPI

pip install pg_lineagelens

Then run the application:

# Basic usage (starts server and opens browser)
pg_lineagelens

# Show help and available options
pg_lineagelens --help

# Run on a specific port
pg_lineagelens --port 8080

# Run on a specific host (e.g., to allow external connections)
pg_lineagelens --host 0.0.0.0

# Run without opening browser automatically
pg_lineagelens --no-browser

# Show version
pg_lineagelens --version

Option 2: Install from source

  1. Clone the repository:
git clone https://github.com/somasays/pg_lineagelens.git
cd pg_lineagelens
  1. Install dependencies:
pip install -r requirements.txt
  1. Run the application:
python app_launcher.py

Setting up pg_stat_statements in PostgreSQL

To use this tool, you need to enable the pg_stat_statements extension in your PostgreSQL database:

  1. Edit your postgresql.conf file and add:

    shared_preload_libraries = 'pg_stat_statements'
    pg_stat_statements.track = all
    
  2. Restart PostgreSQL

  3. Connect to your database and run:

    CREATE EXTENSION pg_stat_statements;
    

Usage

  1. Launch the application
  2. Enter your PostgreSQL connection details
  3. Configure the analysis settings
  4. Run the analysis
  5. Explore the results:
    • View expensive queries
    • Examine table statistics
    • Visualize data lineage
    • Download reports

Building from Source

Requirements

  • Python 3.8+
  • pipenv (optional, for development)

Build and Publish

To build the Python package:

# Install build tools
pip install build twine

# Build both wheel and source distribution
python -m build

# Check the built package
twine check dist/*

# Upload to TestPyPI (optional, for testing)
twine upload --repository-url https://test.pypi.org/legacy/ dist/*

# Upload to PyPI (when ready for release)
twine upload dist/*

Or use the provided script:

# Make script executable
chmod +x build_and_publish.sh

# Run the build script
./build_and_publish.sh

Using Pipenv (Optional)

# Install dependencies
pipenv install
pipenv install --dev

# Run with pipenv
pipenv run python app_launcher.py

Continuous Integration/Continuous Deployment

This project uses GitHub Actions for automated builds and releases.

Automated Package Publishing

Every new tag with version format (v*) triggers the PyPI publishing workflow which:

  • Builds the Python package (source and wheel)
  • Verifies the package contents
  • Publishes the package to PyPI

Creating Releases

  1. Tag a commit with a version number:

    git tag -a v1.0.0 -m "Version 1.0.0"
    git push origin v1.0.0
    
  2. This will trigger the build workflow and automatically:

    • Build and publish the Python package to PyPI
    • Create a GitHub release

Manual Publishing

You can also manually trigger the publishing workflow from the GitHub Actions tab.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • PostgreSQL - The world's most advanced open source database
  • Flask - Web framework for Python
  • NetworkX - Network analysis in Python
  • SQLParse - SQL parser for Python

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pg_lineagelens-1.0.3.tar.gz (55.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pg_lineagelens-1.0.3-py3-none-any.whl (55.0 kB view details)

Uploaded Python 3

File details

Details for the file pg_lineagelens-1.0.3.tar.gz.

File metadata

  • Download URL: pg_lineagelens-1.0.3.tar.gz
  • Upload date:
  • Size: 55.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pg_lineagelens-1.0.3.tar.gz
Algorithm Hash digest
SHA256 f9fe392ee2f982b39062d64cf8512e0ce50a20fea295344e84842d2fd4b38bef
MD5 17885247df703e2f039a26d63b8af27a
BLAKE2b-256 e439bd5eafe1e258198e540c8ffb694c22173d6d7334724f3cbf0a7581fae9bd

See more details on using hashes here.

File details

Details for the file pg_lineagelens-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: pg_lineagelens-1.0.3-py3-none-any.whl
  • Upload date:
  • Size: 55.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pg_lineagelens-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 fe0515f2bd844aea200f26a89f05b8af3ed9cc30f78cb987fc1224d0592b894c
MD5 8a8d9d1e24fe9af566fbe8b8b39b0494
BLAKE2b-256 c57237de99fb9da2cfa040e93fc41ba6b5ef4a1010cac1ed464ee032e0813bd2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page