Python library for DPM-XL data processing and analysis

pyDPM

A Python library for processing DPM (Data Point Model) expressions and working with regulatory reporting data.

Overview

pyDPM provides two main areas of functionality:

  • DPM-XL Processing: Parse, validate, and generate ASTs for DPM-XL expressions
  • DPM Utilities: Work with DPM databases, explore data dictionaries, and manage operation scopes

Installation

Using Poetry (Recommended)

poetry install

Using pip

pip install .

Architecture

py_dpm/
├── api/              # Public APIs
│   ├── dpm_xl/      # DPM-XL expression APIs
│   └── dpm/         # General DPM APIs
├── dpm_xl/          # DPM-XL Processing Engine
│   ├── grammar/     # ANTLR grammar
│   ├── ast/         # AST generation
│   ├── operators/   # Expression operators
│   ├── types/       # Type system
│   ├── validation/  # Validation logic
│   └── utils/       # DPM-XL utilities
├── dpm/             # General DPM Core
│   ├── db/          # Database models & views
│   ├── scopes/      # Operation scopes
│   └── explorer/    # Data dictionary explorer
├── cli/             # Command-line interface
├── exceptions/      # Custom exceptions
└── utils/           # Shared utilities

Database Configuration

pyDPM supports multiple database backends. The connection method is chosen in the following order of precedence, and the first rule that matches wins:

  1. Explicit Argument: Passing a connection URL or path directly in Python code overrides all configuration.
  2. Unified RDBMS Configuration: If PYDPM_RDBMS and the PYDPM_DB_* variables are set, it connects to the configured server database.
  3. Legacy PostgreSQL: If USE_POSTGRES=true in .env, it connects to the configured Postgres server.
  4. SQLite: If USE_SQLITE=true (default), it connects to a local SQLite file.
  5. SQL Server: Legacy fallback if no other option is selected.
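The precedence above can be sketched as a small decision function. This is an illustration of the documented order only, not pyDPM's actual implementation: the function name `resolve_backend` and the returned labels are hypothetical, and for brevity the unified rule checks only `PYDPM_RDBMS` rather than every `PYDPM_DB_*` variable.

```python
import os


def resolve_backend(explicit=None, env=None):
    """Return a label for the backend picked by the documented precedence.

    `resolve_backend` and its return labels are hypothetical names used for
    illustration; they are not part of the pyDPM API.
    """
    env = os.environ if env is None else env

    def flag(name, default="false"):
        # Treat the variable as a boolean switch written as "true"/"false".
        return str(env.get(name, default)).strip().lower() == "true"

    if explicit:                             # 1. explicit URL or path
        return "explicit"
    if env.get("PYDPM_RDBMS"):               # 2. unified RDBMS configuration
        return f"unified:{env['PYDPM_RDBMS']}"
    if flag("USE_POSTGRES"):                 # 3. legacy PostgreSQL
        return "legacy-postgres"
    if flag("USE_SQLITE", default="true"):   # 4. SQLite, enabled by default
        return "sqlite"
    return "sqlserver"                       # 5. legacy fallback
```

Passing a plain dictionary as `env` makes it easy to check how a given `.env` combination would be resolved without touching the real environment.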

Environment Variables (.env)

Configure your database connection in the .env file:

# --- Option 1: SQLite (Default) ---
USE_SQLITE=true
SQLITE_DB_PATH=database.db

# --- Option 2: Unified server database (recommended) ---
# PYDPM_RDBMS=postgres        # or "sqlserver"
# PYDPM_DB_HOST=localhost
# PYDPM_DB_PORT=5432          # defaults: 5432 for postgres, 1433 for sqlserver
# PYDPM_DB_NAME=dpm_db
# PYDPM_DB_USER=myuser
# PYDPM_DB_PASSWORD=mypassword

# --- Option 3: Legacy PostgreSQL (backward compatible) ---
# USE_POSTGRES=true
# POSTGRES_HOST=localhost
# POSTGRES_PORT=5432
# POSTGRES_DB=dpm_db
# POSTGRES_USER=myuser
# POSTGRES_PASS=mypassword
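As a rough sketch of how the unified variables could map to a connection URL of the kind accepted by the `connection_url` argument, the helper below assembles one by hand. The function `build_connection_url` is hypothetical, and the URL schemes are SQLAlchemy-style assumptions rather than values confirmed by pyDPM. Only the default ports come from the configuration above.

```python
from urllib.parse import quote

# Default ports as documented in the .env example.
DEFAULT_PORTS = {"postgres": 5432, "sqlserver": 1433}
# Assumed SQLAlchemy-style schemes; a real SQL Server URL usually also
# needs a driver query parameter, which is omitted here.
SCHEMES = {"postgres": "postgresql", "sqlserver": "mssql+pyodbc"}


def build_connection_url(env):
    """Build a connection URL from PYDPM_* settings (hypothetical helper)."""
    rdbms = env["PYDPM_RDBMS"].strip().lower()
    if rdbms not in DEFAULT_PORTS:
        raise ValueError(f"Unsupported PYDPM_RDBMS value: {rdbms!r}")

    # Fall back to the documented default port when none is given.
    port = int(env.get("PYDPM_DB_PORT") or DEFAULT_PORTS[rdbms])
    # Percent-encode credentials so characters like "@" do not break the URL.
    user = quote(env["PYDPM_DB_USER"], safe="")
    password = quote(env["PYDPM_DB_PASSWORD"], safe="")

    return (
        f"{SCHEMES[rdbms]}://{user}:{password}"
        f"@{env['PYDPM_DB_HOST']}:{port}/{env['PYDPM_DB_NAME']}"
    )
```

Percent-encoding the credentials matters in practice: a password containing `@` or `/` would otherwise be parsed as part of the host or path.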

Usage

Command Line Interface

Migrate Access Database

poetry run pydpm migrate-access ./path-to-release.accdb

Syntax Validation

poetry run pydpm syntax "{tT_01.00, r0010, c0010}"

Semantic Validation

poetry run pydpm semantic "{tT_01.00, r0010, c0010}"

Python API

DPM-XL Expression Processing

from py_dpm.api import SyntaxAPI, SemanticAPI, ASTGenerator

# Syntax validation
syntax_api = SyntaxAPI()
is_valid = syntax_api.is_valid_syntax("{tT_01.00, r0010, c0010}")
print(f"Valid syntax: {is_valid}")

# Get detailed syntax errors
errors = syntax_api.validate_syntax("invalid expression")
for error in errors:
    print(f"Line {error.line}, Col {error.column}: {error.message}")

# Generate AST
ast_gen = ASTGenerator()
ast = ast_gen.generate("{tT_01.00, r0010, c0010}")
print(f"AST: {ast}")

# Semantic validation (requires database)
semantic_api = SemanticAPI()
result = semantic_api.validate("{tT_01.00, r0010, c0010}", release_id=123)
if result.is_valid:
    print("Semantically valid!")
else:
    for error in result.errors:
        print(f"Error: {error.message}")

Working with DPM Database

from py_dpm.api import DataDictionaryAPI, DPMExplorer

# Query data dictionary
dd_api = DataDictionaryAPI()

# Get all tables
tables = dd_api.get_all_tables(release_id=123)
for table in tables:
    print(f"Table: {table.code} - {table.name}")

# Get table details
table = dd_api.get_table_by_code("T_01.00", release_id=123)
print(f"Table headers: {len(table.headers)}")

# Explore database structure
explorer = DPMExplorer()
modules = explorer.get_modules(release_id=123)
for module in modules:
    print(f"Module: {module.code}")

Operation Scopes

from py_dpm.api import OperationScopesAPI, calculate_scopes_from_expression

# Calculate scopes for an expression
scopes_api = OperationScopesAPI()
result = calculate_scopes_from_expression(
    expression="{tT_01.00, r0010, c0010}",
    release_id=123
)

if result.success:
    for scope in result.scopes:
        print(f"Table: {scope.table_code}")
        print(f"Headers: {[h.header_code for h in scope.headers]}")
else:
    print(f"Error: {result.error}")

Validations Script Generation

from py_dpm.api import generate_validations_script

# Generate engine-ready validations script
result = generate_validations_script(
    "{tT_01.00, r0010, c0010}",
    database_path="data.db",
    release_code="4.2"
)
if result["success"]:
    print(f"Enriched AST: {result['enriched_ast']}")
else:
    print(f"Error: {result['error']}")

Migration

from py_dpm.api import MigrationAPI

# Migrate an Access database into the database managed through SQLAlchemy
migration_api = MigrationAPI()
migration_api.migrate_from_access(
    access_db_path="./release.accdb",
    release_id=123
)

XBRL-CSV Instance Generation

from py_dpm.api import InstanceAPI

api = InstanceAPI()

# Build package from dictionary
data = {
    "module_code": "F_01.01",
    "parameters": {"refPeriod": "2024-12-31"},
    "facts": [
        {"table_code": "t001", "row_code": "r010", "column_code": "c010", "value": 1000000}
    ]
}
output_path = api.build_package_from_dict(data, "/tmp/output")

# Build package from JSON file
output_path = api.build_package_from_json("instance_data.json", "/tmp/output")
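The input file for `build_package_from_json` could be produced with the standard `json` module. The sketch below assumes the JSON file mirrors the dictionary structure shown above, which the source does not state explicitly. The helper `write_instance_json` is a hypothetical name, not part of pyDPM.

```python
import json
from pathlib import Path


def write_instance_json(data, path):
    """Serialize instance data to a JSON file and return its path.

    Hypothetical helper: it assumes the JSON layout matches the dictionary
    passed to build_package_from_dict.
    """
    path = Path(path)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    data = {
        "module_code": "F_01.01",
        "parameters": {"refPeriod": "2024-12-31"},
        "facts": [
            {"table_code": "t001", "row_code": "r010",
             "column_code": "c010", "value": 1000000}
        ],
    }
    print(write_instance_json(data, "instance_data.json"))
```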

DPM Explorer - Introspection Queries

from py_dpm.api import ExplorerQueryAPI

with ExplorerQueryAPI() as api:
    # Find all properties using a specific item
    properties = api.get_properties_using_item("EUR")

    # Get module URL for documentation
    url = api.get_module_url(module_code="F_01.01")

    # Explore variable usage
    tables = api.get_tables_using_variable(variable_code="mi123")

Hierarchical Queries

from py_dpm.api import HierarchicalQueryAPI

with HierarchicalQueryAPI() as api:
    # Get hierarchy for a domain
    hierarchy = api.get_hierarchy(domain_code="DOM_001")

    # Navigate parent-child relationships
    children = api.get_children(item_code="PARENT_001")

    # Get all ancestors
    ancestors = api.get_ancestors(item_code="LEAF_001")

Development

Running Tests

poetry run pytest

Code Structure

Important Notes

ANTLR Version

This project uses ANTLR 4.9.2. Always run Python scripts using Poetry to ensure the correct runtime version:

poetry run python your_script.py  # ✅ Correct
python your_script.py             # ❌ May use wrong ANTLR version
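To see which ANTLR runtime the active interpreter would load, the installed distribution version can be read with the standard library's `importlib.metadata`. This is a minimal sketch: the function name is hypothetical, and it assumes the runtime is installed under the distribution name `antlr4-python3-runtime`.

```python
from importlib import metadata

EXPECTED_ANTLR = "4.9.2"  # version stated in this README


def check_antlr_runtime(expected=EXPECTED_ANTLR,
                        dist_name="antlr4-python3-runtime"):
    """Return (matches, installed_version) for the ANTLR runtime.

    installed_version is None when the distribution is not installed.
    Hypothetical helper, not part of pyDPM.
    """
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False, None
    return installed == expected, installed


if __name__ == "__main__":
    ok, installed = check_antlr_runtime()
    print(f"ANTLR runtime: {installed} (expected {EXPECTED_ANTLR}, ok={ok})")
```

Running this once with `poetry run python` and once with plain `python` shows whether the two environments resolve to the same runtime.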

Database Sessions

When using database APIs without explicit connection configuration, pyDPM uses global session management. For concurrent usage or testing, pass explicit database paths or connection URLs:

api = DataDictionaryAPI(database_path="./test.db")
# or
api = DataDictionaryAPI(connection_url="postgresql://user:pass@localhost/db")

License

[Add your license information here]

This version

0.3.2
