Airgapped Informatica PowerCenter automated conversion tool with signed audit trails and governance evidence

Cepheus Benza

Airgapped Informatica PowerCenter conversion tool that produces signed audit trails, equivalence reports, risk scores, and pre-conversion analysis — all as tamper-evident PDFs with embedded JSON attachments.

Cepheus Benza reads a PowerCenter repository XML export and writes production-ready code for your target platform — without connecting to any database or network during conversion. Before converting, the analysis report tells you exactly what can be automated, what requires manual effort, and the complexity tier of each workflow, so you know what you're getting into before committing.


Target platforms

Platform | Output | Orchestration
dbt | SQL models, macros, snapshots, sources.yml | Airflow DAG, Dagster job, Prefect flow, Python script
PySpark / Databricks | PySpark job scripts, notebook cells | Databricks Workflows
Snowpark / Snowflake | Snowpark stored procedures | Snowflake Tasks
AWS Glue | Glue PySpark job scripts | Step Functions, Glue Workflows, Airflow DAG
Azure Data Factory | Mapping Data Flow JSON, Pipeline JSON | ADF Pipelines
Dataform / BigQuery | SQLX models (dbt-compatible) | Dataform schedules

Key capabilities

Capability | Details
Airgapped operation | Core conversion runs with no network access; reconciliation is optional and separate
Pre-conversion analysis | Analysis report shows automation rate, complexity tiers (simple/moderate/complex), and manual session count per workflow, before you convert
Signed audit trails | Per-node translation records with confidence levels, rule attribution, and reviewer actions, delivered as PDF/A-3 with embedded JSON
Equivalence reports | Per-mapping behavioural equivalence assessments with step-level detail
Risk scores | Risk classification for each mapping, surfaced in the Workbench heatmap
Transformation coverage | 17 Informatica types: Source, Target, Expression, Filter, Lookup, Aggregator, Joiner, Router, Union, Sequence Generator, Normalizer, Rank, Sorter, Mapplet, Stored Procedure, Update Strategy, SCD
Expression functions | 111 Informatica PC 10.5 functions translated by formal grammar (Lark + sqlglot)
SQL dialect transpilation | Snowflake, BigQuery, Redshift, Databricks, Azure Synapse, generic ANSI SQL
SCD support | Type 1, 2, and 3 detection with dbt snapshots for Type 2
Workflow orchestration | Full conversion of Informatica workflows, sessions, worklets, and decision tasks to target orchestration
Workbench UI | Local Flask dashboard for reviewing and approving converted mappings

Requirements

  • Python 3.9 or later
  • A licensed Cepheus Benza installation (see Licensing below)

Installation

pip install cepheus-benza

Verify the installation:

cepheus-benza --version

Quick start

Every command takes a spec.toml project file as its first argument. Copy docs/spec.toml as a starting point and fill in the paths for your project. CLI flags override spec values when provided.
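
The override rule can be pictured as a merge in which any flag actually passed on the command line wins over the matching spec value. The sketch below is illustrative only: the function name `resolve_options` and the option keys are hypothetical, and the tool's real precedence logic is not documented here.

```python
from typing import Any, Dict, Optional


def resolve_options(spec_values: Dict[str, Any],
                    cli_flags: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Merge spec values with CLI flags; a flag that was passed (not None) wins."""
    resolved = dict(spec_values)
    for name, value in cli_flags.items():
        if value is not None:  # None stands for "flag not given on the command line"
            resolved[name] = value
    return resolved


# Hypothetical values: the spec sets both options, the command line overrides one.
spec = {"target": "dbt", "dialect": "ansi"}
flags = {"target": None, "dialect": "snowflake"}
print(resolve_options(spec, flags))
```

Treating "not given" as `None` keeps a deliberately passed flag distinguishable from an absent one, which is what lets the spec supply the defaults.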

1. Analyze your exports

cepheus-benza analyze spec.toml

Scans the directory given in spec.toml for PowerCenter XML files and produces analysis_report.pdf, which covers the automation rate, per-mapping confidence levels, complexity tiers, and a degradation addendum. The PDF embeds SHA256 checksums of the export files as a JSON attachment. Submit it to benza@cepheus.in to obtain a license file.

2. Verify your license

Add the license path to your spec.toml:

[verify_license]
license = "./license.pdf"

Then verify:

cepheus-benza verify-license spec.toml

The output shows VALID, MISMATCH, or UNLICENSED for each export file. If a file shows MISMATCH, it was modified since analysis — re-run analyze and request a new license.

3. Convert

# dbt (default)
cepheus-benza convert spec.toml

# dbt with Snowflake dialect and Airflow orchestration
cepheus-benza convert spec.toml --dialect snowflake --orchestrator airflow

# PySpark / Databricks
cepheus-benza convert spec.toml --target pyspark

# Snowpark / Snowflake
cepheus-benza convert spec.toml --target snowpark

# AWS Glue
cepheus-benza convert spec.toml --target glue

# Azure Data Factory
cepheus-benza convert spec.toml --target adf
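
When conversions are scripted, the commands above can be assembled programmatically. The following is a minimal sketch that uses only the `--target`, `--dialect`, and `--orchestrator` flags shown above; the helper name `build_convert_command` is hypothetical and not part of the package.

```python
import shlex
from typing import List, Optional


def build_convert_command(spec_path: str,
                          target: Optional[str] = None,
                          dialect: Optional[str] = None,
                          orchestrator: Optional[str] = None) -> List[str]:
    """Build an argument list for `cepheus-benza convert`, adding only the flags supplied."""
    command = ["cepheus-benza", "convert", spec_path]
    if target is not None:
        command += ["--target", target]
    if dialect is not None:
        command += ["--dialect", dialect]
    if orchestrator is not None:
        command += ["--orchestrator", orchestrator]
    return command


# Print one command per target; None leaves the target to the default (dbt).
for target in (None, "pyspark", "snowpark", "glue", "adf"):
    print(shlex.join(build_convert_command("spec.toml", target=target)))
```

On a machine with the tool installed and licensed, the returned list could be passed to `subprocess.run(command, check=True)`; building an argument list rather than a shell string avoids quoting problems with paths that contain spaces.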

4. Review in the Workbench

cepheus-benza serve spec.toml

Open http://localhost:8080 in your browser. The Workbench shows a dashboard of all converted mappings with risk scores, SQL previews, audit trails, and an approval workflow.


Output structure (dbt target)

dbt_project/
├── dbt_project.yml          # dbt project config; var declarations for $$ parameters
├── README.md                # Conversion summary (mapping count, model count, warnings)
├── models/
│   ├── sources.yml          # Source table definitions
│   ├── <model>.sql          # SELECT with {{ source() }} and {{ ref() }} macros
│   └── <model>.yml          # Column-level tests and documentation
├── macros/
│   └── <mapplet>.sql        # Mapplet logic as reusable dbt macros
├── snapshots/
│   └── <snapshot>.sql       # SCD Type 2 snapshot configs
├── stubs/
│   └── <procedure>.py       # Stored procedure Python stubs for manual implementation
├── orchestration/
│   └── <workflow>.py        # Workflow DAG / script
├── params/
│   ├── dbt_vars.yml         # dbt variable declarations for workflow parameters
│   └── env_template.sh      # Shell env file template
├── validation/
│   └── <model>_validation.sql   # Row-count and aggregate comparison queries
└── reports/
    ├── audit_trail.json         # Per-node translation records with confidence levels
    ├── audit_report.pdf         # Signed audit trail PDF (embedded JSON attachment)
    ├── equivalence_report.json  # Behavioural equivalence assessments
    ├── equivalence_report.pdf   # Signed equivalence PDF (embedded JSON attachment)
    └── risk_scores.json         # Risk scoring for each translated mapping

Other targets produce equivalent structures: PySpark job scripts, Snowpark stored procedures, Glue job scripts, or ADF JSON definitions — each with the same reports/ directory containing signed audit trails and equivalence reports.
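
A quick post-conversion check can confirm that the reports/ directory contains the files listed in the tree above. This is a sketch under the assumption that the file names match that listing exactly; `missing_reports` is a hypothetical helper, not a command shipped with the tool.

```python
from pathlib import Path
from typing import List

# File names taken from the reports/ listing in the dbt output structure.
EXPECTED_REPORTS = (
    "audit_trail.json",
    "audit_report.pdf",
    "equivalence_report.json",
    "equivalence_report.pdf",
    "risk_scores.json",
)


def missing_reports(project_dir: str) -> List[str]:
    """Return the expected report files that are absent from <project_dir>/reports."""
    reports_dir = Path(project_dir) / "reports"
    return [name for name in EXPECTED_REPORTS if not (reports_dir / name).is_file()]


if __name__ == "__main__":
    absent = missing_reports("dbt_project")
    print("all reports present" if not absent else f"missing: {absent}")
```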


Licensing

Cepheus Benza uses a checksum-based license that ties a specific license file to specific export files.

analyze  →  submit report  →  receive license  →  convert

The analyze command produces a PDF containing SHA256 checksums of your export files embedded as a JSON attachment. Cepheus issues a license that approves exactly those files. The convert command validates the checksums before running the pipeline. This prevents accidental conversion of modified or untested exports.
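
The idea behind checksum-bound licensing can be sketched in a few lines. The example below is not the tool's implementation: it assumes the licensed checksums are available as a plain mapping of file name to SHA-256 hex digest (a hypothetical layout), and it interprets UNLICENSED as "file not covered by the license", which is an assumption about the status reported by verify-license.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_exports(export_dir: str, licensed: Dict[str, str]) -> Dict[str, str]:
    """Label each XML export VALID, MISMATCH, or UNLICENSED against licensed checksums."""
    statuses = {}
    for path in sorted(Path(export_dir).glob("*.xml")):
        expected = licensed.get(path.name)
        if expected is None:
            statuses[path.name] = "UNLICENSED"   # assumed meaning: not in the license
        elif expected == sha256_of(path):
            statuses[path.name] = "VALID"
        else:
            statuses[path.name] = "MISMATCH"     # file changed since it was analyzed
    return statuses
```

Because a SHA-256 digest changes with any byte-level edit, even a whitespace change to an export would surface as MISMATCH in this sketch.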

To produce a black-on-white printable version of any PDF output, set printable = true in [global] or use the --printable flag:

cepheus-benza --printable analyze spec.toml

Governance artifacts

Four governance artifacts are produced across the analyze and convert workflow: three signed PDF reports and one JSON file of risk scores. They cannot be disabled.

analysis_report.pdf — produced by analyze, before conversion:

  • Automation rate, mapping classification, node confidence breakdown
  • Mapping and orchestration complexity tiers (simple / moderate / complex)
  • Degradation addendum listing untranslatable nodes and manual sessions
  • Embedded analysis_checksums.json with SHA256 fingerprints for licensing

audit_report.pdf — produced by convert:

  • Every translation decision: node name, type, mapping, handling rule
  • Confidence level: exact_equivalent, semantic_equivalent, behavioural_equivalent, approximation, or stub
  • Informatica input snapshot and target output snapshot
  • Reviewer action required per node

equivalence_report.pdf — produced by convert:

  • Per-mapping behavioural equivalence assessment with step-level detail

risk_scores.json — produced by convert:

  • Risk classification for each mapping, surfaced in the Workbench risk heatmap

All PDFs are PDF/A-3 with embedded JSON attachments and HMAC integrity signatures.
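
To show what an HMAC integrity signature provides, here is a minimal sketch using Python's standard library. The source states only that the PDFs carry HMAC signatures; the choice of HMAC-SHA256, the canonical JSON encoding, the key, and the record fields below are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify_payload(payload: dict, key: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_payload(payload, key), tag)


# Hypothetical audit record and key, used only to demonstrate tamper detection.
key = b"demo-key-not-for-production"
record = {"node": "EXP_CALC", "confidence": "exact_equivalent"}
tag = sign_payload(record, key)

tampered = dict(record, confidence="stub")
print(verify_payload(record, key, tag))    # original record verifies
print(verify_payload(tampered, key, tag))  # edited record does not
```

Sorting keys and fixing separators before signing matters because two JSON serializations of the same data can differ byte for byte, and an HMAC is computed over bytes.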


Reconciliation (optional)

The reconciliation engine requires SQLAlchemy and a database driver for your warehouse. Install both before using the reconcile command — run cepheus-benza howto for the full guide with driver packages and connection setup.

After loading your converted output into the target warehouse:

cepheus-benza reconcile spec.toml

Results are written to reports/reconciliation_report.json and reports/reconciliation_report.pdf, and surfaced in the Workbench.
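
The source does not describe how the reconciliation engine compares data. As a loose illustration of the row-count and aggregate comparisons mentioned for the validation/ directory, the sketch below uses the standard-library sqlite3 module in place of SQLAlchemy and a warehouse driver; the table and column names are hypothetical.

```python
import sqlite3


def profile_table(conn: sqlite3.Connection, table: str, numeric_column: str) -> dict:
    """Return the row count and the column total for one table."""
    # Identifiers are interpolated into the SQL text, so pass only trusted names.
    query = f'SELECT COUNT(*), COALESCE(SUM("{numeric_column}"), 0) FROM "{table}"'
    row_count, total = conn.execute(query).fetchone()
    return {"row_count": row_count, "total": total}


def tables_match(conn: sqlite3.Connection, legacy_table: str,
                 converted_table: str, numeric_column: str) -> bool:
    """Compare row count and column total between the legacy and converted tables."""
    return (profile_table(conn, legacy_table, numeric_column)
            == profile_table(conn, converted_table, numeric_column))


# In-memory demo data standing in for legacy and converted outputs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE legacy_orders (id INTEGER, amount INTEGER);
    CREATE TABLE converted_orders (id INTEGER, amount INTEGER);
    INSERT INTO legacy_orders VALUES (1, 10), (2, 20);
    INSERT INTO converted_orders VALUES (1, 10), (2, 20);
""")
print(tables_match(conn, "legacy_orders", "converted_orders", "amount"))
```

Matching counts and totals are a necessary condition for equivalence, not a sufficient one; two tables can agree on both while differing row by row.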


Further reading

The complete How-To Guide is bundled with the package. After installation:

cepheus-benza howto

This opens the full guide in your browser — installation, spec.toml configuration, all six target platforms, reconciliation setup, governance artifacts, custom rules, environment variables, and troubleshooting. Works offline.


Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

cepheus_benza-5.3.98-cp39-abi3-win_amd64.whl (5.1 MB)

Uploaded: CPython 3.9+, Windows x86-64

cepheus_benza-5.3.98-cp39-abi3-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl (29.3 MB)

Uploaded: CPython 3.9+, manylinux (glibc 2.17+, x86-64), manylinux (glibc 2.5+, x86-64)

cepheus_benza-5.3.98-cp39-abi3-macosx_10_9_universal2.whl (11.0 MB)

Uploaded: CPython 3.9+, macOS 10.9+ universal2 (ARM64, x86-64)

File details

Details for the file cepheus_benza-5.3.98-cp39-abi3-win_amd64.whl.

File hashes

Hashes for cepheus_benza-5.3.98-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 052fe207e985ab83530cdf7f3a8bcdb1a84ea7165688176c59e012dd0912b942
MD5 476f73902d1fb43cd61eb5699985051b
BLAKE2b-256 0ea87aac91ca9b5e8592a48dc5649eb4e5a2993ffded8a3d94ed04fc9713e8b4
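
A downloaded wheel can be checked against the published SHA256 digest before it is carried into an airgapped environment. The helper below is a small sketch using only the standard library; `wheel_matches` is a hypothetical name, and the file is read in one piece, which is adequate for wheels of the sizes listed here.

```python
import hashlib
from pathlib import Path


def wheel_matches(wheel_path: str, published_sha256: str) -> bool:
    """Return True when the file's SHA-256 digest equals the published hex digest."""
    actual = hashlib.sha256(Path(wheel_path).read_bytes()).hexdigest()
    # Normalize the published value so stray whitespace or upper case does not matter.
    return actual == published_sha256.strip().lower()
```

For the Windows wheel, the call would take the local path to cepheus_benza-5.3.98-cp39-abi3-win_amd64.whl and the SHA256 value from the listing above. pip can enforce the same check at install time through its hash-checking mode (`--require-hashes` with a requirements file).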

Provenance

The following attestation bundles were made for cepheus_benza-5.3.98-cp39-abi3-win_amd64.whl:

Publisher: release.yml on cepheus-engg/benza

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cepheus_benza-5.3.98-cp39-abi3-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl.

File hashes

Hashes for cepheus_benza-5.3.98-cp39-abi3-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl
Algorithm Hash digest
SHA256 cc78355ec1e91e2cdd21593adf8998fed317150ec59bb4d7ebc8710d65911a00
MD5 8c8d2016cfacd13fae64b36b6ca5078f
BLAKE2b-256 b33d23c7191ecc539d33e97d6a1fd22ff3905cff1cd43c4a6cc46ad37b4f8499

Provenance

The following attestation bundles were made for cepheus_benza-5.3.98-cp39-abi3-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl:

Publisher: release.yml on cepheus-engg/benza

File details

Details for the file cepheus_benza-5.3.98-cp39-abi3-macosx_10_9_universal2.whl.

File hashes

Hashes for cepheus_benza-5.3.98-cp39-abi3-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 140904b4fc78df8f4c6c7e913699c61dffdf1439a6a950e5e69237e84716b7b1
MD5 6d39fd8e1d391379101496fb8bc14cca
BLAKE2b-256 d4848ac6a0d5d5b22b7fd62df7f46776ddea0db489c2f5aa96fb68f3e7fa3ceb

Provenance

The following attestation bundles were made for cepheus_benza-5.3.98-cp39-abi3-macosx_10_9_universal2.whl:

Publisher: release.yml on cepheus-engg/benza
