Airgapped Informatica PowerCenter automated conversion tool with signed audit trails and governance evidence
Cepheus Benza
Airgapped Informatica PowerCenter conversion tool that produces signed audit trails, equivalence reports, risk scores, and pre-conversion analysis — all as tamper-evident PDFs with embedded JSON attachments.
Cepheus Benza reads a PowerCenter repository XML export and writes production-ready code for your target platform — without connecting to any database or network during conversion. Before converting, the analysis report tells you exactly what can be automated, what requires manual effort, and the complexity tier of each workflow, so you know what you're getting into before committing.
Target platforms
| Platform | Output | Orchestration |
|---|---|---|
| dbt | SQL models, macros, snapshots, sources.yml | Airflow DAG, Dagster job, Prefect flow, Python script |
| PySpark / Databricks | PySpark job scripts, notebook cells | Databricks Workflows |
| Snowpark / Snowflake | Snowpark stored procedures | Snowflake Tasks |
| AWS Glue | Glue PySpark job scripts | Step Functions, Glue Workflows, Airflow DAG |
| Azure Data Factory | Mapping Data Flow JSON, Pipeline JSON | ADF Pipelines |
| Dataform / BigQuery | SQLX models (dbt-compatible) | Dataform schedules |
Key capabilities
| Capability | Details |
|---|---|
| Airgapped operation | Core conversion runs with no network access; reconciliation is optional and separate |
| Pre-conversion analysis | Analysis report shows automation rate, complexity tiers (simple/moderate/complex), and manual session count per workflow — before you convert |
| Signed audit trails | Per-node translation records with confidence levels, rule attribution, and reviewer actions — PDF/A-3 with embedded JSON |
| Equivalence reports | Per-mapping behavioural equivalence assessments with step-level detail |
| Risk scores | Risk classification for each mapping, surfaced in the Workbench heatmap |
| Transformation coverage | 17 Informatica types: Source, Target, Expression, Filter, Lookup, Aggregator, Joiner, Router, Union, Sequence Generator, Normalizer, Rank, Sorter, Mapplet, Stored Procedure, Update Strategy, SCD |
| Expression functions | 111 Informatica PC 10.5 functions translated by formal grammar (Lark + sqlglot) |
| SQL dialect transpilation | Snowflake, BigQuery, Redshift, Databricks, Azure Synapse, generic ANSI SQL |
| SCD support | Type 1, 2, and 3 detection with dbt snapshots for Type 2 |
| Workflow orchestration | Full conversion of Informatica workflows, sessions, worklets, and decision tasks to target orchestration |
| Workbench UI | Local Flask dashboard for reviewing and approving converted mappings |
Requirements
- Python 3.9 or later
- A licensed Cepheus Benza installation (see Licensing below)
Installation
pip install cepheus-benza
Verify the installation:
cepheus-benza --version
Quick start
Each project command (analyze, verify-license, convert, serve, reconcile) takes a spec.toml project file as its first argument. Copy docs/spec.toml as a starting point and fill in the paths for your project. CLI flags override spec values when provided.
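The override rule above (a CLI flag wins over the spec value only when the flag is actually given) can be illustrated with a small sketch. This is not the tool's implementation: the function name `resolve_options` is hypothetical, the option keys simply mirror the `--target`, `--dialect`, and `--orchestrator` flags shown in this README, and the spec values `ansi` and `script` are made up for the example.

```python
from typing import Any, Dict, Optional


def resolve_options(spec: Dict[str, Any],
                    cli_flags: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Return the spec values, replaced by any CLI flag that was provided.

    A flag value of None is treated as "not given on the command line".
    """
    resolved = dict(spec)  # copy so the caller's spec is left untouched
    for name, value in cli_flags.items():
        if value is not None:
            resolved[name] = value
    return resolved


# Hypothetical values for illustration only.
spec_values = {"target": "dbt", "dialect": "ansi", "orchestrator": "script"}
cli_values = {"target": None, "dialect": "snowflake", "orchestrator": "airflow"}

resolved = resolve_options(spec_values, cli_values)
print(resolved)
```

Here `target` keeps its spec value because no flag was passed, while `dialect` and `orchestrator` take the command-line values.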
1. Analyze your exports
cepheus-benza analyze spec.toml
Scans the directory for PowerCenter XML files and produces analysis_report.pdf, which covers the automation rate, per-mapping confidence levels, complexity tiers, and a degradation addendum. The PDF embeds SHA256 checksums of the export files as a JSON attachment. Submit it to benza@cepheus.in to obtain a license file.
2. Verify your license
Add the license path to your spec.toml:
[verify_license]
license = "./license.pdf"
Then verify:
cepheus-benza verify-license spec.toml
The output shows VALID, MISMATCH, or UNLICENSED for each export file. If a file shows MISMATCH, it has been modified since analysis; re-run analyze and request a new license.
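To make the three statuses concrete, here is a minimal sketch of how a checksum comparison could yield them. It assumes the license maps export file names to SHA256 digests, which this README does not confirm; `license_status` is a hypothetical helper, not part of the cepheus-benza CLI or API.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict


def license_status(path: Path, licensed: Dict[str, str]) -> str:
    """Classify one export file against a {file name: sha256} mapping (assumed layout)."""
    expected = licensed.get(path.name)
    if expected is None:
        return "UNLICENSED"  # the license does not mention this file
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return "VALID" if actual == expected else "MISMATCH"


with tempfile.TemporaryDirectory() as tmp:
    export = Path(tmp) / "wf_orders.xml"  # hypothetical export file
    export.write_bytes(b"<POWERMART/>")
    licensed = {"wf_orders.xml": hashlib.sha256(b"<POWERMART/>").hexdigest()}

    status_before = license_status(export, licensed)
    export.write_bytes(b"<POWERMART>edited</POWERMART>")  # modify after "analysis"
    status_after = license_status(export, licensed)
    status_unknown = license_status(export, {})

print(status_before, status_after, status_unknown)
```

Any byte-level change to the export produces a different digest, which is why an edited file has to go through analyze again.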
3. Convert
# dbt (default)
cepheus-benza convert spec.toml
# dbt with Snowflake dialect and Airflow orchestration
cepheus-benza convert spec.toml --dialect snowflake --orchestrator airflow
# PySpark / Databricks
cepheus-benza convert spec.toml --target pyspark
# Snowpark / Snowflake
cepheus-benza convert spec.toml --target snowpark
# AWS Glue
cepheus-benza convert spec.toml --target glue
# Azure Data Factory
cepheus-benza convert spec.toml --target adf
4. Review in the Workbench
cepheus-benza serve spec.toml
Open http://localhost:8080 in your browser. The Workbench shows a dashboard of all converted mappings with risk scores, SQL previews, audit trails, and an approval workflow.
Output structure (dbt target)
dbt_project/
├── dbt_project.yml # dbt project config; var declarations for $$ parameters
├── README.md # Conversion summary (mapping count, model count, warnings)
├── models/
│ ├── sources.yml # Source table definitions
│ ├── <model>.sql # SELECT with {{ source() }} and {{ ref() }} macros
│ └── <model>.yml # Column-level tests and documentation
├── macros/
│ └── <mapplet>.sql # Mapplet logic as reusable dbt macros
├── snapshots/
│ └── <snapshot>.sql # SCD Type 2 snapshot configs
├── stubs/
│ └── <procedure>.py # Stored procedure Python stubs for manual implementation
├── orchestration/
│ └── <workflow>.py # Workflow DAG / script
├── params/
│ ├── dbt_vars.yml # dbt variable declarations for workflow parameters
│ └── env_template.sh # Shell env file template
├── validation/
│ └── <model>_validation.sql # Row-count and aggregate comparison queries
└── reports/
    ├── audit_trail.json # Per-node translation records with confidence levels
    ├── audit_report.pdf # Signed audit trail PDF (embedded JSON attachment)
    ├── equivalence_report.json # Behavioural equivalence assessments
    ├── equivalence_report.pdf # Signed equivalence PDF (embedded JSON attachment)
    └── risk_scores.json # Risk scoring for each translated mapping
Other targets produce equivalent structures: PySpark job scripts, Snowpark stored procedures, Glue job scripts, or ADF JSON definitions — each with the same reports/ directory containing signed audit trails and equivalence reports.
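Because reports/audit_trail.json is plain JSON, it can be summarised with a few lines of Python before opening the Workbench. The sketch below is illustrative only: the record field names `node` and `confidence` are assumptions (the schema is not documented here), the sample records are invented, and treating `approximation` and `stub` as the levels needing review is also an assumption. The five confidence level names are the ones listed for audit_report.pdf.

```python
import json
from collections import Counter
from typing import Dict, List

CONFIDENCE_LEVELS = [
    "exact_equivalent",
    "semantic_equivalent",
    "behavioural_equivalent",
    "approximation",
    "stub",
]


def summarise(records: List[dict]) -> Dict[str, int]:
    """Count records per confidence level, including levels with zero hits."""
    counts = Counter(record["confidence"] for record in records)
    return {level: counts.get(level, 0) for level in CONFIDENCE_LEVELS}


def needs_review(records: List[dict]) -> List[str]:
    """Return node names whose level suggests manual attention (assumed rule)."""
    return [r["node"] for r in records if r["confidence"] in {"approximation", "stub"}]


# Invented sample; a real run would load the file with json.load() instead.
sample = json.loads("""
[
  {"node": "EXP_clean_names", "confidence": "exact_equivalent"},
  {"node": "LKP_customer", "confidence": "approximation"},
  {"node": "SP_audit_log", "confidence": "stub"}
]
""")

summary = summarise(sample)
review = needs_review(sample)
print(summary)
print(review)
```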
Licensing
Cepheus Benza uses a checksum-based license that ties a specific license file to specific export files.
analyze → submit report → receive license → convert
The analyze command produces a PDF containing SHA256 checksums of your export files embedded as a JSON attachment. Cepheus issues a license that approves exactly those files. The convert command validates the checksums before running the pipeline. This prevents accidental conversion of modified or untested exports.
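The fingerprinting step can be pictured as follows. This is a sketch under stated assumptions, not the tool's code: the manifest layout (an `algorithm` field plus a `files` mapping) and the function names are hypothetical, and the real attachment may be structured differently.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA256 so large exports need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(export_dir: Path) -> str:
    """Return a JSON manifest of SHA256 digests for every *.xml file (assumed layout)."""
    files = {p.name: fingerprint(p) for p in sorted(export_dir.glob("*.xml"))}
    return json.dumps({"algorithm": "sha256", "files": files}, indent=2, sort_keys=True)


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "wf_orders.xml").write_bytes(b"<POWERMART/>")  # hypothetical export
    (Path(tmp) / "notes.txt").write_text("ignored: not an XML export")
    manifest = json.loads(build_manifest(Path(tmp)))

print(manifest)
```

Sorting the file names and JSON keys keeps the manifest byte-stable across runs, which matters if the manifest itself is later hashed or signed.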
To produce a black-on-white printable version of any PDF output, set printable = true in [global] or use the --printable flag:
cepheus-benza --printable analyze spec.toml
Governance artifacts
Four governance artifacts are produced across the analyze and convert workflow: three signed PDF reports and a risk-score JSON file. They cannot be disabled.
analysis_report.pdf — produced by analyze, before conversion:
- Automation rate, mapping classification, node confidence breakdown
- Mapping and orchestration complexity tiers (simple / moderate / complex)
- Degradation addendum listing untranslatable nodes and manual sessions
- Embedded analysis_checksums.json with SHA256 fingerprints for licensing
audit_report.pdf — produced by convert:
- Every translation decision: node name, type, mapping, handling rule
- Confidence level: exact_equivalent, semantic_equivalent, behavioural_equivalent, approximation, or stub
- Informatica input snapshot and target output snapshot
- Reviewer action required per node
equivalence_report.pdf — produced by convert:
- Per-mapping behavioural equivalence assessment with step-level detail
risk_scores.json — produced by convert:
- Risk classification for each mapping, surfaced in the Workbench risk heatmap
All PDFs are PDF/A-3 with embedded JSON attachments and HMAC integrity signatures.
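This README does not say which hash function, key, or byte range the HMAC covers, so the following is only a general illustration of how an HMAC makes a JSON payload tamper-evident. It assumes HMAC-SHA256 over a canonical JSON encoding; the function names, the key, and the sample payload are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify_payload(payload: dict, key: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign_payload(payload, key), signature)


key = b"example-key-not-a-real-secret"  # hypothetical key for the demo
record = {"mapping": "m_load_orders", "confidence": "exact_equivalent"}  # invented

signature = sign_payload(record, key)
intact = verify_payload(record, key, signature)

tampered = dict(record, confidence="stub")
still_valid = verify_payload(tampered, key, signature)

print(intact, still_valid)
```

Changing any field of the payload changes the recomputed HMAC, so the stored signature no longer verifies.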
Reconciliation (optional)
The reconciliation engine requires SQLAlchemy and a database driver for your warehouse. Install both before using the reconcile command — run cepheus-benza howto for the full guide with driver packages and connection setup.
After loading your converted output into the target warehouse:
cepheus-benza reconcile spec.toml
Results are written to reports/reconciliation_report.json and reports/reconciliation_report.pdf, and surfaced in the Workbench.
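The idea behind a row-count and aggregate comparison can be shown in miniature. The sketch below uses the standard-library sqlite3 module as a stand-in for a real warehouse connection (the reconciliation engine itself is described as using SQLAlchemy); the table names, column name, and `compare_tables` helper are hypothetical.

```python
import sqlite3
from typing import Dict


def compare_tables(conn: sqlite3.Connection, source: str, target: str,
                   amount_column: str) -> Dict[str, bool]:
    """Compare row counts and a column sum between two tables.

    Identifiers are interpolated directly, so they must come from trusted
    configuration, never from user input.
    """
    query = "SELECT COUNT(*), COALESCE(SUM({col}), 0) FROM {table}"
    src = conn.execute(query.format(col=amount_column, table=source)).fetchone()
    tgt = conn.execute(query.format(col=amount_column, table=target)).fetchone()
    return {"row_count_match": src[0] == tgt[0], "sum_match": src[1] == tgt[1]}


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE legacy_orders (id INTEGER, amount INTEGER);
    CREATE TABLE converted_orders (id INTEGER, amount INTEGER);
    INSERT INTO legacy_orders VALUES (1, 100), (2, 250);
    INSERT INTO converted_orders VALUES (1, 100), (2, 250);
""")

matching = compare_tables(conn, "legacy_orders", "converted_orders", "amount")

# Introduce a discrepancy in the converted output and compare again.
conn.execute("UPDATE converted_orders SET amount = 251 WHERE id = 2")
drifted = compare_tables(conn, "legacy_orders", "converted_orders", "amount")

print(matching, drifted)
```

Matching counts with a differing sum, as in the second comparison, points to a value-level difference rather than missing or duplicated rows.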
Further reading
The complete How-To Guide is bundled with the package. After installation:
cepheus-benza howto
This opens the full guide in your browser — installation, spec.toml configuration, all six target platforms, reconciliation setup, governance artifacts, custom rules, environment variables, and troubleshooting. Works offline.