SAP <-> Salesforce Account Data Reconciliation Utility

Reconciles SAP and Salesforce Account master data at bulk scale (300K–400K records). It produces a 10-tab Excel workbook and an HTML dashboard with KPIs, field-level diffs, fuzzy match candidates, and a prioritised action plan.

Quick Start

Installation

# Install from PyPI
py -m pip install phani-data-recon

# Upgrade to the latest version
py -m pip install --upgrade phani-data-recon

# Verify installed version
py -m pip show phani-data-recon

Run the Application

Using the CLI command:

# Verify the CLI is available
reconcile-accounts --help

# Run with explicit SAP and Salesforce input files
reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv

Using the Python module (alternative if the CLI is not on PATH):

py -m phani_data_recon.cli --sap input/sap_accounts.csv --sf input/sf_accounts.csv

Configuration & Input/Output

Configure input and output folders by editing config/rules.yaml to set default paths:

input:
  sap:
    directory: "input"
    file_name: "sap_accounts.csv"
  sf:
    directory: "input"
    file_name: "sf_accounts.csv"

output:
  formats: ["excel", "html"]
  report:
    directory: "output"
    file_name: "reconciliation_report"

Then run with config:

reconcile-accounts --config config/rules.yaml
py -m phani_data_recon.cli --config config/rules.yaml

Override output directory at runtime:

reconcile-accounts --config config/rules.yaml --output-dir output/custom_run
py -m phani_data_recon.cli --config config/rules.yaml --output-dir output/custom_run

Validation & Advanced Options

Validate headers and config only (dry-run):

reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv --dry-run
py -m phani_data_recon.cli --sap input/sap_accounts.csv --sf input/sf_accounts.csv --dry-run
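
As a rough illustration of what a header check can involve, the sketch below reads only the first row of a CSV and reports which required columns are missing. It is not the package's implementation: the function name missing_headers is hypothetical, and SAP_Unique_ID is simply the SAP join column from the config reference in this README.

```python
import csv
import io


def missing_headers(csv_file, required):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [col for col in required if col not in present]


# In-memory sample; a real check would use open(path, newline="") instead.
sample = io.StringIO("SAP_Unique_ID,Name,City\n1001,Acme,Berlin\n")
print(missing_headers(sample, ["SAP_Unique_ID", "Country"]))  # ['Country']
```

Only the header row is read, so a check like this stays cheap even for files with several hundred thousand records.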

Generate only HTML output:

reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv --formats html
py -m phani_data_recon.cli --sap input/sap_accounts.csv --sf input/sf_accounts.csv --formats html

Skip fuzzy matching (faster for large files):

reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv --no-fuzzy
py -m phani_data_recon.cli --sap input/sap_accounts.csv --sf input/sf_accounts.csv --no-fuzzy

Enable verbose logging:

reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv --verbose
py -m phani_data_recon.cli --sap input/sap_accounts.csv --sf input/sf_accounts.csv --verbose

Platform-specific path examples:

Windows:

reconcile-accounts --sap .\input\sap_accounts.csv --sf .\input\sf_accounts.csv
py -m phani_data_recon.cli --config .\config\rules.yaml --dry-run

macOS:

reconcile-accounts --sap ./input/sap_accounts.csv --sf ./input/sf_accounts.csv
python3 -m phani_data_recon.cli --config ./config/rules.yaml --dry-run

If Windows cmd does not recognize reconcile-accounts, add your Python Scripts directory to PATH and reopen cmd. The path below comes from one particular machine; substitute your own Scripts directory:

setx PATH "%PATH%;C:\Users\SeshaphaniBysani\AppData\Local\Python\pythoncore-3.14-64\Scripts"

Then verify:

where reconcile-accounts
reconcile-accounts --help
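
If you are unsure where your Scripts directory is, Python can report it. The snippet below is a small sketch using only the standard library. Which of the two directories applies depends on how the package was installed (a regular install or pip install --user).

```python
import os
import shutil
import sysconfig

# Scripts directory for the interpreter running this snippet
print(sysconfig.get_path("scripts"))

# Scripts directory used by per-user installs (pip install --user)
print(sysconfig.get_path("scripts", f"{os.name}_user"))

# Full path of the console script if it is already on PATH, otherwise None
print(shutil.which("reconcile-accounts"))
```

Run it with the same interpreter you used for pip (for example py on Windows) so that the reported directories match the installation.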

For local development in this repository, an editable install still works:

py -m pip install -e .

Package Usage

# Explicit input files
reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv

# Config-driven execution
reconcile-accounts --config config/rules.yaml

# Override only the output directory
reconcile-accounts --config config/rules.yaml --output-dir output/run_2026_05_11

# Generate only HTML output
reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv --formats html

If the console script is not available on your PATH, use:

py -m phani_data_recon.cli --dry-run

Production verification on Windows cmd:

py -m pip show phani-data-recon
where reconcile-accounts
py -m phani_data_recon.cli --sap input/sap_accounts.csv --sf input/sf_accounts.csv --dry-run

Expected state:

The installed version should match the release you intended to install (1.0.4 for this release).
If where reconcile-accounts returns nothing but module execution works, only PATH needs to be fixed.

Python API

Run reconciliation from another Python application:

from phani_data_recon.api import run_reconciliation

exit_code = run_reconciliation(
    sap="input/sap_accounts.csv",
    sf="input/sf_accounts.csv",
    config="config/rules.yaml",
    output_dir="output/api_run",
    formats=["excel", "html"],
    dry_run=False,
    no_fuzzy=False,
    verbose=True,
)

print(exit_code)

The API mirrors the CLI behavior and returns a process-style exit code.
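
One way to act on the returned exit code in a calling application is sketched below. It assumes that 0 means success, as with ordinary process exit codes; the README does not list the specific non-zero values. The helper run_or_exit and the stand-in fake_runner are hypothetical and exist only so the example runs without the package installed.

```python
import sys


def run_or_exit(runner, **kwargs):
    """Call a reconciliation runner and stop the program on a non-zero code."""
    exit_code = runner(**kwargs)
    if exit_code != 0:
        # Assumption: 0 means success, as with ordinary process exit codes.
        sys.exit(exit_code)
    return exit_code


# Stand-in runner so the sketch is runnable on its own; in real use,
# pass phani_data_recon.api.run_reconciliation instead.
def fake_runner(**kwargs):
    return 0


run_or_exit(fake_runner, sap="input/sap_accounts.csv", sf="input/sf_accounts.csv")
```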

Options

--sap         Path to SAP accounts CSV (optional if config input.sap is set)
--sf          Path to Salesforce accounts CSV (optional if config input.sf is set)
--config      Path to rules YAML (default: ./config/rules.yaml, then packaged default)
--output-dir  Output directory (default: from config)
--formats     excel html (default: both)
--dry-run     Validate config + headers only; no report written
--no-fuzzy    Skip fuzzy matching (faster for large files)
--verbose     Verbose logging

Path resolution precedence:

If --sap / --sf are passed, CLI values are used.
If not passed, values are resolved from config/rules.yaml under input.sap and input.sf.
If --config is not passed, the CLI tries local config/rules.yaml first and then the packaged default config.
If --output-dir is passed, it overrides output.report.directory.
If neither CLI nor config provides paths, the run exits with an input path error.
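
The precedence rules above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the package's code: resolve_input and resolve_output_dir are hypothetical names, and the config is assumed to be the already-parsed rules.yaml as a dictionary.

```python
import os


def resolve_input(cli_value, config, source):
    """Pick the input path for 'sap' or 'sf': CLI value first, then config."""
    if cli_value:
        return cli_value
    entry = config.get("input", {}).get(source) or {}
    if entry.get("file_name"):
        return os.path.join(entry.get("directory", ""), entry["file_name"])
    raise ValueError(f"no input path for {source!r} from CLI or config")


def resolve_output_dir(cli_value, config):
    """--output-dir, when given, overrides output.report.directory."""
    if cli_value:
        return cli_value
    return config.get("output", {}).get("report", {}).get("directory")


config = {
    "input": {"sap": {"directory": "input", "file_name": "sap_accounts.csv"}},
    "output": {"report": {"directory": "output"}},
}
print(resolve_input(None, config, "sap"))               # taken from config
print(resolve_input("data/sap.csv", config, "sap"))     # CLI value wins
print(resolve_output_dir("output/custom_run", config))  # CLI value wins
```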

Configuration

Edit config/rules.yaml to change:

Default input files via input.sap and input.sf (directory + file_name)
Join key columns (SAP ↔ SF linking fields)
Fallback-key matching toggle via join.fallback.enabled (default: false = primary-key-only matching)
Field comparison rules, severity levels, and normalize modes
Deduplication strategy (keep_first / keep_last / flag_all)
Fuzzy match threshold and fields
Output formats and directory
Output report location/name via output.report.directory + output.report.file_name

When using the package outside this repository, pass your own config file with --config if you do not want to rely on the packaged defaults.
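
To make the three deduplication strategy names concrete, here is a minimal sketch of one plausible reading of them. The semantics are an assumption (the README names the strategies but does not define them), and the function dedupe and the is_duplicate flag are hypothetical.

```python
from collections import Counter


def dedupe(rows, key, strategy="keep_first"):
    """Deduplicate a list of dict rows on `key` using one of three strategies."""
    if strategy == "flag_all":
        counts = Counter(row[key] for row in rows)
        # Keep every row, marking those whose key occurs more than once.
        return [dict(row, is_duplicate=counts[row[key]] > 1) for row in rows]
    if strategy not in ("keep_first", "keep_last"):
        raise ValueError(f"unknown strategy: {strategy}")
    kept = {}
    for row in rows:
        # keep_last overwrites earlier rows; keep_first ignores later ones.
        if strategy == "keep_last" or row[key] not in kept:
            kept[row[key]] = row
    return list(kept.values())


rows = [
    {"SAP_Unique_ID": "1001", "Name": "Acme"},
    {"SAP_Unique_ID": "1001", "Name": "Acme GmbH"},
    {"SAP_Unique_ID": "1002", "Name": "Zenith"},
]
print(dedupe(rows, "SAP_Unique_ID", "keep_first"))
print(dedupe(rows, "SAP_Unique_ID", "keep_last"))
print(dedupe(rows, "SAP_Unique_ID", "flag_all"))
```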

Output Report Configuration

Use the output.report block in config/rules.yaml to control where reports are written and what base filename is used.

output:
  formats: ["excel", "html"]
  report:
    directory: "output/month_end"
    file_name: "customer_reconciliation"

This writes reports under output/month_end/ using customer_reconciliation as the base name, for example:

output/month_end/customer_reconciliation_<run_id>.html
output/month_end/customer_reconciliation_<run_id>.xlsx

Rules:

--output-dir overrides output.report.directory
output.report.file_name sets the report filename prefix
output.formats selects Excel, HTML, or both
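
The naming rules above can be illustrated with a short sketch that builds the report paths from a parsed config. The function report_paths is hypothetical, the run_id value is made up for the example (the README does not describe its format), and the extension mapping follows the example file names shown above.

```python
from pathlib import PurePosixPath

# Assumption: format names map to these extensions, as in the example above.
EXTENSIONS = {"excel": ".xlsx", "html": ".html"}


def report_paths(config, run_id, output_dir=None):
    """Build one report path per configured format."""
    out = config["output"]
    # An explicit output_dir plays the role of --output-dir.
    directory = PurePosixPath(output_dir or out["report"]["directory"])
    base = f'{out["report"]["file_name"]}_{run_id}'
    return [str(directory / (base + EXTENSIONS[fmt])) for fmt in out["formats"]]


config = {
    "output": {
        "formats": ["excel", "html"],
        "report": {"directory": "output/month_end",
                   "file_name": "customer_reconciliation"},
    }
}
print(report_paths(config, "20260511"))
print(report_paths(config, "20260511", output_dir="output/ad_hoc_run"))
```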

Example commands:

# Use output settings from config
reconcile-accounts --config config/rules.yaml

# Override only the output directory at runtime
reconcile-accounts --config config/rules.yaml --output-dir output/ad_hoc_run

Config Reference (Input + Join)

input:
  sap:
    directory: "input"
    file_name: "sap_accounts.csv"
  sf:
    directory: "input"
    file_name: "sf_accounts.csv"

join:
  primary:
    sap_col: "SAP_Unique_ID"
    sf_col:  "BP_PowerCerv_Account_Id__c"
  fallback:
    enabled: false
    sap_col: "SAP_Unique_ID"
    sf_col:  "WC_SAP_Identification__c"

output:
  formats: ["excel", "html"]
  report:
    directory: "output"
    file_name: "reconciliation_report"

Notes:

Set join.fallback.enabled: false for strict primary-key-only matching (default).
Set join.fallback.enabled: true only when you explicitly want fallback-key matching.
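
The effect of the fallback toggle can be shown with a small sketch over plain dictionaries. This is an assumed interpretation of primary and fallback matching, not the package's actual logic. The function match_records is hypothetical, and only the column names come from the config reference above.

```python
def match_records(sap_rows, sf_rows, join):
    """Link SAP rows to SF rows by primary key, optionally by fallback key."""
    def index(rows, col):
        # Rows with an empty key are left out of the lookup.
        return {row[col]: row for row in rows if row.get(col)}

    primary, fallback = join["primary"], join["fallback"]
    sf_primary = index(sf_rows, primary["sf_col"])
    sf_fallback = index(sf_rows, fallback["sf_col"]) if fallback["enabled"] else {}

    matched, sap_only = [], []
    for sap in sap_rows:
        sf = sf_primary.get(sap.get(primary["sap_col"]))
        if sf is None:
            sf = sf_fallback.get(sap.get(fallback["sap_col"]))
        if sf is None:
            sap_only.append(sap)
        else:
            matched.append((sap, sf))
    return matched, sap_only


join = {
    "primary": {"sap_col": "SAP_Unique_ID", "sf_col": "BP_PowerCerv_Account_Id__c"},
    "fallback": {"enabled": False, "sap_col": "SAP_Unique_ID",
                 "sf_col": "WC_SAP_Identification__c"},
}
sap = [{"SAP_Unique_ID": "1001"}, {"SAP_Unique_ID": "1002"}]
sf = [
    {"BP_PowerCerv_Account_Id__c": "1001", "WC_SAP_Identification__c": ""},
    {"BP_PowerCerv_Account_Id__c": "", "WC_SAP_Identification__c": "1002"},
]
matched, sap_only = match_records(sap, sf, join)
print(len(matched), len(sap_only))  # 1 1 with the fallback disabled
```

With enabled set to true in the same example, the second SAP row would be linked through WC_SAP_Identification__c instead of landing in the unmatched list.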

Report Tabs

Tab	Content
Summary	KPI counts, match rate, exception rate
Exact_Matches	Records found in both systems
Field_Mismatches	Field-level diffs (CRITICAL / HIGH / INFO)
SAP_Only	SAP records missing from Salesforce
SF_Only	Salesforce records missing from SAP
SAP_Duplicates	Duplicate SAP rows before dedup
SF_Duplicates	Duplicate SF rows before dedup
Fuzzy_Match_Candidates	Likely-same accounts not linked by ID
Data_Quality_Issues	Null IDs, bad formats, validation failures
Action_Plan	P1–P4 prioritised remediation table
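
For readers wondering how a Fuzzy_Match_Candidates tab could be populated, the sketch below scores account names with the standard library's difflib. The README does not say which algorithm or threshold the package uses, so the function fuzzy_candidates, the 0.85 threshold, and the name normalisation are all assumptions.

```python
from difflib import SequenceMatcher


def fuzzy_candidates(sap_names, sf_names, threshold=0.85):
    """Return (sap_name, sf_name, score) tuples at or above the threshold."""
    def norm(name):
        # Lower-case and collapse whitespace before comparing.
        return " ".join(name.lower().split())

    pairs = []
    for sap in sap_names:
        for sf in sf_names:
            score = SequenceMatcher(None, norm(sap), norm(sf)).ratio()
            if score >= threshold:
                pairs.append((sap, sf, round(score, 3)))
    # Best candidates first
    return sorted(pairs, key=lambda p: p[2], reverse=True)


print(fuzzy_candidates(["Acme Corp"], ["ACME  corp", "Zenith Ltd"]))
```

Comparing every SAP name with every Salesforce name grows quadratically with file size, which is the kind of cost a flag such as --no-fuzzy lets you avoid on large inputs.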

Run Tests

py -m pip install pytest
py -m pytest tests/ -v

Distribution (Business Rollout)

# Build wheel + source distribution
py -m pip install build
py -m build

# Install locally from wheel (adjust the version in the file name to match your build)
py -m pip install dist/phani_data_recon-1.0.3-py3-none-any.whl

If reconcile-accounts is not on PATH, run:

py -m phani_data_recon.cli --dry-run

Published package:

py -m pip install --upgrade phani-data-recon

Legacy script usage inside this repository still works:

python run_reconciliation.py --dry-run

CI Publishing (GitHub Actions)

This repository includes .github/workflows/publish-pypi.yml to publish new releases to PyPI without storing a PyPI API token in GitHub.

One-time PyPI setup:

In PyPI, open the phani-data-recon project settings.
Add a Trusted Publisher for this GitHub repository.
Set the workflow name to publish-pypi.yml.
Set the environment name to pypi.

Release flow:

Bump the version in pyproject.toml.
Create a GitHub release or run the workflow manually from the Actions tab.
The workflow builds dist/ artifacts and publishes them with PyPI trusted publishing.

Notes:

This workflow uses GitHub OIDC via id-token: write, so no TWINE_PASSWORD secret is required in GitHub.
Reserve local twine uploads for manual emergency releases.

Project Structure

reconciliation_project/
├── input/           ← Place source CSVs here
├── config/          ← rules.yaml + schema
├── src/             ← All Python modules
├── templates/       ← Jinja2 HTML template
├── tests/           ← pytest test suite
├── output/          ← Reports generated here
└── run_reconciliation.py

Project details

Release history

1.0.7 (May 11, 2026)
1.0.6 (May 11, 2026)
1.0.5 (May 11, 2026)
1.0.4 (May 11, 2026), this version
1.0.2 (May 11, 2026)
1.0.1 (May 11, 2026)
1.0.0 (May 11, 2026)

Download files

Source Distribution

phani_data_recon-1.0.4.tar.gz (36.0 kB view details)

Uploaded May 11, 2026 Source

Built Distribution

phani_data_recon-1.0.4-py3-none-any.whl (33.5 kB view details)

Uploaded May 11, 2026 Python 3

File details

Details for the file phani_data_recon-1.0.4.tar.gz.

File metadata

Download URL: phani_data_recon-1.0.4.tar.gz
Upload date: May 11, 2026
Size: 36.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for phani_data_recon-1.0.4.tar.gz
Algorithm	Hash digest
SHA256	`2bdc7c62e274a12c4790c471138b94d7b0a01d778845c7b0945b268e54117042`
MD5	`673d818802492c522bbab1909fc45ca2`
BLAKE2b-256	`4877a74fc364fbe6e98b643e903292e6f134b07a69b84fcac98319c5ad63f9d8`
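
A downloaded file can be checked against the published SHA256 digest with a few lines of standard-library Python. This is a minimal sketch: the helper names sha256_of and verify are hypothetical, and the file is assumed to be in the current directory when you call verify.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published above for phani_data_recon-1.0.4.tar.gz
EXPECTED = "2bdc7c62e274a12c4790c471138b94d7b0a01d778845c7b0945b268e54117042"


def verify(path, expected=EXPECTED):
    """Return True when the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, verify("phani_data_recon-1.0.4.tar.gz") should return True for an intact download of the source distribution.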

Provenance

The following attestation bundles were made for phani_data_recon-1.0.4.tar.gz:

Publisher: publish-pypi.yml on phanimca/PYTHON_PH_ACCOUNT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: phani_data_recon-1.0.4.tar.gz
- Subject digest: 2bdc7c62e274a12c4790c471138b94d7b0a01d778845c7b0945b268e54117042
- Sigstore transparency entry: 1507650924
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: phanimca/PYTHON_PH_ACCOUNT@26aefc76e664459dead5c5821caf5975b313311b
- Branch / Tag: refs/tags/v1.0.4
- Owner: https://github.com/phanimca
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@26aefc76e664459dead5c5821caf5975b313311b
- Trigger Event: release

File details

Details for the file phani_data_recon-1.0.4-py3-none-any.whl.

File metadata

Download URL: phani_data_recon-1.0.4-py3-none-any.whl
Upload date: May 11, 2026
Size: 33.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for phani_data_recon-1.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`528f0994e9f95c8e9128351b39872faf61f45c06295e79a47b7af7721810f3b0`
MD5	`2c362707c9e68c7eb18f3e634df4f674`
BLAKE2b-256	`b63261882ee8afa0e2c1eb080695b17dcd7cb31f8774a3963929573778f8b69b`

Provenance

The following attestation bundles were made for phani_data_recon-1.0.4-py3-none-any.whl:

Publisher: publish-pypi.yml on phanimca/PYTHON_PH_ACCOUNT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: phani_data_recon-1.0.4-py3-none-any.whl
- Subject digest: 528f0994e9f95c8e9128351b39872faf61f45c06295e79a47b7af7721810f3b0
- Sigstore transparency entry: 1507651148
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: phanimca/PYTHON_PH_ACCOUNT@26aefc76e664459dead5c5821caf5975b313311b
- Branch / Tag: refs/tags/v1.0.4
- Owner: https://github.com/phanimca
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@26aefc76e664459dead5c5821caf5975b313311b
- Trigger Event: release
