SAP <-> Salesforce Account Data Reconciliation Utility
Reconciles SAP and Salesforce Account master data at bulk scale (300K–400K records). It produces a 10-tab Excel workbook and an HTML dashboard with KPIs, field-level diffs, fuzzy match candidates, and a prioritised action plan.
Quick Start
# 1. Install package in editable mode (library + CLI)
pip install -e .
# 2. Place your CSV files in input/
# input/sap_accounts.csv
# input/sf_accounts.csv
# 3. Run using installed CLI command
reconcile-accounts --sap input/sap_accounts.csv --sf input/sf_accounts.csv
# 4. Or run using config input paths (config/rules.yaml -> input.sap/input.sf)
reconcile-accounts
# Output written to output/
Legacy script invocation still works:
python run_reconciliation.py
Options
--sap          Path to SAP accounts CSV (optional if config input.sap is set)
--sf           Path to Salesforce accounts CSV (optional if config input.sf is set)
--config       Path to rules YAML (default: ./config/rules.yaml, then packaged default)
--output-dir   Output directory (default: from config)
--formats      excel html (default: both)
--dry-run      Validate config + headers only; no report written
--no-fuzzy     Skip fuzzy matching (faster for large files)
--verbose      Verbose logging
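The header check that --dry-run describes could look roughly like the sketch below. This is not the package's actual code: `missing_columns` is a hypothetical helper and `Country` is a made-up column used only for illustration. It reads only the first row, so it stays cheap even for very large files.

```python
import csv
import io


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header row."""
    # Only the header row is consumed; the data rows are never read.
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [col for col in required if col not in present]


# In-memory sample standing in for input/sap_accounts.csv
sample = io.StringIO("SAP_Unique_ID,Name\n1001,Acme\n")
print(missing_columns(sample, ["SAP_Unique_ID", "Country"]))  # ['Country']
```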
Path resolution precedence:
- If --sap/--sf are passed, CLI values are used.
- If not passed, values are resolved from config/rules.yaml under input.sap and input.sf.
- If neither CLI nor config provides paths, the run exits with an input path error.
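The precedence rules above can be sketched as a small helper. This is a minimal illustration under assumptions, not the package's implementation: `resolve_input_path` and `InputPathError` are hypothetical names, and the config is assumed to be a plain dict shaped like the input section of rules.yaml.

```python
from pathlib import Path
from typing import Optional


class InputPathError(Exception):
    """Hypothetical error for the case where no input path can be resolved."""


def resolve_input_path(cli_value: Optional[str], config: dict, system: str) -> Path:
    """Resolve the CSV path for `system` ("sap" or "sf").

    Precedence: CLI value first, then config input.<system>, otherwise an error.
    """
    if cli_value:
        return Path(cli_value)
    entry = config.get("input", {}).get(system) or {}
    file_name = entry.get("file_name")
    if file_name:
        # Assumption of this sketch: a missing directory means the current directory.
        return Path(entry.get("directory", ".")) / file_name
    raise InputPathError(
        f"No input path for {system!r}: pass --{system} or set input.{system}"
    )


config = {"input": {"sap": {"directory": "input", "file_name": "sap_accounts.csv"}}}
print(resolve_input_path(None, config, "sap"))            # config value is used
print(resolve_input_path("data/sap.csv", config, "sap"))  # CLI value wins
```

Calling `resolve_input_path(None, config, "sf")` with this config raises `InputPathError`, mirroring the third rule.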
Configuration
Edit config/rules.yaml to change:
- Default input files via input.sap and input.sf (directory + file_name)
- Join key columns (SAP ↔ SF linking fields)
- Fallback-key matching toggle via join.fallback.enabled (default: false = primary-key-only matching)
- Field comparison rules, severity levels, and normalize modes
- Deduplication strategy (keep_first / keep_last / flag_all)
- Fuzzy match threshold and fields
- Output formats and directory
- Output report location/name via output.report.directory + output.report.file_name
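To make the three deduplication strategies concrete, here is a minimal sketch over plain dict rows. The function name `deduplicate` is hypothetical, and the `flag_all` behaviour shown (report every duplicated row and keep none of them) is an assumption of this sketch; the package may treat it differently.

```python
from collections import defaultdict

STRATEGIES = {"keep_first", "keep_last", "flag_all"}


def deduplicate(rows, key, strategy="keep_first"):
    """Return (kept_rows, duplicate_rows) for a list of dict rows."""
    if strategy not in STRATEGIES:
        raise ValueError(f"Unknown strategy: {strategy}")

    # Group rows by key value; dicts preserve first-seen order of keys.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    kept, duplicates = [], []
    for group in groups.values():
        if len(group) == 1:
            kept.append(group[0])
            continue
        duplicates.extend(group)  # every row of a duplicated key is reported
        if strategy == "keep_first":
            kept.append(group[0])
        elif strategy == "keep_last":
            kept.append(group[-1])
        # flag_all (assumed semantics): keep none of the duplicated rows
    return kept, duplicates


rows = [
    {"SAP_Unique_ID": "A1", "Name": "Acme"},
    {"SAP_Unique_ID": "A1", "Name": "Acme Ltd"},
    {"SAP_Unique_ID": "B2", "Name": "Globex"},
]
kept, dups = deduplicate(rows, "SAP_Unique_ID", "keep_last")
print([r["Name"] for r in kept])  # ['Acme Ltd', 'Globex']
print(len(dups))                  # 2
```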
Config Reference (Input + Join)
input:
  sap:
    directory: "input"
    file_name: "sap_accounts.csv"
  sf:
    directory: "input"
    file_name: "sf_accounts.csv"
join:
  primary:
    sap_col: "SAP_Unique_ID"
    sf_col: "BP_PowerCerv_Account_Id__c"
  fallback:
    enabled: false
    sap_col: "SAP_Unique_ID"
    sf_col: "WC_SAP_Identification__c"
output:
  formats: ["excel", "html"]
  report:
    directory: "output"
    file_name: "reconciliation_report"
Notes:
- Set join.fallback.enabled: false for strict primary-key-only matching (default).
- Set join.fallback.enabled: true only when you explicitly want fallback-key matching.
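The difference between the two settings can be illustrated with a small sketch. It assumes the join section has been loaded into a dict and that rows are plain dicts; `match_accounts` is a hypothetical name, not the package's API, and the real matching logic may differ.

```python
def match_accounts(sap_rows, sf_rows, join_cfg):
    """Match SAP rows to SF rows by primary key, then optionally by fallback key."""

    def index(rows, col):
        # Map non-empty key values to rows; a later row overwrites an earlier one.
        return {r[col]: r for r in rows if r.get(col)}

    primary = join_cfg["primary"]
    fallback = join_cfg.get("fallback", {})
    sf_by_primary = index(sf_rows, primary["sf_col"])
    sf_by_fallback = (
        index(sf_rows, fallback["sf_col"]) if fallback.get("enabled") else {}
    )

    matches, sap_only = [], []
    for sap in sap_rows:
        sf = sf_by_primary.get(sap.get(primary["sap_col"]))
        matched_on = "primary"
        if sf is None and sf_by_fallback:
            sf = sf_by_fallback.get(sap.get(fallback["sap_col"]))
            matched_on = "fallback"
        if sf is None:
            sap_only.append(sap)
        else:
            matches.append((sap, sf, matched_on))
    return matches, sap_only


join_cfg = {
    "primary": {"sap_col": "SAP_Unique_ID", "sf_col": "BP_PowerCerv_Account_Id__c"},
    "fallback": {
        "enabled": False,
        "sap_col": "SAP_Unique_ID",
        "sf_col": "WC_SAP_Identification__c",
    },
}
sap_rows = [{"SAP_Unique_ID": "1001"}, {"SAP_Unique_ID": "1002"}]
sf_rows = [
    {"BP_PowerCerv_Account_Id__c": "1001", "WC_SAP_Identification__c": ""},
    {"BP_PowerCerv_Account_Id__c": "", "WC_SAP_Identification__c": "1002"},
]

matches, sap_only = match_accounts(sap_rows, sf_rows, join_cfg)
print(len(matches), len(sap_only))  # 1 1
```

With `enabled` set to true in the same sample data, the second SAP row would be linked through the fallback column instead of landing in the SAP-only bucket.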
Report Tabs
| Tab | Content |
|---|---|
| Summary | KPI counts, match rate, exception rate |
| Exact_Matches | Records found in both systems |
| Field_Mismatches | Field-level diffs (CRITICAL / HIGH / INFO) |
| SAP_Only | SAP records missing from Salesforce |
| SF_Only | Salesforce records missing from SAP |
| SAP_Duplicates | Duplicate SAP rows before dedup |
| SF_Duplicates | Duplicate SF rows before dedup |
| Fuzzy_Match_Candidates | Likely-same accounts not linked by ID |
| Data_Quality_Issues | Null IDs, bad formats, validation failures |
| Action_Plan | P1–P4 prioritised remediation table |
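As an illustration of how the Fuzzy_Match_Candidates tab might be populated, the sketch below scores unmatched SAP rows against unmatched Salesforce rows using the standard library's `difflib.SequenceMatcher`. The package's actual fuzzy matcher is not documented here, so the scoring method, the `Name` field, and the 0.85 threshold are all assumptions. The nested loop is quadratic in the number of unmatched rows, which is the kind of cost --no-fuzzy avoids on large files.

```python
from difflib import SequenceMatcher


def normalise(value):
    """Lowercase and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(str(value).lower().split())


def fuzzy_candidates(sap_only, sf_only, field="Name", threshold=0.85):
    """Return (sap_row, sf_row, score) tuples scoring at or above the threshold."""
    candidates = []
    for sap in sap_only:
        a = normalise(sap.get(field, ""))
        if not a:
            continue  # empty values would otherwise score 1.0 against each other
        for sf in sf_only:
            b = normalise(sf.get(field, ""))
            if not b:
                continue
            score = SequenceMatcher(None, a, b).ratio()
            if score >= threshold:
                candidates.append((sap, sf, round(score, 3)))
    # Highest-scoring pairs first
    return sorted(candidates, key=lambda c: c[2], reverse=True)


sap_only = [{"Name": "Acme Corporation"}]
sf_only = [{"Name": "ACME  Corporation"}, {"Name": "Globex Inc"}]
for sap, sf, score in fuzzy_candidates(sap_only, sf_only):
    print(sap["Name"], "<->", sf["Name"], score)
```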
Run Tests
pip install pytest
python -m pytest tests/ -v
Distribution (Business Rollout)
# Build wheel + source distribution
python -m build
# Install locally from wheel
pip install dist/phani_data_recon-1.0.0-py3-none-any.whl
If reconcile-accounts is not on PATH, run:
python -m phani_data_recon.cli --dry-run
Project Structure
reconciliation_project/
├── input/ ← Place source CSVs here
├── config/ ← rules.yaml + schema
├── src/ ← All Python modules
├── templates/ ← Jinja2 HTML template
├── tests/ ← pytest test suite
├── output/ ← Reports generated here
└── run_reconciliation.py