SDMF - Standard Data Management Framework
This project has been archived by its maintainers. No new releases are expected.
A modular, scalable, and Python-based Data Management Framework designed to standardize data ingestion, validation, transformation, metadata handling, and storage across enterprise workflows.
This framework eliminates repetitive boilerplate and provides a consistent structure for building reliable, maintainable data pipelines.
✅ Key Features
- Modular Design – Plug-and-play components for ingestion, validation, transformation, and storage.
- Schema Alignment & Partitioning – Built-in support for CDC (Change Data Capture) and MERGE operations.
- Metadata Management – Centralized handling of feed specifications and lineage.
- Scalable – Works seamlessly with Spark, Delta Lake, and distributed environments like Databricks.
- Logging & Monitoring – Custom logging with retention and rotation policies.
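The plug-and-play design described above can be sketched as a chain of small, single-purpose stages. This is a hypothetical illustration of the pattern, not SDMF's actual API (the framework's real class and function names are not shown on this page):

```python
from typing import Callable

# Each stage is just a callable that takes and returns a list of records.
Stage = Callable[[list[dict]], list[dict]]

def ingest(records: list[dict]) -> list[dict]:
    # A real pipeline would read from a feed here; this sketch passes through.
    return records

def validate(records: list[dict]) -> list[dict]:
    # Drop records missing the required "id" field.
    return [r for r in records if "id" in r]

def transform(records: list[dict]) -> list[dict]:
    # Normalize a field value.
    return [{**r, "name": r.get("name", "").strip()} for r in records]

def run_pipeline(records: list[dict], stages: list[Stage]) -> list[dict]:
    # Stages are interchangeable: swap in a different validator or
    # transformer without touching the rest of the pipeline.
    for stage in stages:
        records = stage(records)
    return records

result = run_pipeline(
    [{"id": 1, "name": " alice "}, {"name": "no-id"}],
    [ingest, validate, transform],
)
# result == [{"id": 1, "name": "alice"}]
```

Because every stage shares one signature, components can be added, removed, or reordered per feed, which is the essence of the modular design listed above.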
📂 Project Structure
```
sdmf/
├── cli/               # Command-line interface for orchestration
├── config/            # Configurations (logging, paths, retention)
├── orchestrator/      # Pipeline orchestration logic
├── result_generator/  # Excel/report generation utilities
├── utils/             # Helper functions
└── ...
```
⚙️ Installation
Option 1 (Recommended): Editable Install
From the project root (where pyproject.toml is located):

```bash
pip install -e .
```

Option 2: Build a Wheel

```bash
python -m build
```

This writes a wheel to dist/, which can then be installed with pip.

Then run:

```bash
python -m sdmf.cli.main
```
🔗 Dependencies
Install required packages:
```bash
pip install pyspark==3.5.1 delta-spark==3.1.0
```
🚀 Usage
Run the main orchestrator:
```bash
python -m sdmf.cli.main --config config/config.ini --run_id <unique_run_id>
```
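A minimal sketch of how a CLI entry point might parse these two flags with the standard library's `argparse` (hypothetical; the real `sdmf.cli.main` implementation may differ):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Both flags are required, matching the invocation shown above.
    parser = argparse.ArgumentParser(prog="sdmf")
    parser.add_argument("--config", required=True, help="Path to config.ini")
    parser.add_argument("--run_id", required=True, help="Unique identifier for this run")
    return parser

args = build_parser().parse_args(
    ["--config", "config/config.ini", "--run_id", "run-001"]
)
# args.config == "config/config.ini", args.run_id == "run-001"
```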
🛠 Configuration
Update config.ini:
```ini
[DEFAULT]
outbound_directory_name=sdmf_outbound
log_directory_name=sdmf_logs
temp_log_location=/tmp/
file_hunt_path=/dbfs/FileStore/sdmf/
log_retention_policy_in_days=7

[FILES]
master_spec_name=master_specs.xlsx
```
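These settings can be read with the standard library's `configparser`; the following is a sketch of one plausible approach using the exact keys shown above, not necessarily how SDMF loads them internally:

```python
import configparser

# The same keys shown in the sample config.ini above.
SAMPLE = """
[DEFAULT]
outbound_directory_name=sdmf_outbound
log_directory_name=sdmf_logs
temp_log_location=/tmp/
file_hunt_path=/dbfs/FileStore/sdmf/
log_retention_policy_in_days=7

[FILES]
master_spec_name=master_specs.xlsx
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# getint() converts the retention setting to an integer for date math.
retention_days = config.getint("DEFAULT", "log_retention_policy_in_days")
master_spec = config.get("FILES", "master_spec_name")
# retention_days == 7, master_spec == "master_specs.xlsx"
```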
✅ Logging
- Logs are first written to `/tmp/sdmf_logs` for speed.
- After job completion, logs are moved to the final directory (`file_hunt_path`).
- Logs older than the retention policy (7 days by default) are cleaned up automatically.
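The move-then-clean behavior described above could be implemented roughly as follows. This is an illustrative sketch of the described policy, not SDMF's actual code; `finalize_logs` is a hypothetical helper name:

```python
import os
import shutil
import time

def finalize_logs(temp_dir: str, final_dir: str, retention_days: int = 7) -> None:
    """Move logs from the fast temp location to the final directory,
    then delete final-directory logs older than the retention window."""
    os.makedirs(final_dir, exist_ok=True)
    # Move every completed log file out of the temp location.
    for name in os.listdir(temp_dir):
        shutil.move(os.path.join(temp_dir, name), os.path.join(final_dir, name))
    # Remove files whose modification time exceeds the retention policy.
    cutoff = time.time() - retention_days * 86400
    for name in os.listdir(final_dir):
        path = os.path.join(final_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
```

Writing to local `/tmp` first avoids slow per-line writes to distributed storage; the single move at job completion amortizes that cost.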
✅ Best Practices
- Use editable install for development.
- Keep configs modular for different environments (Dev, QA, Prod).
- Use DBFS or Unity Catalog (UC) volumes for persistent storage in Databricks.
📌 Next Steps
- Add unit tests for core modules.
- Integrate structured logging (JSON) for ELK/Splunk.
- Enable compression for archived logs.
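For the structured-logging item above, a JSON formatter for the standard `logging` module might look like this (a sketch assuming plain `logging` is in use; field names are illustrative choices, not an SDMF or ELK requirement):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line for ELK/Splunk ingestion."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "timestamp": self.formatTime(record),
        })

# Format one record directly to show the resulting JSON line.
record = logging.LogRecord(
    "sdmf", logging.INFO, __file__, 1, "pipeline finished", None, None
)
line = JsonFormatter().format(record)
```

Attaching this formatter to a handler yields one JSON object per line, which log shippers can ingest without custom parsing rules.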
Download files
Source Distribution
Built Distribution
File details
Details for the file sdmf-0.1.0.tar.gz.
File metadata
- Download URL: sdmf-0.1.0.tar.gz
- Upload date:
- Size: 37.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 585f665df8f0ff9dbbd3adb2f24cede3312ff1dde882308fdefa9b2c29dcd880 |
| MD5 | c11e8eeddd6296358e61252496018bec |
| BLAKE2b-256 | 6b9bf425e87d5c7d7756590f15786914de1248d0abc589c9e419f3bd2373e77c |

File details
Details for the file sdmf-0.1.0-py3-none-any.whl.
File metadata
- Download URL: sdmf-0.1.0-py3-none-any.whl
- Upload date:
- Size: 59.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | aa41e3e57a235374783e6973e41620078a034866fe9d1beb1036606fa36c3971 |
| MD5 | 966ab870cfef53b75852f1dfc1da4964 |
| BLAKE2b-256 | ad3c34b5bab4276d309c683757cb8b96306ea5f7f66e9ae3a44423ac6db09f55 |