A dependency-free audit trail system for Python data transformations.
CleanCore Python
A lightweight, dependency-free audit trail system for Python data pipelines.
CleanCore automatically creates immutable, row-level audit logs for every data transformation. It helps with debugging, compliance, and understanding how your data changes across cleaning steps.
Features
- Automatic row-level audit logs
- Zero external dependencies (pure Python)
- JSON-based audit output for compliance and record keeping
- Works with lists, dictionaries, and CSV-style data
- Simple decorator-based API
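To make the decorator-based idea concrete, here is a minimal pure-Python sketch of how a row-level audit decorator could work. It is not CleanCore's implementation; `simple_audit`, the `events` list, and the event field names are hypothetical and chosen only for illustration. It also assumes rows keep their positions between input and output.

```python
import functools
from datetime import datetime, timezone


def simple_audit(rule_id, events):
    """Hypothetical decorator: appends one audit event per call to `events`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(rows):
            # Snapshot the input so later mutation cannot alter the record.
            before = [dict(row) for row in rows]
            after = func(rows)
            # Compare rows position by position to count changes.
            changed = sum(1 for old, new in zip(before, after) if old != new)
            events.append({
                "transformation": func.__name__,
                "rule_id": rule_id,
                "rows_before": len(before),
                "rows_after": len(after),
                "changed_rows": changed,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
            return after
        return wrapper
    return decorator


events = []


@simple_audit("UPPERCASE_NAMES", events)
def uppercase_names(rows):
    return [{**row, "name": row["name"].upper()} for row in rows]


result = uppercase_names([{"name": "ada"}, {"name": "BOB"}])
```

After the call, `events` holds a single entry describing the transformation, which is the kind of record the features above refer to.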
Installation
pip install cleancore-python
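After installing, one way to confirm which version is present is to query the distribution metadata with the standard library. This is a generic sketch, not something CleanCore provides; `installed_version` is a helper name introduced here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# The distribution name matches the one used with pip above.
current = installed_version("cleancore-python")
print(current if current else "cleancore-python is not installed")
```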
Quick Start
Audit a Single Function
from cleancore import audit_trail, ProvenaLogger, generate_terminal_report

@audit_trail(rule_id="GDPR_EMAIL_MASKING")
def clean_emails(data):
    result = []
    for row in data:
        new_row = row.copy()  # copy so the input row is not mutated
        # Mask any value that contains an "@"
        if '@' in new_row.get('email', ''):
            new_row['email'] = '***@***.***'
        result.append(new_row)
    return result

logger = ProvenaLogger("Single_Transformation")
data = [
    {'id': 1, 'email': 'test@example.com'},
    {'id': 2, 'email': 'user'}
]

# Pass the logger so this call is recorded
cleaned = clean_emails(data, provena_logger=logger)
print(generate_terminal_report(logger))
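The masking rule itself can be checked without CleanCore installed. The sketch below repeats the loop from `clean_emails` without the decorator; `mask_emails` is a name used only here.

```python
def mask_emails(rows):
    """Same masking rule as clean_emails, without the audit decorator."""
    result = []
    for row in rows:
        new_row = row.copy()  # leave the input rows untouched
        if "@" in new_row.get("email", ""):
            new_row["email"] = "***@***.***"
        result.append(new_row)
    return result


rows = [{"id": 1, "email": "test@example.com"}, {"id": 2, "email": "user"}]
masked = mask_emails(rows)
# Row 1 contains "@" and is masked; row 2 does not and is left as-is.
```

Only the first row changes, so an audit of this step should report one changed row out of two.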
Pipeline Usage
from cleancore import audit_pipeline, audit_trail
import csv

def load_data(path):
    # newline="" is the setting the csv module recommends for file objects
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

@audit_trail(rule_id="STANDARDIZE_NAMES")
def standardize_names(data):
    return data  # placeholder: add your own logic here

@audit_trail(rule_id="FILL_MISSING_VALUES")
def fill_missing(data):
    return data  # placeholder: add your own logic here

# Every step shares the logger created by the pipeline context
with audit_pipeline("Customer_Onboarding_Pipeline") as logger:
    data = load_data("customers.csv")
    data = standardize_names(data, provena_logger=logger)
    data = fill_missing(data, provena_logger=logger)
    logger.export_json("customer_pipeline_audit.json")
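In the pipeline example, `standardize_names` and `fill_missing` return their input unchanged. As an illustration of what the bodies might contain, here is a sketch with the decorator omitted so it runs without the library. The column names `name` and `country` and the default value `"UNKNOWN"` are assumptions about `customers.csv`, not part of CleanCore.

```python
def standardize_names(data):
    """Collapse whitespace and title-case the hypothetical `name` column."""
    result = []
    for row in data:
        new_row = dict(row)
        new_row["name"] = " ".join(new_row.get("name", "").split()).title()
        result.append(new_row)
    return result


def fill_missing(data, default="UNKNOWN"):
    """Replace an empty or missing `country` value with a default."""
    result = []
    for row in data:
        new_row = dict(row)
        if not new_row.get("country"):
            new_row["country"] = default
        result.append(new_row)
    return result


rows = [{"name": "  ada   lovelace ", "country": ""}]
cleaned = fill_missing(standardize_names(rows))
```

Both functions build new rows rather than editing the input in place, which keeps a before/after comparison meaningful.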
Output
CleanCore generates a human-readable terminal report and a machine-readable JSON audit log containing:
- Transformation name
- Rule ID
- Rows before and after
- Number of changed rows
- Sample value changes
- Execution timestamps
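To show what "sample value changes" could mean in practice, the following sketch collects a few field-level differences between two row lists. It is a hedged illustration, not CleanCore's code or output format: `sample_changes` and the keys `row`, `field`, `old`, and `new` are hypothetical, and rows are assumed to keep their positions.

```python
def sample_changes(before, after, limit=3):
    """Collect up to `limit` field-level differences between two row lists."""
    samples = []
    for index, (old, new) in enumerate(zip(before, after)):
        # Sort the field names so the result order is deterministic.
        for field in sorted(set(old) | set(new)):
            if old.get(field) != new.get(field):
                samples.append({
                    "row": index,
                    "field": field,
                    "old": old.get(field),
                    "new": new.get(field),
                })
                if len(samples) == limit:
                    return samples
    return samples


before = [{"id": 1, "email": "test@example.com"}, {"id": 2, "email": "user"}]
after = [{"id": 1, "email": "***@***.***"}, {"id": 2, "email": "user"}]
changes = sample_changes(before, after)
```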
API Overview
- audit_trail – Decorator for auditing functions
- ProvenaLogger – Collects audit events
- audit_pipeline – Context manager for multi-step pipelines
- generate_terminal_report() – Generates the terminal summary
- export_json() – Saves the audit log to a JSON file
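For readers unfamiliar with the context-manager pattern, here is a rough sketch of how a pipeline context that hands out a shared logger could be structured. It does not reflect CleanCore's internals; `SketchLogger`, `sketch_pipeline`, and every attribute name are hypothetical.

```python
import json
from contextlib import contextmanager
from datetime import datetime, timezone


class SketchLogger:
    """Hypothetical stand-in for a logger that collects audit events."""

    def __init__(self, name):
        self.name = name
        self.events = []
        self.started_at = None
        self.finished_at = None

    def record(self, event):
        self.events.append(event)

    def to_json(self):
        return json.dumps({
            "pipeline": self.name,
            "started_at": self.started_at,
            "finished_at": self.finished_at,
            "events": self.events,
        }, indent=2)


@contextmanager
def sketch_pipeline(name):
    logger = SketchLogger(name)
    logger.started_at = datetime.now(timezone.utc).isoformat()
    try:
        yield logger
    finally:
        # Runs even if a step raises, so the end time is always recorded.
        logger.finished_at = datetime.now(timezone.utc).isoformat()


with sketch_pipeline("Demo") as logger:
    logger.record({"rule_id": "STEP_ONE", "changed_rows": 0})

report = json.loads(logger.to_json())
```

The `try`/`finally` is the main design point: the pipeline's end time is captured whether or not a step fails.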
Source Code
GitHub Repository
https://github.com/Sidra-009/cleancore-python-library
Issues, feature requests, and pull requests are welcome.
License
MIT License. See the LICENSE file for details.
Download files
File details
Details for the file cleancore_python-0.1.2.tar.gz.
File metadata
- Download URL: cleancore_python-0.1.2.tar.gz
- Upload date:
- Size: 8.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c3541a235b341662482008d7296312d0cf5c0395fd51d19536a99b48ba7ce5d |
| MD5 | 827d5a50595af6dd34e3ae379c6e1238 |
| BLAKE2b-256 | 1c7097907182a073795b1a658996577d06a7209a45cc2928c49a52370189fa79 |
File details
Details for the file cleancore_python-0.1.2-py3-none-any.whl.
File metadata
- Download URL: cleancore_python-0.1.2-py3-none-any.whl
- Upload date:
- Size: 10.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 14a57c51f4e4a9386144b27d8509e7b70b051ee64409dbc40d4e299a29eacbec |
| MD5 | 5868630ab2678c24854583879597d25a |
| BLAKE2b-256 | 05a287be07efccb3c2ee2d649dc2ade75596dec656d7bddc168638d702b7b744 |
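The published digests can be used to check a downloaded file. The sketch below uses only the standard library's `hashlib`; the helper names `sha256_of` and `matches_published` are introduced here for illustration, and the file is assumed to be in the current directory when you call them.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published("cleancore_python-0.1.2.tar.gz", "2c3541a235b341662482008d7296312d0cf5c0395fd51d19536a99b48ba7ce5d")` should return `True` for an intact copy of the source distribution.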