Zero-overhead data observability with row-level immutability

CleanCore 🔍

Stop shipping unobserved data. cleancore is a zero-dependency, high-performance tool to inject row-level immutability and audit trails into your Python pipelines.

Think of it like Git for your Data Rows.


Why CleanCore?

Data pipelines often fail silently. cleancore automates the observability step by tracking every row mutation and flagging schema drift before it breaks your production models.

Key Features

Feature          What it does
Audit Trail      Decorator that captures row-level changes (Old -> New).
Schema Sentinel  Flags type drifts (e.g., int -> str) and null regressions.
Big Data Engine  Chunk-based processing (10k batches) to prevent memory crashes.
Zero Config      Works with plain Python Lists, Generators, and Pandas out of the box.
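
To make the chunk-based idea concrete, here is a minimal standard-library sketch of batching an arbitrary iterable. It is not cleancore's internal code: the helper name iter_chunks is hypothetical, and the default size of 10,000 reads "10k batches" as batches of 10,000 rows, which is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(rows: Iterable[T], size: int = 10_000) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` rows from any iterable.

    Hypothetical helper for illustration; not part of the cleancore API.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        # islice pulls only the next `size` items, so memory use stays bounded.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# A generator source: the full dataset is never materialized at once.
rows = ({"id": i} for i in range(25_000))
chunk_sizes = [len(chunk) for chunk in iter_chunks(rows)]
print(chunk_sizes)  # [10000, 10000, 5000]
```

Because only one chunk is held at a time, peak memory depends on the batch size rather than on the total number of rows.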

[+] Installation

pip install cleancore

[+] Quick Start
from cleancore import audit_trail, ProvenaLogger

# Sample input (illustrative only): a list of dict rows
my_data = [{"id": 1, "email": "ada@example.com"}]

# 1. Wrap your transformation
@audit_trail(rule_id="MASK_PII")
def clean_step(data):
    for row in data:
        row['email'] = "***@***"
    return data

# 2. Run with automated reporting; the logger is passed as a keyword argument
with ProvenaLogger("Production_Pipeline") as logger:
    processed = clean_step(my_data, provena_logger=logger)

# That's it. The summary dashboard prints automatically when the with block exits.
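
For readers curious how a decorator can record Old -> New values at row level, the following is a rough, self-contained sketch of the general technique. It does not reproduce cleancore's implementation: sketch_audit_trail and the changes list are hypothetical names, and the sketch assumes the data is a list of dict rows that the wrapped function returns in the same order.

```python
import copy
import functools


def sketch_audit_trail(rule_id, log):
    """Hypothetical decorator: append (rule, row index, field, old, new) tuples to `log`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(data):
            # Snapshot first, because the wrapped function may mutate rows in place.
            before = copy.deepcopy(data)
            after = func(data)
            for index, (old_row, new_row) in enumerate(zip(before, after)):
                # Compare every field present in either version of the row.
                for field in sorted(old_row.keys() | new_row.keys()):
                    old, new = old_row.get(field), new_row.get(field)
                    if old != new:
                        log.append((rule_id, index, field, old, new))
            return after
        return wrapper
    return decorator


changes = []


@sketch_audit_trail("MASK_PII", changes)
def mask_email(rows):
    for row in rows:
        row["email"] = "***@***"
    return rows


mask_email([{"id": 1, "email": "ada@example.com"}])
print(changes)  # [('MASK_PII', 0, 'email', 'ada@example.com', '***@***')]
```

The deep copy is the simplest way to keep the old values, but it doubles memory for the batch being processed, which is one reason to combine this kind of auditing with chunked input.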

[+] Schema Sentinel (Type Drift)
CleanCore catches silent killers in your data types:

[WARN] age: int -> str (Unexpected type swap)

[WARN] price: float -> NoneType (Null regression)
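
As an illustration of what a type-drift check involves, here is a small standalone sketch that compares current rows against a baseline and emits warnings in the same shape as the lines above. The function detect_type_drift and the baseline/current inputs are hypothetical; how cleancore itself learns the expected types is not documented here.

```python
def detect_type_drift(baseline_rows, current_rows):
    """Return warning strings for fields whose value type differs from the baseline.

    Hypothetical helper for illustration; not part of the cleancore API.
    """
    # Learn each field's expected type from its first non-null baseline value.
    expected = {}
    for row in baseline_rows:
        for field, value in row.items():
            if value is not None:
                expected.setdefault(field, type(value))

    warnings = []
    seen = set()  # report each (field, new type) pair only once
    for row in current_rows:
        for field, value in row.items():
            want = expected.get(field)
            got = type(value)
            if want is None or got is want or (field, got) in seen:
                continue
            seen.add((field, got))
            reason = "Null regression" if value is None else "Unexpected type swap"
            warnings.append(
                f"[WARN] {field}: {want.__name__} -> {got.__name__} ({reason})"
            )
    return warnings


baseline = [{"age": 31, "price": 9.99}]
current = [{"age": "31", "price": None}]
for line in detect_type_drift(baseline, current):
    print(line)
```

This sketch uses strict type identity, so a bool appearing in an int field would also be reported; a real checker may choose to be more lenient.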

[+] Contributing
CleanCore is open-source! Want to add a new audit rule or engine optimization?
Check out our GitHub Repository.

[+] License
MIT License - see LICENSE for details.
