Zero-overhead data observability with row-level immutability

CleanCore

Stop shipping unobserved data.

cleancore is a zero-dependency, high-performance library that adds row-level immutability and audit trails to your Python pipelines.

Think of it like Git for your Data Rows.


Why CleanCore?

Data pipelines often fail silently. cleancore automates the observability phase by tracking every mutation and flagging schema drift before it breaks your production models.

Features:

  • Audit Trail: Captures row-level changes (Old -> New)
  • Schema Sentinel: Flags type drifts (e.g., int -> str) and null regressions
  • Big Data Engine: Chunk-based processing (batches of 10k rows) to prevent memory crashes
  • Zero Config: Works with plain Python Lists, Generators, and Pandas
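The chunk-based processing mentioned above can be illustrated with a minimal sketch in plain Python. This is not cleancore's internal implementation; `iter_chunks` and its `chunk_size` parameter are hypothetical names chosen for this example, and the default of 10,000 simply mirrors the batch size quoted in the feature list.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(rows: Iterable[T], chunk_size: int = 10_000) -> Iterator[List[T]]:
    """Hypothetical helper: yield lists of at most chunk_size rows from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    while True:
        # islice pulls only the next chunk_size items, so a generator
        # source is never fully loaded into memory.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# A generator of 25,000 rows is consumed in three batches.
source = ({"id": i} for i in range(25_000))
sizes = [len(chunk) for chunk in iter_chunks(source)]
print(sizes)  # [10000, 10000, 5000]
```

Because only one chunk is held at a time, peak memory depends on the batch size rather than on the total number of rows.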

Installation

pip install cleancore

Quick Start

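from cleancore import audit_trail, ProvenaLogger

# Tag every change made by this step with the rule that caused it.
@audit_trail(rule_id="MASK_PII")
def clean_step(data):
    # Mask the email field of each row in place.
    for row in data:
        row['email'] = "***@***"
    return data

my_data = [{"id": 1, "email": "test@mail.com"}]

# clean_step itself only takes data; the provena_logger keyword
# is handled by the @audit_trail wrapper.
with ProvenaLogger("Production_Pipeline") as logger:
    processed = clean_step(my_data, provena_logger=logger)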
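To make the "Old -> New" idea concrete, here is a rough sketch of how a row-level audit decorator could work in principle. It does not reproduce cleancore's code or API: `record_changes` and the shape of the log entries are assumptions made for illustration, and the sketch assumes the step receives a list of dicts and returns rows in the same order.

```python
import copy
from functools import wraps


def record_changes(rule_id, log):
    """Hypothetical decorator: append one entry to `log` per changed field."""
    def decorator(func):
        @wraps(func)
        def wrapper(rows):
            # Snapshot the input first, because the step may mutate rows in place.
            before = copy.deepcopy(rows)
            after = func(rows)
            for index, (old_row, new_row) in enumerate(zip(before, after)):
                for field in sorted(old_row.keys() | new_row.keys()):
                    old, new = old_row.get(field), new_row.get(field)
                    if old != new:
                        log.append({
                            "rule_id": rule_id,
                            "row": index,
                            "field": field,
                            "old": old,
                            "new": new,
                        })
            return after
        return wrapper
    return decorator


changes = []


@record_changes("MASK_PII", changes)
def mask_email(rows):
    for row in rows:
        row["email"] = "***@***"
    return rows


mask_email([{"id": 1, "email": "test@mail.com"}])
print(changes)
# [{'rule_id': 'MASK_PII', 'row': 0, 'field': 'email', 'old': 'test@mail.com', 'new': '***@***'}]
```

The deep copy keeps the sketch short but doubles memory for the input; a real tool working on large data would more likely snapshot one chunk at a time.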

Schema Sentinel (Type Drift)

CleanCore catches silent killers in your data types and reports them as warnings such as:

  • [WARN] age: int -> str (Unexpected type swap)
  • [WARN] price: float -> NoneType (Null regression)
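A type-drift check of this kind can be sketched in a few lines of plain Python. The function below is a hypothetical illustration, not the Schema Sentinel itself: `detect_type_drift` is an invented name, and the message format is copied from the warnings above only so the output is easy to compare.

```python
def detect_type_drift(before_rows, after_rows):
    """Hypothetical check: compare value types field by field, row by row."""
    warnings = []
    for old_row, new_row in zip(before_rows, after_rows):
        for field, old_value in old_row.items():
            if field not in new_row:
                continue
            old_type, new_type = type(old_value), type(new_row[field])
            if old_type is new_type:
                continue
            # A value that became None is reported as a null regression.
            reason = "Null regression" if new_type is type(None) else "Unexpected type swap"
            message = f"[WARN] {field}: {old_type.__name__} -> {new_type.__name__} ({reason})"
            if message not in warnings:  # report each distinct drift once
                warnings.append(message)
    return warnings


before = [{"age": 30, "price": 9.99}]
after = [{"age": "30", "price": None}]
for line in detect_type_drift(before, after):
    print(line)
# [WARN] age: int -> str (Unexpected type swap)
# [WARN] price: float -> NoneType (Null regression)
```

The sketch compares exact types, so an int that becomes a float would also be flagged; whether that counts as drift is a policy choice.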

Contributing

CleanCore is open-source! GitHub Repository: https://github.com/Sidra-009/cleancore-python-library


License

MIT License


Author: Sidra Saqlain
GitHub: @Sidra-009
