Blazing fast hardware-accelerated tabular firewall and regulated sanitization engine.

LightningClean

Hardware-Accelerated Tabular Firewall and Regulated Low-Latency Data Sanitization Engine.

LightningClean is an enterprise-grade, high-performance Python package with a native C++ backend, designed to sanitize massive tabular datasets at bare-metal speeds. It uses AVX-512/AVX2 SIMD vectorization and OpenMP multithreading across cores to go beyond what Python-level execution allows, and it isolates and repairs structural data anomalies without copying memory.
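To make the "without memory copies" idea concrete, here is a minimal sketch that uses only Python's standard buffer protocol. It does not show LightningClean's internals or API. The `column` array is a hypothetical stand-in for one numeric column, and the point is only that a view can read and write an existing buffer without duplicating it.

```python
from array import array

# A contiguous buffer of C doubles, standing in for one numeric column.
column = array("d", [1.5, -2.0, 3.25, 4.0])

# memoryview wraps the existing buffer; no elements are copied.
view = memoryview(column)

# Slicing a memoryview yields another view onto the same memory.
window = view[1:3]

# Writing through the view changes the original buffer in place.
window[0] = 0.0

print(column.tolist())      # the second element is now 0.0
print(window.obj is column)  # the view still refers to the original array
```

A native extension can consume the same buffer through the C-level buffer interface, which is the usual way Python and C++ code share data without a copy.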


Key Architectural Capabilities

  1. CPUID Dynamic Dispatcher: Detects the host CPU's capabilities at runtime and selects the matching vectorized code path.
  2. True Zero-Copy Memory Linkage: Passes data pointers directly between Python and C++ so both sides work on the same buffers, saving system memory.
  3. Shield Mode Page Isolation: Wraps execution in sandboxed memory barriers so that segmentation faults are caught without terminating the process.
  4. Deterministic Audit Control: Removes run-to-run variance in parallel reductions to deliver bit-for-bit reproducible results across compliance audits.
  5. PII Masking Engine: Runs regex scans during string array extraction to mask protected data in native code.
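The reproducibility point in item 4 comes down to floating-point addition not being associative: if threads combine partial sums in a different order on each run, the last bits of the result can change. The sketch below is one common way to avoid that, written in plain Python with a thread pool. It is an illustration under assumptions, not LightningClean's C++ implementation, and `deterministic_sum` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor

# Floating-point addition is order-sensitive.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False


def deterministic_sum(values, n_chunks=4):
    """Sum fixed contiguous chunks in parallel, then combine them in index order."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor() as pool:
        # Executor.map returns results in submission order, regardless of
        # which worker finishes first.
        partials = list(pool.map(sum, chunks))
    total = 0.0
    for partial in partials:  # fixed combination order
        total += partial
    return total


data = [0.1, 0.2, 0.3] * 1000
print(deterministic_sum(data) == deterministic_sum(data))  # True
```

The chunk boundaries and the combination order depend only on the input length and `n_chunks`, so repeated runs with the same arguments produce the same bits. Changing `n_chunks` may change the result slightly, which is why the chunking policy has to be fixed as well.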

Installation

Standard Production Core

pip install lightningclean

Full Enterprise Web Extra

pip install "lightningclean[web]"
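After installing, it can be useful to confirm which version is present in the active environment. A minimal sketch using the standard library's `importlib.metadata` follows; `installed_version` is a hypothetical helper, not part of LightningClean.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("lightningclean"))
```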

Operational Code Example

import pandas as pd
import lightningclean as lc

# Load a large CSV dataset that needs cleaning
df = pd.read_csv("unstable_enterprise_dataset.csv")

# Single call with the compliance options enabled
clean_df = lc.clean(
    df,
    shield=True,
    deterministic=True,      # Bit-exact audit reproducibility
    pii_mode=True,           # Automatic masking of PII
    numa_aware=True          # Pins tasks to physical CPU sockets
)

# Read the summary report attached to the result
report = clean_df.attrs["shield_report"]
print(f"Sanitized: {report['cleaned_count']} | Quarantined: {report['corrupted_count']}")
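The `pii_mode=True` flag is described as regex-based masking. The package's actual patterns and placeholder are not documented here, so the following is only a rough pure-Python sketch of the general technique. The two patterns, the `[REDACTED]` placeholder, and the helper names `mask_pii` and `mask_column` are all assumptions made for illustration.

```python
import re

# Illustrative patterns only; real PII detection needs a broader, audited set.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_pii(text, placeholder="[REDACTED]"):
    """Replace email addresses and US SSN-shaped tokens with a placeholder."""
    text = EMAIL_RE.sub(placeholder, text)
    return SSN_RE.sub(placeholder, text)


def mask_column(values):
    """Mask string cells and pass non-string cells (e.g. None) through unchanged."""
    return [mask_pii(v) if isinstance(v, str) else v for v in values]


rows = ["contact: jane.doe@example.com", "ssn 123-45-6789", None, "no pii here"]
print(mask_column(rows))
# ['contact: [REDACTED]', 'ssn [REDACTED]', None, 'no pii here']
```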

Download files

Source Distribution

lightningclean-1.2.1.tar.gz (211.0 kB view details)

Uploaded Source

Built Distribution

lightningclean-1.2.1-py3-none-any.whl (211.6 kB view details)

Uploaded Python 3

File details

Details for the file lightningclean-1.2.1.tar.gz.

File metadata

  • Download URL: lightningclean-1.2.1.tar.gz
  • Upload date:
  • Size: 211.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for lightningclean-1.2.1.tar.gz
Algorithm Hash digest
SHA256 f59be83534c4a84314d1a08725f3ca8a468c537a8fce1f6fb793a6df2f4e7e16
MD5 08fd539968e5d42eb3114a6b61faca22
BLAKE2b-256 2d5dd2a9cf3a3b6e322d65f98ac36801c6e51d025533882f3ef738698956e6be

File details

Details for the file lightningclean-1.2.1-py3-none-any.whl.

File metadata

  • Download URL: lightningclean-1.2.1-py3-none-any.whl
  • Upload date:
  • Size: 211.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for lightningclean-1.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 2bef1fb9fd17b9b091a677152acba45dfbd457e12fa372a15e4bd7275e9949bf
MD5 908e4b7316dbea841956e1c473d29b6f
BLAKE2b-256 6c26a343a3c81f0e99403a9e76bdd927f6981b948ebb5c6ecc6001c7aca1d113
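The SHA256 digests listed above can be used to check a downloaded file before installing it. The sketch below uses only the standard library. The helper names `sha256_of_file` and `matches_published` are hypothetical, and the local path passed in is whatever location the file was saved to.

```python
import hashlib
import hmac

# SHA256 digests as listed in the file details above.
PUBLISHED_SHA256 = {
    "lightningclean-1.2.1.tar.gz":
        "f59be83534c4a84314d1a08725f3ca8a468c537a8fce1f6fb793a6df2f4e7e16",
    "lightningclean-1.2.1-py3-none-any.whl":
        "2bef1fb9fd17b9b091a677152acba45dfbd457e12fa372a15e4bd7275e9949bf",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size blocks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published(path, filename):
    """Return True if the file at `path` has the digest published for `filename`."""
    expected = PUBLISHED_SHA256[filename]
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, `matches_published("downloads/lightningclean-1.2.1.tar.gz", "lightningclean-1.2.1.tar.gz")` should return `True` for an intact download. pip's hash-checking mode (`--require-hashes` with a requirements file) performs the same check at install time.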
