Copy-pasteable data transformation primitives for PySpark. Inspired by shadcn-svelte.

Datacompose

Requires Python 3.8+. License: MIT.

A powerful data transformation framework for building reusable, composable data cleaning pipelines in PySpark.

Installation

pip install datacompose

What is Datacompose?

Datacompose provides production-ready PySpark data transformation primitives that become part of YOUR codebase. Inspired by shadcn's approach to components, we believe in giving you full ownership and control over your code.

Key Features

  • No Runtime Dependencies: Standalone PySpark code that runs without Datacompose
  • Composable Primitives: Build complex transformations from simple, reusable functions
  • Smart Partial Application: Pre-configure transformations with parameters for reuse
  • Optimized Operations: Efficient Spark transformations with minimal overhead
  • Comprehensive Libraries: Pre-built primitives for emails, addresses, and phone numbers
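As a rough illustration of what "composable primitives" and "partial application" mean here, the sketch below chains small single-purpose functions into one pipeline and pre-configures one of them with fixed parameters. It uses plain Python strings so it runs anywhere; in PySpark the same shape would apply to functions that take and return a Column. Every name in it (`compose`, `strip_whitespace`, `lowercase`, `replace_domain`, `fix_gmail_typo`, `clean_email`) is hypothetical and is not taken from the Datacompose API.

```python
from functools import partial, reduce
from typing import Callable

# Hypothetical type alias: a primitive takes one value and returns one value.
Transform = Callable[[str], str]


def compose(*steps: Transform) -> Transform:
    """Chain single-argument transforms, applied left to right."""
    def pipeline(value: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, value)
    return pipeline


def strip_whitespace(value: str) -> str:
    """Remove leading and trailing whitespace."""
    return value.strip()


def lowercase(value: str) -> str:
    """Convert the whole value to lower case."""
    return value.lower()


def replace_domain(value: str, old: str, new: str) -> str:
    """Swap a mistyped email domain for the intended one."""
    local, sep, domain = value.rpartition("@")
    if sep and domain == old:
        return f"{local}@{new}"
    return value


# Partial application: fix the parameters once, reuse the result as a primitive.
fix_gmail_typo = partial(replace_domain, old="gmial.com", new="gmail.com")

# Composition: build a larger transformation from the small ones.
clean_email = compose(strip_whitespace, lowercase, fix_gmail_typo)
```

Calling `clean_email("  Jane.Doe@GMIAL.com ")` trims, lower-cases, and corrects the domain in one step, and each piece stays independently testable and replaceable.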

Available Transformers

  • Emails: Validation, extraction, standardization, typo correction
  • Addresses: Street parsing, state/zip validation, PO Box detection
  • Phone Numbers: NANP/international validation, formatting, toll-free detection
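To make the phone-number item more concrete, here is a minimal sketch of NANP validation and toll-free detection. It is a simplified assumption of how such checks can be written, not Datacompose's implementation: it only enforces that the area code and exchange each start with 2-9, and it ignores finer rules such as reserved N11 codes. The helper names are hypothetical. The logic is shown with Python's `re` module so it runs without a Spark session; the same pattern string could in principle be passed to PySpark's `Column.rlike`.

```python
import re

# Simplified NANP shape: area code and exchange each start with 2-9,
# followed by a 4-digit line number (10 digits in total).
NANP_PATTERN = r"^[2-9][0-9]{2}[2-9][0-9]{2}[0-9]{4}$"

# Toll-free area codes currently in use in the NANP.
TOLL_FREE_AREA_CODES = {"800", "833", "844", "855", "866", "877", "888"}


def normalize_digits(raw: str) -> str:
    """Keep digits only and drop a leading country code 1 if present."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def is_valid_nanp(raw: str) -> bool:
    """Return True if the number matches the simplified NANP shape."""
    return re.match(NANP_PATTERN, normalize_digits(raw)) is not None


def is_toll_free(raw: str) -> bool:
    """Return True for a valid NANP number with a toll-free area code."""
    digits = normalize_digits(raw)
    return is_valid_nanp(raw) and digits[:3] in TOLL_FREE_AREA_CODES
```

Normalizing first and validating second keeps each step small enough to swap out, which matches the copy-and-modify approach described above.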

Documentation

For detailed documentation, examples, and API reference, visit datacompose.io.

Philosophy

This is NOT a traditional library - it gives you production-ready data transformation primitives that you can modify to fit your exact needs. You own the code, so there are no external dependencies to manage and no upstream breaking changes to worry about.

License

MIT License - see LICENSE file for details
