Copy-pasteable data transformation primitives for PySpark. Inspired by shadcn-svelte.

Datacompose

PyPI version | Python 3.8+ | Coverage | License: MIT

A data transformation framework for building reusable, composable data cleaning pipelines in PySpark.

Installation

pip install datacompose

What is Datacompose?

Datacompose provides production-ready PySpark data transformation primitives that become part of YOUR codebase. Inspired by shadcn's approach to components, we believe in giving you full ownership and control over your code.

Key Features

  • No Runtime Dependencies: Standalone PySpark code that runs without Datacompose
  • Composable Primitives: Build complex transformations from simple, reusable functions
  • Smart Partial Application: Pre-configure transformations with parameters for reuse
  • Optimized Operations: Efficient Spark transformations with minimal overhead
  • Comprehensive Libraries: Pre-built primitives for emails, addresses, and phone numbers
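The composition and partial-application ideas in the list above can be sketched in a few lines. This is a minimal illustration of the pattern only, not Datacompose's actual API: `compose`, `truncate`, `replace_chars`, and `clean_name` are hypothetical names, and plain Python strings stand in for Spark columns so the sketch runs without a Spark session.

```python
from functools import partial, reduce
from typing import Callable

# A transform takes one string and returns one string.
Transform = Callable[[str], str]


def compose(*steps: Transform) -> Transform:
    """Chain transforms left to right into a single callable (hypothetical helper)."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def truncate(value: str, max_length: int) -> str:
    """Keep at most max_length characters."""
    return value[:max_length]


def replace_chars(value: str, old: str, new: str) -> str:
    """Replace every occurrence of old with new."""
    return value.replace(old, new)


# Partial application: fix the parameters once, then reuse the configured transform.
truncate_20 = partial(truncate, max_length=20)
dashes_to_spaces = partial(replace_chars, old="-", new=" ")

# Composition: build one pipeline out of small, single-purpose steps.
clean_name = compose(str.strip, str.lower, dashes_to_spaces, truncate_20)

print(clean_name("  Mary-Ann SMITH  "))  # mary ann smith
```

In a PySpark setting the same shape would presumably apply to column expressions rather than strings; consult the generated code or datacompose.io for the real function names.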

Available Transformers

  • Emails: Validation, extraction, standardization, typo correction
  • Addresses: Street parsing, state/zip validation, PO Box detection
  • Phone Numbers: NANP/international validation, formatting, toll-free detection
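To make a few of the checks named above concrete, here is a rough sketch of NANP validation, toll-free detection, and email standardization. It is plain Python rather than PySpark, and every name in it (`normalize_nanp`, `is_toll_free`, `standardize_email`, and the patterns) is an assumption made for illustration, not part of the Datacompose primitives. The email pattern is deliberately loose.

```python
import re
from typing import Optional

# Toll-free area codes in the North American Numbering Plan.
TOLL_FREE_AREA_CODES = {"800", "833", "844", "855", "866", "877", "888"}

# NANP rule: the area code and the exchange each start with a digit 2-9.
NANP_PATTERN = re.compile(r"^[2-9]\d{2}[2-9]\d{6}$")

# Loose shape check only: something@something.tld, no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[a-z]{2,}$")


def normalize_nanp(raw: str) -> Optional[str]:
    """Return the 10-digit NANP number, or None if the input is not valid."""
    digits = re.sub(r"\D", "", raw)  # drop punctuation and spaces
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the country code
    return digits if NANP_PATTERN.match(digits) else None


def is_toll_free(raw: str) -> bool:
    """True when the input is a valid NANP number with a toll-free area code."""
    digits = normalize_nanp(raw)
    return digits is not None and digits[:3] in TOLL_FREE_AREA_CODES


def standardize_email(raw: str) -> Optional[str]:
    """Trim and lowercase an email, or return None if it fails the shape check."""
    candidate = raw.strip().lower()
    return candidate if EMAIL_PATTERN.match(candidate) else None


print(normalize_nanp("+1 (800) 555-0199"))           # 8005550199
print(is_toll_free("+1 (800) 555-0199"))             # True
print(standardize_email("  John.Doe@Example.COM "))  # john.doe@example.com
```

Each function does one small job, which is what makes primitives of this kind easy to copy into a codebase and adjust.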

Documentation

For detailed documentation, examples, and API reference, visit datacompose.io.

Philosophy

This is NOT a traditional library. It gives you production-ready data transformation primitives that you can modify to fit your exact needs. You own the code, so there are no external dependencies to manage and no breaking changes to worry about.

License

MIT License. See the LICENSE file for details.
