Copy-pasteable data transformation primitives for PySpark. Inspired by shadcn-svelte.
Datacompose
A powerful data transformation framework for building reusable, composable data cleaning pipelines in PySpark.
Installation
pip install datacompose
What is Datacompose?
Datacompose provides production-ready PySpark data transformation primitives that become part of YOUR codebase. Inspired by shadcn's approach to components, we believe in giving you full ownership and control over your code.
Key Features
- No Runtime Dependencies: Standalone PySpark code that runs without Datacompose
- Composable Primitives: Build complex transformations from simple, reusable functions
- Smart Partial Application: Pre-configure transformations with parameters for reuse
- Optimized Operations: Efficient Spark transformations with minimal overhead
- Comprehensive Libraries: Pre-built primitives for emails, addresses, and phone numbers
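The composition and partial-application ideas in the list above can be illustrated with a minimal sketch in plain Python. It does not use Datacompose's API or Spark: `compose`, `strip_whitespace`, `lowercase`, and `truncate` are hypothetical names chosen for illustration, and the same pattern would presumably carry over to functions that take and return PySpark columns.

```python
from functools import partial, reduce
from typing import Callable

# A transform takes one string and returns one string (assumption for this sketch).
Transform = Callable[[str], str]


def compose(*steps: Transform) -> Transform:
    """Chain single-argument transforms, applying them left to right."""
    def pipeline(value: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, value)
    return pipeline


# Hypothetical primitives, not taken from Datacompose.
def strip_whitespace(value: str) -> str:
    return value.strip()


def lowercase(value: str) -> str:
    return value.lower()


def truncate(value: str, max_length: int) -> str:
    return value[:max_length]


# Partial application: fix max_length once, then reuse the configured transform.
truncate_to_10 = partial(truncate, max_length=10)

# Build a larger transformation out of the small ones.
clean_text = compose(strip_whitespace, lowercase, truncate_to_10)

print(clean_text("  Hello, PySpark World!  "))  # prints "hello, pys"
```

Because each primitive is an ordinary function living in your own codebase, any step can be edited or swapped without touching the others.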
Available Transformers
- Emails: Validation, extraction, standardization, typo correction
- Addresses: Street parsing, state/zip validation, PO Box detection
- Phone Numbers: NANP/international validation, formatting, toll-free detection
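To give a feel for what a phone-number primitive might check, here is a rough sketch in plain Python of NANP validation, formatting, and toll-free detection. It is not Datacompose's implementation: the function names are hypothetical, and the rules are the general NANP ones (ten digits, with the area code and the exchange each starting with 2-9). In a Spark pipeline the same checks would more likely be written with column functions such as `regexp_replace`.

```python
import re
from typing import Optional

# Toll-free area codes currently in use in the NANP.
TOLL_FREE_AREA_CODES = {"800", "833", "844", "855", "866", "877", "888"}


def normalize_nanp(raw: str) -> Optional[str]:
    """Return the 10-digit NANP number, or None if the input is not valid."""
    digits = re.sub(r"\D", "", raw)  # drop everything except digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the leading country code
    if len(digits) != 10:
        return None
    # Area code (index 0) and exchange (index 3) must both start with 2-9.
    if digits[0] in "01" or digits[3] in "01":
        return None
    return digits


def format_nanp(raw: str) -> Optional[str]:
    """Format a valid number as (NXX) NXX-XXXX, or return None."""
    digits = normalize_nanp(raw)
    if digits is None:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def is_toll_free(raw: str) -> bool:
    """True when the number is valid and its area code is toll-free."""
    digits = normalize_nanp(raw)
    return digits is not None and digits[:3] in TOLL_FREE_AREA_CODES


print(format_nanp("+1 (800) 555-0199"))  # prints "(800) 555-0199"
print(is_toll_free("+1 (800) 555-0199"))  # prints "True"
```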
Documentation
For detailed documentation, examples, and API reference, visit datacompose.io.
Philosophy
Datacompose is NOT a traditional library: it gives you production-ready data transformation primitives that you can modify to fit your exact needs. You own the code, so there are no external dependencies to manage and no upstream breaking changes to worry about.
License
MIT License - see LICENSE file for details
Download files
File details
Details for the file datacompose-0.2.6.0.tar.gz.
File metadata
- Download URL: datacompose-0.2.6.0.tar.gz
- Upload date:
- Size: 148.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | db5e27298a760efa92b66f986fabd77d110618001e6bee8b40df69af9856129f |
| MD5 | 2762e183cdfa04ccd3123f8b81eea27b |
| BLAKE2b-256 | aed999addca68ea1f6ed9595bb9de43d412586410784c8722704702274c33862 |
File details
Details for the file datacompose-0.2.6.0-py3-none-any.whl.
File metadata
- Download URL: datacompose-0.2.6.0-py3-none-any.whl
- Upload date:
- Size: 54.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f6e79b5ef054e0c9c658c3efd220c29f0ce5151da3d09ec3d1e0ade34d1f5281 |
| MD5 | 2e258387343e89b811b893e65d6ed0ff |
| BLAKE2b-256 | 02e67f0c8a85137db8f08281c9927aea4375adf06e7545311876cf37cfd97ed2 |