A library for reading and writing partitioned data
Partitioneer
Partitioneer is a Python library that provides utilities for managing data files in a date-partitioned format. It offers functions for writing data to partitions, reading data from partitions with filtering capabilities, and retrieving partition date information.
Installation
You can install Partitioneer using pip:
pip install partitioneer
Usage
Writing Data to Partitions
To write data to partitioned Parquet files:
from partitioneer import write_data_to_partitions
import pandas as pd

df = pd.DataFrame(...)  # Your data
write_data_to_partitions(
    df,
    base_path="/path/to/data",
    date_col="date_column",
    override_existing=False
)
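The README does not say how partitions are laid out on disk. As a rough illustration of what date partitioning means, the sketch below groups records by day and derives one path per date. The `date=YYYY-MM-DD` directory naming, the `data.parquet` file name, and the helpers `partition_path` and `group_rows_by_date` are assumptions made for illustration, not Partitioneer's actual layout or API.

```python
from collections import defaultdict
from datetime import date
from pathlib import PurePosixPath


def partition_path(base_path: str, day: date) -> PurePosixPath:
    """Return the hypothetical file path for one day's partition."""
    # Assumed Hive-style layout: <base_path>/date=YYYY-MM-DD/data.parquet
    return PurePosixPath(base_path) / f"date={day.isoformat()}" / "data.parquet"


def group_rows_by_date(rows: list[dict], date_col: str) -> dict[date, list[dict]]:
    """Bucket rows by the value of their date column."""
    groups: dict[date, list[dict]] = defaultdict(list)
    for row in rows:
        groups[row[date_col]].append(row)
    return dict(groups)


rows = [
    {"date_column": date(2024, 1, 1), "value": 10},
    {"date_column": date(2024, 1, 1), "value": 20},
    {"date_column": date(2024, 1, 2), "value": 30},
]
groups = group_rows_by_date(rows, "date_column")
# One target path per distinct date found in the data.
paths = {day: partition_path("/path/to/data", day) for day in groups}
```

Under these assumptions, each distinct date in `date_column` maps to exactly one partition path, which is what lets a reader skip whole dates without opening their files.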
Reading Data from Partitions
To read data from partitioned Parquet files:
from partitioneer import read_data_from_partitions, PartitionFilter

df = read_data_from_partitions(
    base_path="/path/to/data",
    filters=[
        PartitionFilter("category", "in", ["A", "B"]),
        PartitionFilter("value", "greater_than", 100)
    ],
    add_partition_date=True,
    start_date="2024-01-01",
    end_date="2024-12-31"
)
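To make the filter arguments concrete, here is a plain-Python sketch of how a (column, operator, value) triple might be evaluated against rows. `RowFilter`, `OPERATORS`, and `matches` are hypothetical names. Only the `in` and `greater_than` operators appear in the example above, and it is an assumption that Partitioneer combines multiple filters with a logical AND as done here.

```python
import operator
from typing import Any, Callable, NamedTuple


class RowFilter(NamedTuple):
    """Hypothetical stand-in mirroring PartitionFilter's three positional arguments."""
    column: str
    op: str
    value: Any


# Only the two operators used in the README example are sketched here.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "in": lambda cell, allowed: cell in allowed,
    "greater_than": operator.gt,
}


def matches(row: dict, filters: list[RowFilter]) -> bool:
    """Return True when the row satisfies every filter (assumed logical AND)."""
    return all(OPERATORS[f.op](row[f.column], f.value) for f in filters)


filters = [
    RowFilter("category", "in", ["A", "B"]),
    RowFilter("value", "greater_than", 100),
]
rows = [
    {"category": "A", "value": 150},
    {"category": "B", "value": 50},
    {"category": "C", "value": 500},
]
selected = [row for row in rows if matches(row, filters)]
```

With these sample rows, only the first one passes both filters: the second fails `greater_than` and the third fails `in`.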
Getting Partition Date Information
To get the latest or first partition date:
from partitioneer import get_latest_partition_date, get_first_partition_date
latest_date = get_latest_partition_date("/path/to/data")
first_date = get_first_partition_date("/path/to/data")
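For intuition, the first and latest partition dates could be found by scanning directory names under the base path. The sketch below is not the library's implementation: the `date=YYYY-MM-DD` directory naming and the `list_partition_dates` helper are assumptions for illustration only.

```python
import tempfile
from datetime import date
from pathlib import Path


def list_partition_dates(base_path: str) -> list[date]:
    """Parse dates from directory names of the assumed form date=YYYY-MM-DD."""
    found = []
    for entry in Path(base_path).iterdir():
        if entry.is_dir() and entry.name.startswith("date="):
            try:
                found.append(date.fromisoformat(entry.name[len("date="):]))
            except ValueError:
                continue  # skip directories whose suffix is not a valid ISO date
    return sorted(found)


# Build a throwaway directory tree to exercise the helper.
with tempfile.TemporaryDirectory() as base:
    for name in ("date=2024-03-01", "date=2024-01-15", "date=2024-02-10", "_tmp"):
        (Path(base) / name).mkdir()
    dates = list_partition_dates(base)

# After sorting, the ends of the list are the first and latest partitions.
first_date, latest_date = dates[0], dates[-1]
```

Sorting parsed `date` objects rather than raw directory names keeps the ordering correct even if a different, non-sortable naming scheme were used.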
Build Instructions
To build the package:
python setup.py sdist bdist_wheel
To upload to PyPI:
pip install twine
twine upload dist/*
Automated build and publish script:
python setup.py sdist bdist_wheel
pip install twine
twine upload dist/* --password <add_pypi_token_here>
rm -r ./build
rm -r ./dist
rm -r ./partitioneer.egg-info
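The `rm -r` cleanup commands report an error when a directory is missing, for example after a partial build. A minimal Python sketch of a tolerant cleanup follows. `clean_build_artifacts` is a hypothetical helper, not part of the project, and the directory names are copied from the commands above.

```python
import shutil
import tempfile
from pathlib import Path

# Directory names taken from the rm commands in the build script.
ARTIFACT_DIRS = ("build", "dist", "partitioneer.egg-info")


def clean_build_artifacts(project_root: str) -> list[str]:
    """Delete the build output directories that exist and return their names."""
    removed = []
    for name in ARTIFACT_DIRS:
        target = Path(project_root) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed


# Demonstrate on a temporary directory rather than a real checkout.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "build").mkdir()
    (Path(root) / "dist").mkdir()
    removed = clean_build_artifacts(root)
    leftover = sorted(p.name for p in Path(root).iterdir())
```

Here the missing `partitioneer.egg-info` directory is skipped silently, and the returned list records what was actually deleted.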
License
Download files
File details
Details for the file partitioneer-0.2.13.tar.gz.
File metadata
- Download URL: partitioneer-0.2.13.tar.gz
- Upload date:
- Size: 7.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6cd23fcc6a958bb6ffd2a2b20cf199111f2049c16dc437a69ee58527180d44be |
| MD5 | 4c9b06d9d342fc45c6b7f7ed64d9615b |
| BLAKE2b-256 | 8ba201cd33f9e34583949fd0d25ffe54eb37877d1f7577550eb91a97e8bdff4e |
File details
Details for the file partitioneer-0.2.13-py3-none-any.whl.
File metadata
- Download URL: partitioneer-0.2.13-py3-none-any.whl
- Upload date:
- Size: 6.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a1838208e573d7d26326a1bf7e862db3e6c854ac2fca8165e855116041c9edc2 |
| MD5 | ad43edb4325c0f73a7f927af0226d9c3 |
| BLAKE2b-256 | fd329341b31c398cd1e82a66c065c2bb55d2e995dd3eaea54c8e549049bc14a6 |