A library for reading and writing partitioned data
Partitioneer
Partitioneer is a Python library that provides utilities for managing data files in a date-partitioned format. It offers functions for writing data to partitions, reading data from partitions with filtering capabilities, and retrieving partition date information.
Installation
You can install Partitioneer using pip:
pip install partitioneer
Usage
Writing Data to Partitions
To write data to partitioned Parquet files:
from partitioneer import write_data_to_partitions
import pandas as pd
df = pd.DataFrame(...) # Your data
write_data_to_partitions(
    df,
    base_path="/path/to/data",
    date_col="date_column",
    override_existing=False
)
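To make the idea of date partitioning concrete, here is a minimal standard-library sketch of how rows might be grouped by the value of a date column and mapped to one directory per day. It is not Partitioneer's implementation: the `date=YYYY-MM-DD` directory naming, the `data.parquet` file name, and the helper `group_rows_by_partition` are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date
from pathlib import PurePosixPath


def group_rows_by_partition(rows, base_path, date_col):
    """Group rows by their date and map each group to a partition file path.

    Hypothetical helper: the layout below is assumed, not taken from Partitioneer.
    """
    partitions = defaultdict(list)
    for row in rows:
        day = row[date_col]
        # Assumed layout: <base_path>/date=YYYY-MM-DD/data.parquet
        path = PurePosixPath(base_path) / f"date={day.isoformat()}" / "data.parquet"
        partitions[str(path)].append(row)
    return dict(partitions)


rows = [
    {"date_column": date(2024, 1, 1), "value": 10},
    {"date_column": date(2024, 1, 1), "value": 20},
    {"date_column": date(2024, 1, 2), "value": 30},
]
layout = group_rows_by_partition(rows, "/path/to/data", "date_column")
for path, group in sorted(layout.items()):
    print(path, len(group))
```

Each distinct date ends up with its own path, so a reader can later skip whole directories that fall outside a requested date range.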
Reading Data from Partitions
To read data from partitioned Parquet files:
from partitioneer import read_data_from_partitions, PartitionFilter
df = read_data_from_partitions(
    base_path="/path/to/data",
    filters=[
        PartitionFilter("category", "in", ["A", "B"]),
        PartitionFilter("value", "greater_than", 100)
    ],
    add_partition_date=True,
    start_date="2024-01-01",
    end_date="2024-12-31"
)
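Each filter in the call above is a (column, operator, value) triple. The following sketch shows one way such triples could be evaluated against rows. `RowFilter` and `matches` are hypothetical stand-ins, not part of Partitioneer. Only the two operators that appear in the example (`"in"` and `"greater_than"`) are modeled, and combining filters with AND is an assumption.

```python
from typing import Any, NamedTuple


class RowFilter(NamedTuple):
    """Hypothetical stand-in for a (column, operator, value) filter."""
    column: str
    operator: str
    value: Any


def matches(row, filters):
    """Return True if the row satisfies every filter (AND-ed, by assumption)."""
    for f in filters:
        cell = row[f.column]
        if f.operator == "in":
            ok = cell in f.value
        elif f.operator == "greater_than":
            ok = cell > f.value
        else:
            # Other operators may exist in the library; they are not modeled here.
            raise ValueError(f"operator not modeled in this sketch: {f.operator}")
        if not ok:
            return False
    return True


filters = [
    RowFilter("category", "in", ["A", "B"]),
    RowFilter("value", "greater_than", 100),
]
rows = [
    {"category": "A", "value": 150},
    {"category": "B", "value": 50},
    {"category": "C", "value": 500},
]
selected = [row for row in rows if matches(row, filters)]
print(selected)
```

Under these assumptions, only the first row passes both filters.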
Getting Partition Date Information
To get the latest or first partition date:
from partitioneer import get_latest_partition_date, get_first_partition_date
latest_date = get_latest_partition_date("/path/to/data")
first_date = get_first_partition_date("/path/to/data")
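As a rough illustration of what looking up the first and latest partition dates involves, the sketch below scans a directory for date-named subdirectories and sorts them. It assumes a `date=YYYY-MM-DD` naming scheme and uses a hypothetical helper, `list_partition_dates`. The layout Partitioneer actually uses, and the type its functions return, are not described here.

```python
import tempfile
from datetime import date
from pathlib import Path


def list_partition_dates(base_path):
    """Return the sorted dates of all `date=YYYY-MM-DD` subdirectories (assumed naming)."""
    dates = []
    for entry in Path(base_path).iterdir():
        if entry.is_dir() and entry.name.startswith("date="):
            try:
                dates.append(date.fromisoformat(entry.name[len("date="):]))
            except ValueError:
                continue  # skip directories whose suffix is not a valid date
    return sorted(dates)


# Build a throwaway directory tree to run the helper against.
with tempfile.TemporaryDirectory() as tmp:
    for day in ("2024-03-01", "2024-01-15", "2024-02-10"):
        (Path(tmp) / f"date={day}").mkdir()
    (Path(tmp) / "date=not-a-date").mkdir()
    found = list_partition_dates(tmp)

first_date, latest_date = found[0], found[-1]
print(first_date, latest_date)
```

Parsing the directory names into `date` objects before sorting keeps the ordering correct even if the naming scheme is later changed to a format that does not sort lexicographically.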
Build Instructions
To build the package:
python setup.py sdist bdist_wheel
To upload to PyPI:
pip install twine
twine upload dist/*
Automated build and publish script:
python setup.py sdist bdist_wheel
pip install twine
twine upload dist/* --username __token__ --password <add_pypi_token_here>
rm -r ./build
rm -r ./dist
rm -r ./partitioneer.egg-info
License