parq-tools
Overview
parq-tools is a collection of utilities for efficiently working with large-scale Parquet datasets.
A typical use case is asset-based workflows with large scientific datasets.
:::note
If your datasets are not large, you might find the pandas library more convenient.
:::
Features
- Filtering → Efficiently filters large Parquet files.
- Concatenation → Combines multiple Parquet files efficiently along rows (`axis=0`) or columns (`axis=1`).
- Tokenized Filtering → Converts pandas-style expressions into efficient PyArrow queries.
- Profiling Enhancements → Improves `ydata-profiling` by profiling specific columns incrementally and merging the results for large files.
- Block Model Generation → Creates a Parquet block model that exceeds the machine's memory capacity, which is useful for testing pipelines.
Download files
Source Distribution
parq_tools-0.2.1.tar.gz (15.8 kB)
Built Distribution
parq_tools-0.2.1-py3-none-any.whl (20.7 kB)
File details
Details for the file parq_tools-0.2.1.tar.gz.
File metadata
- Download URL: parq_tools-0.2.1.tar.gz
- Upload date:
- Size: 15.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8492a0a14c324f9d96292b0a910629e4d74de7d208431e0fdac9a748452aeeb8 |
| MD5 | d407d1f33e2c68c565361ea058f1352f |
| BLAKE2b-256 | ad6f7f9653970fc03df57b13c320a9995aaab543b64ac1957aab56fbe61fb0c5 |
File details
Details for the file parq_tools-0.2.1-py3-none-any.whl.
File metadata
- Download URL: parq_tools-0.2.1-py3-none-any.whl
- Upload date:
- Size: 20.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 772d94bff5f93e641b5fb5d025c4d264d20184ac806b84a5859f544ea879153d |
| MD5 | bea69a070fd1e86fc24d06196d92785e |
| BLAKE2b-256 | 8fef940fcf7c70c001d7c6726b18d61d839430f135cc2c8e4a75e02a11c85008 |
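
The SHA256 digests listed for each file can be used to check a downloaded archive before installing it. The following is a small sketch using only the Python standard library; the helper names `sha256_of` and `verify_download` are illustrative and are not part of parq-tools or pip.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Compare a file's digest with the published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

For example, `verify_download("parq_tools-0.2.1.tar.gz", "8492a0a14c324f9d96292b0a910629e4d74de7d208431e0fdac9a748452aeeb8")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.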