A simple command-line interface & Python API for parquet

Parquet-Py

Parquet-Py is a simple command-line interface and Python API for working with Parquet files. It converts Parquet files into CSV, JSON, Python lists, and iterators so the data is easy to access and manipulate in Python applications.

Using Rust bindings under the hood, Parquet-Py provides a fast and efficient way to work with Parquet files, making it ideal for converting or processing large datasets.

Features

  • Convert Parquet to CSV: Convert your Parquet files into CSV format for easy viewing and processing in spreadsheet applications.
  • Convert Parquet to JSON / JSON Lines: Easily convert your Parquet files into a JSON Array or JSON Lines format for quick inspection or processing.
  • Iterable Parquet Rows: Access Parquet file rows through an iterator, allowing for efficient row-by-row processing without loading the entire file into memory.
  • Convert Parquet to Python List: Transform your Parquet files into Python lists, where each row is represented as a dictionary within the list.

Installation

PyPI

pip install parquet-py

Usage

Command-Line Interface

[!WARNING]
The CLI is still under development and may not be fully functional. Breaking changes may occur in future releases.

[!TIP]
Multiple input files can be specified by repeating the --input option, for example: --input file1.parquet --input file2.parquet.

Converting Parquet to CSV

To convert a Parquet file into a CSV file, use the parq convert command.

parq convert --input path/to/your/file.parquet --format csv --output example.csv
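
When the conversion is driven from a script, the same command can be assembled as an argument list. The sketch below only builds that list from the flags shown above (--input, --format, --output). The helper name build_convert_args is hypothetical and not part of Parquet-Py, and because the CLI is still under development the flags themselves may change.

```python
from typing import List, Sequence


def build_convert_args(inputs: Sequence[str], fmt: str, output: str) -> List[str]:
    """Build the argument list for `parq convert` (hypothetical helper)."""
    if not inputs:
        raise ValueError("at least one input file is required")
    args = ["parq", "convert"]
    for path in inputs:
        # Repeat --input once per file, as described in the tip above.
        args += ["--input", path]
    args += ["--format", fmt, "--output", output]
    return args


args = build_convert_args(["file1.parquet", "file2.parquet"], "csv", "example.csv")
print(" ".join(args))
```

On a machine where parq is installed, the resulting list could be passed to subprocess.run(args, check=True).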

Converting Parquet to JSON Array

To convert a Parquet file into a JSON Array, use the parq convert command.

parq convert --input path/to/your/file.parquet --format json --output example.json

Converting Parquet to JSON Lines

To convert a Parquet file into JSON Lines, use the parq convert command.

parq convert --input path/to/your/file.parquet --format jsonl --output example.jsonl
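
A JSON Lines file holds one JSON value per line, so it can be read back one record at a time. The following is a minimal sketch using only the standard library. The function name parse_jsonl and the sample records are illustrative assumptions, and the real contents of example.jsonl depend on your data.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def parse_jsonl(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed object per non-empty line of JSON Lines text."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, such as a trailing newline
            yield json.loads(line)


# Stand-in for the lines of example.jsonl (made-up records).
sample = ['{"id": 1, "name": "a"}\n', '{"id": 2, "name": "b"}\n']
rows = list(parse_jsonl(sample))
print(rows[0]["name"])
```

An open file object is also an iterable of lines, so parse_jsonl(open("example.jsonl", encoding="utf-8")) would process the converted file without reading it all at once.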

Python

Iterating Over Parquet Rows

To iterate over the rows of a Parquet file, use the to_iter function. This allows for efficient row-by-row processing without loading the entire file into memory.

from parq import to_iter

# Path to your Parquet file
file_path = "path/to/your/file.parquet"

# Iterate over Parquet rows
for row in to_iter(file_path):
    print(row)
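
Because rows arrive one at a time, a preview of a large file can stop early instead of reading everything. The sketch below illustrates this with a stand-in generator in place of to_iter, so it runs without a Parquet file. It assumes each row is a dictionary, as with to_list; the helper head and the generator fake_rows are hypothetical.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List


def head(rows: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Return the first n rows, pulling no more than n items from the iterable."""
    return list(islice(rows, n))


def fake_rows() -> Iterator[Dict[str, Any]]:
    # Stand-in generator; with Parquet-Py this would be to_iter(file_path).
    for i in range(1_000_000):
        yield {"id": i}


preview = head(fake_rows(), 3)
print(preview)
```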

Converting Parquet to CSV String

To convert a Parquet file into a CSV string, use the to_csv_str function.

from parq import to_csv_str

# Path to your Parquet file
file_path = "path/to/your/file.parquet"

# Convert to CSV string
csv_str = to_csv_str(file_path)
print(csv_str)
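
Once the CSV text is in memory, the standard csv module can turn it back into records. This is a sketch under two assumptions: that the string starts with a header row, and that the sample text below resembles what to_csv_str returns for your file. The helper name parse_csv_str is hypothetical.

```python
import csv
import io
from typing import Dict, List


def parse_csv_str(csv_str: str) -> List[Dict[str, str]]:
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_str)))


# Made-up CSV text standing in for the result of to_csv_str(file_path).
sample = "id,name\n1,a\n2,b\n"
records = parse_csv_str(sample)
print(records[1]["name"])
```

Note that csv.DictReader returns every value as a string, so numeric columns need an explicit conversion.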

Converting Parquet to JSON String

To convert a Parquet file into a JSON string, use the to_json_str function.

from parq import to_json_str

# Path to your Parquet file
file_path = "path/to/your/file.parquet"

# Convert to JSON string
json_str = to_json_str(file_path)
print(json_str)
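
The JSON string can be decoded with the standard json module. The sketch below assumes that to_json_str returns a JSON array with one object per row, matching the JSON Array format of the CLI; that assumption is checked explicitly rather than taken for granted. The helper name parse_json_str and the sample text are hypothetical.

```python
import json
from typing import Any, Dict, List


def parse_json_str(json_str: str) -> List[Dict[str, Any]]:
    """Decode a JSON array of row objects, failing loudly on any other shape."""
    data = json.loads(json_str)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of rows")
    return data


# Made-up JSON text standing in for the result of to_json_str(file_path).
sample = '[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]'
decoded = parse_json_str(sample)
print(len(decoded))
```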

Converting Parquet to Python List

To convert a Parquet file into a Python list, where each row is represented as a dictionary within the list, use the to_list function.

from parq import to_list

# Path to your Parquet file
file_path = "path/to/your/file.parquet"

# Convert to Python list
data_list = to_list(file_path)
print(len(data_list))
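
Since each row is a dictionary, ordinary list and dict operations apply to the result. As a small illustration, the sketch below pulls a single column out of such a list. The helper name column and the sample rows are hypothetical stand-ins for the value returned by to_list.

```python
from typing import Any, Dict, List


def column(rows: List[Dict[str, Any]], name: str) -> List[Any]:
    """Collect one field from every row; rows lacking the field give None."""
    return [row.get(name) for row in rows]


# Made-up rows standing in for the result of to_list(file_path).
sample_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
names = column(sample_rows, "name")
print(names)
```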

Download files

Source Distribution

parquet_py-0.2.1b0.tar.gz (15.7 kB view hashes)

Uploaded Source

Built Distributions

parquet_py-0.2.1b0-pp310-pypy310_pp73-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-pp310-pypy310_pp73-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-pp310-pypy310_pp73-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-pp310-pypy310_pp73-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ x86-64

parquet_py-0.2.1b0-pp310-pypy310_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-pp310-pypy310_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-pp310-pypy310_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.whl (1.7 MB view hashes)

Uploaded PyPy manylinux: glibc 2.5+ i686

parquet_py-0.2.1b0-pp39-pypy39_pp73-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-pp39-pypy39_pp73-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-pp39-pypy39_pp73-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-pp39-pypy39_pp73-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ x86-64

parquet_py-0.2.1b0-pp39-pypy39_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-pp39-pypy39_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-pp39-pypy39_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.whl (1.7 MB view hashes)

Uploaded PyPy manylinux: glibc 2.5+ i686

parquet_py-0.2.1b0-pp38-pypy38_pp73-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-pp38-pypy38_pp73-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-pp38-pypy38_pp73-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-pp38-pypy38_pp73-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded PyPy musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-pp38-pypy38_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-pp38-pypy38_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-pp38-pypy38_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded PyPy manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-cp312-none-win_amd64.whl (1.3 MB view hashes)

Uploaded CPython 3.12 Windows x86-64

parquet_py-0.2.1b0-cp312-none-win32.whl (1.2 MB view hashes)

Uploaded CPython 3.12 Windows x86

parquet_py-0.2.1b0-cp312-cp312-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded CPython 3.12 musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-cp312-cp312-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded CPython 3.12 musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-cp312-cp312-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded CPython 3.12 musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-cp312-cp312-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded CPython 3.12 musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.12 manylinux: glibc 2.17+ x86-64

parquet_py-0.2.1b0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded CPython 3.12 manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded CPython 3.12 manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded CPython 3.12 manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded CPython 3.12 manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.12 manylinux: glibc 2.5+ i686

parquet_py-0.2.1b0-cp312-cp312-macosx_11_0_arm64.whl (1.5 MB view hashes)

Uploaded CPython 3.12 macOS 11.0+ ARM64

parquet_py-0.2.1b0-cp312-cp312-macosx_10_12_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.12 macOS 10.12+ x86-64

parquet_py-0.2.1b0-cp311-none-win_amd64.whl (1.3 MB view hashes)

Uploaded CPython 3.11 Windows x86-64

parquet_py-0.2.1b0-cp311-none-win32.whl (1.2 MB view hashes)

Uploaded CPython 3.11 Windows x86

parquet_py-0.2.1b0-cp311-cp311-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded CPython 3.11 musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-cp311-cp311-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded CPython 3.11 musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-cp311-cp311-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded CPython 3.11 musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-cp311-cp311-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded CPython 3.11 musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64

parquet_py-0.2.1b0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded CPython 3.11 manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded CPython 3.11 manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded CPython 3.11 manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded CPython 3.11 manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.11 manylinux: glibc 2.5+ i686

parquet_py-0.2.1b0-cp311-cp311-macosx_11_0_arm64.whl (1.5 MB view hashes)

Uploaded CPython 3.11 macOS 11.0+ ARM64

parquet_py-0.2.1b0-cp311-cp311-macosx_10_12_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.11 macOS 10.12+ x86-64

parquet_py-0.2.1b0-cp310-none-win_amd64.whl (1.3 MB view hashes)

Uploaded CPython 3.10 Windows x86-64

parquet_py-0.2.1b0-cp310-none-win32.whl (1.3 MB view hashes)

Uploaded CPython 3.10 Windows x86

parquet_py-0.2.1b0-cp310-cp310-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded CPython 3.10 musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-cp310-cp310-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded CPython 3.10 musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-cp310-cp310-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded CPython 3.10 musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-cp310-cp310-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded CPython 3.10 musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

parquet_py-0.2.1b0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.5+ i686

parquet_py-0.2.1b0-cp310-cp310-macosx_11_0_arm64.whl (1.5 MB view hashes)

Uploaded CPython 3.10 macOS 11.0+ ARM64

parquet_py-0.2.1b0-cp39-none-win_amd64.whl (1.3 MB view hashes)

Uploaded CPython 3.9 Windows x86-64

parquet_py-0.2.1b0-cp39-none-win32.whl (1.3 MB view hashes)

Uploaded CPython 3.9 Windows x86

parquet_py-0.2.1b0-cp39-cp39-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded CPython 3.9 musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-cp39-cp39-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded CPython 3.9 musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-cp39-cp39-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded CPython 3.9 musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-cp39-cp39-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded CPython 3.9 musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

parquet_py-0.2.1b0-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.5+ i686

parquet_py-0.2.1b0-cp39-cp39-macosx_11_0_arm64.whl (1.5 MB view hashes)

Uploaded CPython 3.9 macOS 11.0+ ARM64

parquet_py-0.2.1b0-cp38-none-win_amd64.whl (1.3 MB view hashes)

Uploaded CPython 3.8 Windows x86-64

parquet_py-0.2.1b0-cp38-none-win32.whl (1.3 MB view hashes)

Uploaded CPython 3.8 Windows x86

parquet_py-0.2.1b0-cp38-cp38-musllinux_1_2_x86_64.whl (1.8 MB view hashes)

Uploaded CPython 3.8 musllinux: musl 1.2+ x86-64

parquet_py-0.2.1b0-cp38-cp38-musllinux_1_2_i686.whl (1.8 MB view hashes)

Uploaded CPython 3.8 musllinux: musl 1.2+ i686

parquet_py-0.2.1b0-cp38-cp38-musllinux_1_2_armv7l.whl (1.9 MB view hashes)

Uploaded CPython 3.8 musllinux: musl 1.2+ ARMv7l

parquet_py-0.2.1b0-cp38-cp38-musllinux_1_2_aarch64.whl (1.8 MB view hashes)

Uploaded CPython 3.8 musllinux: musl 1.2+ ARM64

parquet_py-0.2.1b0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

parquet_py-0.2.1b0-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.8 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.17+ s390x

parquet_py-0.2.1b0-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.7 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.17+ ppc64le

parquet_py-0.2.1b0-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.6 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.17+ ARMv7l

parquet_py-0.2.1b0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.17+ ARM64

parquet_py-0.2.1b0-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.5+ i686
