pqcat

Fast command-line tool for inspecting Parquet files.

Installation

pip install pqcat

Usage

$ pqcat
Usage: pqcat [OPTIONS] COMMAND [ARGS]...

 Fast CLI tool for Parquet using Polars


╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.                                │
│ --show-completion             Show completion for the current shell, to copy it or customize the       │
│                               installation.                                                            │
│ --help                        Show this message and exit.                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────╮
│ head        Show first N rows                                                                          │
│ tail        Show last N rows                                                                           │
│ cat         Show all rows (alias for show)                                                             │
│ schema      Show schema of Parquet file                                                                │
│ row-count   Show number of rows                                                                        │
│ stats       Show Parquet stats                                                                         │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯

cat command

$ pqcat cat examples/retail.parquet
shape: (50, 8)
┌─────────┬────────────┬────────────┬───────────┬───────────┬──────────┬────────┬─────────────┐
│ OrderID ┆ CustomerID ┆ OrderDate  ┆ ProductID ┆ Category  ┆ Quantity ┆ Price  ┆ TotalAmount │
│ ---     ┆ ---        ┆ ---        ┆ ---       ┆ ---       ┆ ---      ┆ ---    ┆ ---         │
│ str     ┆ str        ┆ str        ┆ str       ┆ str       ┆ i64      ┆ f64    ┆ f64         │
╞═════════╪════════════╪════════════╪═══════════╪═══════════╪══════════╪════════╪═════════════╡
│ O0001   ┆ C1007      ┆ 2024-01-27 ┆ P016      ┆ Clothing  ┆ 1        ┆ 159.57 ┆ 159.57      │
│ O0002   ┆ C1001      ┆ 2024-01-01 ┆ P011      ┆ Toys      ┆ 3        ┆ 10.54  ┆ 31.62       │
│ O0003   ┆ C1011      ┆ 2024-01-23 ┆ P011      ┆ Clothing  ┆ 1        ┆ 82.04  ┆ 82.04       │
│ O0004   ┆ C1020      ┆ 2024-01-18 ┆ P006      ┆ Toys      ┆ 5        ┆ 174.84 ┆ 874.2       │
│ O0005   ┆ C1017      ┆ 2024-01-26 ┆ P001      ┆ Clothing  ┆ 1        ┆ 16.62  ┆ 16.62       │
│ …       ┆ …          ┆ …          ┆ …         ┆ …         ┆ …        ┆ …      ┆ …           │
│ O0046   ┆ C1010      ┆ 2024-01-25 ┆ P007      ┆ Toys      ┆ 4        ┆ 93.06  ┆ 372.24      │
│ O0047   ┆ C1017      ┆ 2024-01-13 ┆ P009      ┆ Books     ┆ 5        ┆ 14.31  ┆ 71.55       │
│ O0048   ┆ C1013      ┆ 2024-01-28 ┆ P015      ┆ Books     ┆ 5        ┆ 106.63 ┆ 533.15      │
│ O0049   ┆ C1005      ┆ 2024-01-27 ┆ P011      ┆ Toys      ┆ 1        ┆ 105.6  ┆ 105.6       │
│ O0050   ┆ C1000      ┆ 2024-01-06 ┆ P019      ┆ Groceries ┆ 2        ┆ 47.31  ┆ 94.62       │
└─────────┴────────────┴────────────┴───────────┴───────────┴──────────┴────────┴─────────────┘

head command

$ pqcat head examples/retail.parquet -n 5 --columns Category,Price --format csv --filter "Price>100"
Category,Price
Clothing,159.57
Toys,174.84
Groceries,133.34
Clothing,152.98
Toys,119.59
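
The output above suggests that the filter is applied before the row limit: only two of the first five rows of the file have a Price above 100, yet five rows come back. The following is a minimal sketch of that order of operations in plain Python, not pqcat's actual implementation. The `head_csv` helper and the `ROWS` list are hypothetical, and `ROWS` contains only the ten rows visible in the `cat` output rather than all 50, so its result differs from the real command's.

```python
import csv
import io

# Hypothetical stand-in data: Category and Price for the ten rows that are
# visible in the `cat` output (the real file has 50 rows).
ROWS = [
    {"Category": "Clothing", "Price": 159.57},
    {"Category": "Toys", "Price": 10.54},
    {"Category": "Clothing", "Price": 82.04},
    {"Category": "Toys", "Price": 174.84},
    {"Category": "Clothing", "Price": 16.62},
    {"Category": "Toys", "Price": 93.06},
    {"Category": "Books", "Price": 14.31},
    {"Category": "Books", "Price": 106.63},
    {"Category": "Toys", "Price": 105.6},
    {"Category": "Groceries", "Price": 47.31},
]


def head_csv(rows, n, columns, min_price):
    """Filter rows by price, keep the first n matches, and render them as CSV."""
    # Assumed order: filter first, then limit to n rows.
    kept = [row for row in rows if row["Price"] > min_price][:n]
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(kept)
    return buf.getvalue()


result = head_csv(ROWS, n=5, columns=["Category", "Price"], min_price=100)
print(result, end="")
```

With this reduced data set only four rows pass the filter, so fewer than `n` rows are printed.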

schema command

$ pqcat schema examples/retail.parquet
OrderID: String
CustomerID: String
OrderDate: String
ProductID: String
Category: String
Quantity: Int64
Price: Float64
TotalAmount: Float64

row-count command

$ pqcat row-count examples/retail.parquet
50

stats command

$ pqcat stats examples/retail.parquet
┌────────────┬─────────┬────────────┬────────────┬───┬──────────┬──────────┬───────────┬─────────────┐
│ statistic  ┆ OrderID ┆ CustomerID ┆ OrderDate  ┆ … ┆ Category ┆ Quantity ┆ Price     ┆ TotalAmount │
│ ---        ┆ ---     ┆ ---        ┆ ---        ┆   ┆ ---      ┆ ---      ┆ ---       ┆ ---         │
│ str        ┆ str     ┆ str        ┆ str        ┆   ┆ str      ┆ f64      ┆ f64       ┆ f64         │
╞════════════╪═════════╪════════════╪════════════╪═══╪══════════╪══════════╪═══════════╪═════════════╡
│ count      ┆ 50      ┆ 50         ┆ 50         ┆ … ┆ 50       ┆ 50.0     ┆ 50.0      ┆ 50.0        │
│ null_count ┆ 0       ┆ 0          ┆ 0          ┆ … ┆ 0        ┆ 0.0      ┆ 0.0       ┆ 0.0         │
│ mean       ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 2.74     ┆ 88.6722   ┆ 231.8516    │
│ std        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 1.454199 ┆ 59.740505 ┆ 201.475002  │
│ min        ┆ O0001   ┆ C1000      ┆ 2024-01-01 ┆ … ┆ Books    ┆ 1.0      ┆ 10.54     ┆ 11.73       │
│ 25%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 1.0      ┆ 24.91     ┆ 76.8        │
│ 50%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 3.0      ┆ 93.06     ┆ 159.57      │
│ 75%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 4.0      ┆ 147.87    ┆ 372.24      │
│ max        ┆ O0050   ┆ C1020      ┆ 2024-01-30 ┆ … ┆ Toys     ┆ 5.0      ┆ 197.7     ┆ 874.2       │
└────────────┴─────────┴────────────┴────────────┴───┴──────────┴──────────┴───────────┴─────────────┘
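
To make the numeric rows of the stats table concrete, here is a rough sketch that computes count, mean, std, min, and max for one column using only the Python standard library. It is an illustration, not how pqcat computes its table. It assumes that `std` is the sample standard deviation (dividing by n - 1), which this page does not confirm, and it uses only the ten Quantity values visible in the `cat` output, so the numbers do not match the 50-row table above.

```python
import statistics

# Quantity values from the ten rows visible in the `cat` output
# (not the full 50-row file).
quantity = [1, 3, 1, 5, 1, 4, 5, 5, 1, 2]

summary = {
    "count": len(quantity),
    "mean": statistics.mean(quantity),
    # Assumption: sample standard deviation (n - 1 in the denominator).
    "std": statistics.stdev(quantity),
    "min": min(quantity),
    "max": max(quantity),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

String columns such as OrderID report `null` for mean, std, and the percentiles in the table, since those statistics are only defined for numeric data.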

References

This project is inspired by https://github.com/hangxie/parquet-tools.
