pqcat is a fast command-line tool for inspecting Parquet files

pqcat

Fast command-line tool for inspecting Parquet files.

Installation

pip install pqcat

Usage

$ pqcat
Usage: pqcat [OPTIONS] COMMAND [ARGS]...

 Fast CLI tool for Parquet using Polars


╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.                                │
│ --show-completion             Show completion for the current shell, to copy it or customize the       │
│                               installation.                                                            │
│ --help                        Show this message and exit.                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────╮
│ head        Show first N rows                                                                          │
│ tail        Show last N rows                                                                           │
│ cat         Show all rows (alias for show)                                                             │
│ schema      Show schema of Parquet file                                                                │
│ row-count   Show number of rows                                                                        │
│ stats       Show Parquet stats                                                                         │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯

cat command

$ pqcat cat examples/retail.parquet
shape: (50, 8)
┌─────────┬────────────┬────────────┬───────────┬───────────┬──────────┬────────┬─────────────┐
│ OrderID ┆ CustomerID ┆ OrderDate  ┆ ProductID ┆ Category  ┆ Quantity ┆ Price  ┆ TotalAmount │
│ ---     ┆ ---        ┆ ---        ┆ ---       ┆ ---       ┆ ---      ┆ ---    ┆ ---         │
│ str     ┆ str        ┆ str        ┆ str       ┆ str       ┆ i64      ┆ f64    ┆ f64         │
╞═════════╪════════════╪════════════╪═══════════╪═══════════╪══════════╪════════╪═════════════╡
│ O0001   ┆ C1007      ┆ 2024-01-27 ┆ P016      ┆ Clothing  ┆ 1        ┆ 159.57 ┆ 159.57      │
│ O0002   ┆ C1001      ┆ 2024-01-01 ┆ P011      ┆ Toys      ┆ 3        ┆ 10.54  ┆ 31.62       │
│ O0003   ┆ C1011      ┆ 2024-01-23 ┆ P011      ┆ Clothing  ┆ 1        ┆ 82.04  ┆ 82.04       │
│ O0004   ┆ C1020      ┆ 2024-01-18 ┆ P006      ┆ Toys      ┆ 5        ┆ 174.84 ┆ 874.2       │
│ O0005   ┆ C1017      ┆ 2024-01-26 ┆ P001      ┆ Clothing  ┆ 1        ┆ 16.62  ┆ 16.62       │
│ …       ┆ …          ┆ …          ┆ …         ┆ …         ┆ …        ┆ …      ┆ …           │
│ O0046   ┆ C1010      ┆ 2024-01-25 ┆ P007      ┆ Toys      ┆ 4        ┆ 93.06  ┆ 372.24      │
│ O0047   ┆ C1017      ┆ 2024-01-13 ┆ P009      ┆ Books     ┆ 5        ┆ 14.31  ┆ 71.55       │
│ O0048   ┆ C1013      ┆ 2024-01-28 ┆ P015      ┆ Books     ┆ 5        ┆ 106.63 ┆ 533.15      │
│ O0049   ┆ C1005      ┆ 2024-01-27 ┆ P011      ┆ Toys      ┆ 1        ┆ 105.6  ┆ 105.6       │
│ O0050   ┆ C1000      ┆ 2024-01-06 ┆ P019      ┆ Groceries ┆ 2        ┆ 47.31  ┆ 94.62       │
└─────────┴────────────┴────────────┴───────────┴───────────┴──────────┴────────┴─────────────┘

head command

$ pqcat head examples/retail.parquet -n 5 --columns Category,Price --format csv --filter "Price>100"
Category,Price
Clothing,159.57
Toys,174.84
Groceries,133.34
Clothing,152.98
Toys,119.59
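
The sample output above suggests that --filter is applied before the -n limit: only two of the first five rows in the file have Price > 100, yet five rows are returned, and Groceries,133.34 does not come from the first five rows. The following is a minimal Python sketch of that order of operations, using only the standard library and five rows copied from the cat output. head_csv is a hypothetical helper written for illustration; it is not pqcat's implementation, which is built on Polars.

```python
import csv
import io

# Five rows copied from the `pqcat cat` output (only the fields used here).
ROWS = [
    {"OrderID": "O0001", "Category": "Clothing", "Price": 159.57},
    {"OrderID": "O0002", "Category": "Toys", "Price": 10.54},
    {"OrderID": "O0003", "Category": "Clothing", "Price": 82.04},
    {"OrderID": "O0004", "Category": "Toys", "Price": 174.84},
    {"OrderID": "O0005", "Category": "Clothing", "Price": 16.62},
]


def head_csv(rows, n, columns, min_price):
    """Hypothetical helper: filter first, then keep n rows, then project columns."""
    kept = [r for r in rows if r["Price"] > min_price]  # like --filter "Price>100"
    kept = kept[:n]                                     # like -n 5
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()      # like --format csv
    writer.writerows(kept)    # extra keys (OrderID) are dropped, like --columns
    return buf.getvalue()


print(head_csv(ROWS, n=5, columns=["Category", "Price"], min_price=100), end="")
# Category,Price
# Clothing,159.57
# Toys,174.84
```

With only these five rows as input, the sketch reproduces the first two data lines of the pqcat output; the remaining three come from rows further down the file.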

schema command

$ pqcat schema examples/retail.parquet
OrderID: String
CustomerID: String
OrderDate: String
ProductID: String
Category: String
Quantity: Int64
Price: Float64
TotalAmount: Float64
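
Because the schema is printed as one name: dtype pair per line, it is straightforward to consume from a script. The following is a minimal Python sketch, assuming the output format shown above stays stable across versions; parse_schema is a hypothetical helper and not part of pqcat.

```python
# Text copied from the `pqcat schema` output above.
SCHEMA_TEXT = """\
OrderID: String
CustomerID: String
OrderDate: String
ProductID: String
Category: String
Quantity: Int64
Price: Float64
TotalAmount: Float64
"""


def parse_schema(text):
    """Hypothetical helper: map each column name to its dtype, keeping file order."""
    schema = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, dtype = line.partition(":")
        schema[name.strip()] = dtype.strip()
    return schema


schema = parse_schema(SCHEMA_TEXT)
numeric = [name for name, dtype in schema.items() if dtype in {"Int64", "Float64"}]
print(numeric)  # ['Quantity', 'Price', 'TotalAmount']
```

A check like this can flag, for example, that OrderDate is stored as a string rather than as a date type.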

row-count command

$ pqcat row-count examples/retail.parquet
50

stats command

$ pqcat stats examples/retail.parquet
┌────────────┬─────────┬────────────┬────────────┬───┬──────────┬──────────┬───────────┬─────────────┐
│ statistic  ┆ OrderID ┆ CustomerID ┆ OrderDate  ┆ … ┆ Category ┆ Quantity ┆ Price     ┆ TotalAmount │
│ ---        ┆ ---     ┆ ---        ┆ ---        ┆   ┆ ---      ┆ ---      ┆ ---       ┆ ---         │
│ str        ┆ str     ┆ str        ┆ str        ┆   ┆ str      ┆ f64      ┆ f64       ┆ f64         │
╞════════════╪═════════╪════════════╪════════════╪═══╪══════════╪══════════╪═══════════╪═════════════╡
│ count      ┆ 50      ┆ 50         ┆ 50         ┆ … ┆ 50       ┆ 50.0     ┆ 50.0      ┆ 50.0        │
│ null_count ┆ 0       ┆ 0          ┆ 0          ┆ … ┆ 0        ┆ 0.0      ┆ 0.0       ┆ 0.0         │
│ mean       ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 2.74     ┆ 88.6722   ┆ 231.8516    │
│ std        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 1.454199 ┆ 59.740505 ┆ 201.475002  │
│ min        ┆ O0001   ┆ C1000      ┆ 2024-01-01 ┆ … ┆ Books    ┆ 1.0      ┆ 10.54     ┆ 11.73       │
│ 25%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 1.0      ┆ 24.91     ┆ 76.8        │
│ 50%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 3.0      ┆ 93.06     ┆ 159.57      │
│ 75%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 4.0      ┆ 147.87    ┆ 372.24      │
│ max        ┆ O0050   ┆ C1020      ┆ 2024-01-30 ┆ … ┆ Toys     ┆ 5.0      ┆ 197.7     ┆ 874.2       │
└────────────┴─────────┴────────────┴────────────┴───┴──────────┴──────────┴───────────┴─────────────┘

References

This project is inspired by https://github.com/hangxie/parquet-tools.
