pqcat

Fast command-line tool for inspecting Parquet files.

Installation

pip install pqcat

Usage

$ pqcat
Usage: pqcat [OPTIONS] COMMAND [ARGS]...

 Fast CLI tool for Parquet using Polars


╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.                                │
│ --show-completion             Show completion for the current shell, to copy it or customize the       │
│                               installation.                                                            │
│ --help                        Show this message and exit.                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────╮
│ head        Show first N rows                                                                          │
│ tail        Show last N rows                                                                           │
│ cat         Show all rows (alias for show)                                                             │
│ schema      Show schema of Parquet file                                                                │
│ row-count   Show number of rows                                                                        │
│ stats       Show Parquet stats                                                                         │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯

cat command

$ pqcat cat examples/retail.parquet
shape: (50, 8)
┌─────────┬────────────┬────────────┬───────────┬───────────┬──────────┬────────┬─────────────┐
│ OrderID ┆ CustomerID ┆ OrderDate  ┆ ProductID ┆ Category  ┆ Quantity ┆ Price  ┆ TotalAmount │
│ ---     ┆ ---        ┆ ---        ┆ ---       ┆ ---       ┆ ---      ┆ ---    ┆ ---         │
│ str     ┆ str        ┆ str        ┆ str       ┆ str       ┆ i64      ┆ f64    ┆ f64         │
╞═════════╪════════════╪════════════╪═══════════╪═══════════╪══════════╪════════╪═════════════╡
│ O0001   ┆ C1007      ┆ 2024-01-27 ┆ P016      ┆ Clothing  ┆ 1        ┆ 159.57 ┆ 159.57      │
│ O0002   ┆ C1001      ┆ 2024-01-01 ┆ P011      ┆ Toys      ┆ 3        ┆ 10.54  ┆ 31.62       │
│ O0003   ┆ C1011      ┆ 2024-01-23 ┆ P011      ┆ Clothing  ┆ 1        ┆ 82.04  ┆ 82.04       │
│ O0004   ┆ C1020      ┆ 2024-01-18 ┆ P006      ┆ Toys      ┆ 5        ┆ 174.84 ┆ 874.2       │
│ O0005   ┆ C1017      ┆ 2024-01-26 ┆ P001      ┆ Clothing  ┆ 1        ┆ 16.62  ┆ 16.62       │
│ …       ┆ …          ┆ …          ┆ …         ┆ …         ┆ …        ┆ …      ┆ …           │
│ O0046   ┆ C1010      ┆ 2024-01-25 ┆ P007      ┆ Toys      ┆ 4        ┆ 93.06  ┆ 372.24      │
│ O0047   ┆ C1017      ┆ 2024-01-13 ┆ P009      ┆ Books     ┆ 5        ┆ 14.31  ┆ 71.55       │
│ O0048   ┆ C1013      ┆ 2024-01-28 ┆ P015      ┆ Books     ┆ 5        ┆ 106.63 ┆ 533.15      │
│ O0049   ┆ C1005      ┆ 2024-01-27 ┆ P011      ┆ Toys      ┆ 1        ┆ 105.6  ┆ 105.6       │
│ O0050   ┆ C1000      ┆ 2024-01-06 ┆ P019      ┆ Groceries ┆ 2        ┆ 47.31  ┆ 94.62       │
└─────────┴────────────┴────────────┴───────────┴───────────┴──────────┴────────┴─────────────┘

head command

$ pqcat head examples/retail.parquet -n 5 --columns Category,Price --format csv --filter "Price>100"
Category,Price
Clothing,159.57
Toys,174.84
Groceries,133.34
Clothing,152.98
Toys,119.59
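
Because `--format csv` prints plain CSV, the result can be handed to other programs. The sketch below is not part of pqcat. It parses the sample output above with Python's standard `csv` module and checks that every row satisfies the filter. The helper name `parse_pqcat_csv` is hypothetical, and the text is hard-coded here for illustration; in practice it would be read from the command's standard output.

```python
import csv
import io

# Sample output copied from the `pqcat head ... --format csv` run above.
SAMPLE_OUTPUT = """\
Category,Price
Clothing,159.57
Toys,174.84
Groceries,133.34
Clothing,152.98
Toys,119.59
"""


def parse_pqcat_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text (header row first) into a list of row dicts.

    Hypothetical helper for illustration; all values stay as strings.
    """
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_pqcat_csv(SAMPLE_OUTPUT)
prices = [float(row["Price"]) for row in rows]

# Every row should satisfy the --filter "Price>100" condition.
all_above_100 = all(price > 100 for price in prices)
print(len(rows), all_above_100)
```

Values are kept as strings on purpose, since CSV carries no type information; convert each column explicitly, as done for `Price` here.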

schema command

$ pqcat schema examples/retail.parquet
OrderID: String
CustomerID: String
OrderDate: String
ProductID: String
Category: String
Quantity: Int64
Price: Float64
TotalAmount: Float64
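
The schema output is one `name: type` pair per line, which makes it easy to post-process. As a rough sketch (not part of pqcat), the snippet below turns the sample output above into a dictionary and picks out the numeric columns. The helper name `parse_schema` is hypothetical, and the parsing assumes that column names do not themselves contain `": "`.

```python
# Sample output copied from the `pqcat schema` run above.
SAMPLE_SCHEMA = """\
OrderID: String
CustomerID: String
OrderDate: String
ProductID: String
Category: String
Quantity: Int64
Price: Float64
TotalAmount: Float64
"""


def parse_schema(text: str) -> dict[str, str]:
    """Map column name -> type name from `name: type` lines.

    Hypothetical helper; splits each line on the first ": " only.
    """
    schema: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, dtype = line.partition(": ")
        schema[name.strip()] = dtype.strip()
    return schema


schema = parse_schema(SAMPLE_SCHEMA)

# Columns whose type is one of the numeric types seen in this sample.
numeric = [name for name, dtype in schema.items() if dtype in {"Int64", "Float64"}]
print(numeric)
```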

row-count command

$ pqcat row-count examples/retail.parquet
50

stats command

$ pqcat stats examples/retail.parquet
┌────────────┬─────────┬────────────┬────────────┬───┬──────────┬──────────┬───────────┬─────────────┐
│ statistic  ┆ OrderID ┆ CustomerID ┆ OrderDate  ┆ … ┆ Category ┆ Quantity ┆ Price     ┆ TotalAmount │
│ ---        ┆ ---     ┆ ---        ┆ ---        ┆   ┆ ---      ┆ ---      ┆ ---       ┆ ---         │
│ str        ┆ str     ┆ str        ┆ str        ┆   ┆ str      ┆ f64      ┆ f64       ┆ f64         │
╞════════════╪═════════╪════════════╪════════════╪═══╪══════════╪══════════╪═══════════╪═════════════╡
│ count      ┆ 50      ┆ 50         ┆ 50         ┆ … ┆ 50       ┆ 50.0     ┆ 50.0      ┆ 50.0        │
│ null_count ┆ 0       ┆ 0          ┆ 0          ┆ … ┆ 0        ┆ 0.0      ┆ 0.0       ┆ 0.0         │
│ mean       ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 2.74     ┆ 88.6722   ┆ 231.8516    │
│ std        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 1.454199 ┆ 59.740505 ┆ 201.475002  │
│ min        ┆ O0001   ┆ C1000      ┆ 2024-01-01 ┆ … ┆ Books    ┆ 1.0      ┆ 10.54     ┆ 11.73       │
│ 25%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 1.0      ┆ 24.91     ┆ 76.8        │
│ 50%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 3.0      ┆ 93.06     ┆ 159.57      │
│ 75%        ┆ null    ┆ null       ┆ null       ┆ … ┆ null     ┆ 4.0      ┆ 147.87    ┆ 372.24      │
│ max        ┆ O0050   ┆ C1020      ┆ 2024-01-30 ┆ … ┆ Toys     ┆ 5.0      ┆ 197.7     ┆ 874.2       │
└────────────┴─────────┴────────────┴────────────┴───┴──────────┴──────────┴───────────┴─────────────┘

References

This project is inspired by https://github.com/hangxie/parquet-tools.
