Parquet viewer for your terminal.
ParqInspector
ParqInspector is a Parquet and Delta table viewer for your terminal, built with Textual.
ParqInspector can open local or remote Parquet files and Delta tables and lets you view their contents as a table.
https://github.com/jkausti/parq-inspector/assets/19781820/7ef7657a-0598-4d3e-bab8-3faa8032ff70
👉 Installation
ParqInspector can be installed with pip (or pipx).
$ pip install parq-inspector
👉 Usage
You start ParqInspector by running inspector from your terminal.
Local Files
You can also open a local file directly by using the options --filepath and --row_limit, or their short versions -f and -rl.
$ inspector --filepath ./data/my_data.parquet --row_limit 500
If no row limit is provided, it defaults to 200. Be careful: setting the row limit to a very high value might make the app take a long time to start, or it may not start at all, depending on the size of your data.
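The option handling described above can be pictured with a small argparse sketch. This is not ParqInspector's actual implementation (the source does not show how the CLI is built); the function name build_parser is hypothetical, and only the option names and the default of 200 are taken from the text above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options; not the real CLI code."""
    parser = argparse.ArgumentParser(prog="inspector")
    # Path to a local file; optional, because the app can also start without one.
    parser.add_argument("-f", "--filepath", default=None)
    # Maximum number of rows to load; 200 is the documented default.
    parser.add_argument("-rl", "--row_limit", type=int, default=200)
    return parser


# Parse a command line that gives a file but no row limit.
args = build_parser().parse_args(["-f", "./data/my_data.parquet"])
print(args.filepath, args.row_limit)
```

Leaving out the row limit falls back to the default, which is the behaviour the paragraph above describes.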
Remote Files
Currently, ParqInspector supports reading remote files from Azure Data Lake Storage Gen2, Amazon S3 and Google Cloud Storage. If your storage service does not allow anonymous access, you need to set environment variables so that ParqInspector can authenticate to the service. The following environment variables are currently supported:
Azure:
AZURE_STORAGE_ACCOUNT_NAME
AZURE_STORAGE_SAS_KEY
AZURE_STORAGE_ACCOUNT_KEY
AZURE_STORAGE_CLIENT_ID
AZURE_STORAGE_CLIENT_SECRET
AZURE_STORAGE_TENANT_ID
AWS:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_REGION
AWS_DEFAULT_REGION
GCP:
GOOGLE_SERVICE_ACCOUNT
GOOGLE_SERVICE_ACCOUNT_KEY
Depending on your method of authentication, not all of the environment variables need to be set.
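Because only some of these variables are needed for a given authentication method, it can help to check which ones are actually set before launching the app. The following is a minimal sketch using only the standard library. The function name credentials_present and the provider keys are hypothetical; the code only inspects the environment and does not reproduce how ParqInspector itself authenticates.

```python
import os
from typing import Dict, List, Mapping, Optional

# Variable names as listed above, grouped by provider (the keys are illustrative).
CREDENTIAL_VARS: Dict[str, List[str]] = {
    "azure": [
        "AZURE_STORAGE_ACCOUNT_NAME",
        "AZURE_STORAGE_SAS_KEY",
        "AZURE_STORAGE_ACCOUNT_KEY",
        "AZURE_STORAGE_CLIENT_ID",
        "AZURE_STORAGE_CLIENT_SECRET",
        "AZURE_STORAGE_TENANT_ID",
    ],
    "aws": [
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "AWS_REGION",
        "AWS_DEFAULT_REGION",
    ],
    "gcp": [
        "GOOGLE_SERVICE_ACCOUNT",
        "GOOGLE_SERVICE_ACCOUNT_KEY",
    ],
}


def credentials_present(
    provider: str, env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names of the listed variables that are set and non-empty."""
    env = os.environ if env is None else env
    return [name for name in CREDENTIAL_VARS[provider] if env.get(name)]


# Print variable names only, never their values.
for provider in CREDENTIAL_VARS:
    print(provider, credentials_present(provider))
```

An empty list for your provider means none of the listed variables are set, so only anonymous access would be possible.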
Remote files can only be opened through the Settings pane in the UI. Pick the correct cloud provider and, in the Path field, enter the URL of your file instead of a local path. ParqInspector uses polars under the hood to read Parquet files and Delta tables from remote storage, so the supported protocols and URL variants are determined by what polars supports. See the polars documentation for more.
👉 Roadmap
[✓] - Reading local single Parquet files
[✓] - Reading remote single Parquet files
[ ] - Reading Parquet datasets
[✓] - Reading Delta tables
If you encounter any issues or bugs, or feel that a valuable feature is missing, please create an issue in this repo!