A data analysis CLI tool built on Polars LazyFrames

pldatacli

A simple command-line tool for quick CSV data analysis using Polars, with lazy execution for efficiency.


Tech Stack

  • Polars – fast DataFrame engine with lazy execution for efficient data processing
  • Typer – modern CLI framework for building command-line interfaces
  • Rich – beautiful terminal rendering for clean table output

PyPI Repository

The package is published on PyPI: https://pypi.org/project/pldatacli/


Installation

  • Option 1: directly with pip
pip install pldatacli
  • Option 2: with the uv package manager (requires uv to be installed)
uv tool install pldatacli

Usage

Basic query command

pldatacli query FILE [OPTIONS]

Example file:

SampleSuperstore.csv

Filter rows

Single filter:

pldatacli query SampleSuperstore.csv \
  --filter "State=Texas"

Multiple filters:

pldatacli query SampleSuperstore.csv \
  --filter "State=Texas" \
  --filter "Category=Furniture"
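
To make the filter syntax concrete, here is a minimal pure-Python sketch of how "Column=Value" specs could be interpreted. It is not pldatacli's implementation: the helper names parse_filter and apply_filters are hypothetical, the sample rows are made up, and it assumes that repeated --filter options are combined with AND and compared as exact string matches.

```python
def parse_filter(spec: str) -> tuple[str, str]:
    """Split a 'Column=Value' spec at the first '=' into (column, value)."""
    column, sep, value = spec.partition("=")
    if not sep or not column.strip():
        raise ValueError(f"expected COLUMN=VALUE, got {spec!r}")
    return column.strip(), value.strip()


def apply_filters(rows: list[dict[str, str]], specs: list[str]) -> list[dict[str, str]]:
    """Keep only rows that satisfy every filter (assumed AND semantics)."""
    conditions = [parse_filter(spec) for spec in specs]
    return [
        row for row in rows
        if all(row.get(column) == value for column, value in conditions)
    ]


# Made-up sample rows, for illustration only.
rows = [
    {"State": "Texas", "Category": "Furniture"},
    {"State": "Texas", "Category": "Technology"},
    {"State": "Ohio", "Category": "Furniture"},
]

matched = apply_filters(rows, ["State=Texas", "Category=Furniture"])
print(matched)
```

Under these assumptions, only the row that is both in Texas and in the Furniture category survives the two filters.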

Group by columns

Single column:

pldatacli query SampleSuperstore.csv \
  --groupby Region

Multiple columns:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --groupby Category

Aggregations

Single aggregation:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --agg "Profit=sum"

Multiple aggregations:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --agg "Profit=sum,mean"

Multiple columns with aggregations:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --groupby Category \
  --agg "Sales=sum,mean" \
  --agg "Profit=sum"
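
The aggregation specs pair one column with a comma-separated list of functions. The sketch below shows one way such specs could be expanded, in plain Python rather than Polars. The helpers parse_agg and aggregate are hypothetical, the sample rows are made up, and only sum and mean are handled because those are the functions shown above. The output naming (Profit_sum, Profit_mean) follows the column names used by the --sort examples in this document.

```python
from collections import defaultdict
from statistics import mean

# Only the two functions that appear in the examples above.
AGG_FUNCS = {"sum": sum, "mean": mean}


def parse_agg(spec: str) -> tuple[str, list[str]]:
    """Turn 'Profit=sum,mean' into ('Profit', ['sum', 'mean'])."""
    column, sep, funcs = spec.partition("=")
    names = [name.strip() for name in funcs.split(",") if name.strip()]
    if not sep or not column.strip() or not names:
        raise ValueError(f"expected COLUMN=FUNC[,FUNC...], got {spec!r}")
    unknown = [name for name in names if name not in AGG_FUNCS]
    if unknown:
        raise ValueError(f"unsupported aggregation(s): {unknown}")
    return column.strip(), names


def aggregate(rows, group_cols, agg_specs):
    """Group rows by group_cols and emit one '<column>_<func>' value per spec."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in group_cols)].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(group_cols, key))
        for spec in agg_specs:
            column, names = parse_agg(spec)
            values = [float(member[column]) for member in members]
            for name in names:
                out[f"{column}_{name}"] = AGG_FUNCS[name](values)
        result.append(out)
    return result


# Made-up sample rows, for illustration only.
rows = [
    {"Region": "West", "Profit": "10.0"},
    {"Region": "West", "Profit": "20.0"},
    {"Region": "East", "Profit": "5.0"},
]

summary = aggregate(rows, ["Region"], ["Profit=sum,mean"])
print(summary)
```

Each function named in a spec produces its own output column, so "Profit=sum,mean" yields two columns per group.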

Sorting

Single sort:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --agg "Profit=sum" \
  --sort "Profit_sum:desc"

Multiple sorts:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --agg "Profit=sum" \
  --sort "Region:asc" \
  --sort "Profit_sum:desc"
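
With several sort keys, the first --sort option presumably acts as the primary key and later ones break ties. The following sketch illustrates that reading in plain Python. It is an assumption about the semantics, not pldatacli's code: parse_sort and sort_rows are hypothetical helpers and the rows are made up.

```python
def parse_sort(spec: str) -> tuple[str, bool]:
    """Turn 'Profit_sum:desc' into ('Profit_sum', True); True means descending."""
    column, sep, direction = spec.rpartition(":")
    direction = direction.strip().lower()
    if not sep or not column.strip() or direction not in ("asc", "desc"):
        raise ValueError(f"expected COLUMN:asc or COLUMN:desc, got {spec!r}")
    return column.strip(), direction == "desc"


def sort_rows(rows: list[dict], specs: list[str]) -> list[dict]:
    """Sort by several keys, the first spec being the primary key.

    Python's sort is stable, so applying the keys from last to first
    leaves earlier keys dominant and later keys as tie-breakers.
    """
    ordered = list(rows)
    for column, descending in reversed([parse_sort(spec) for spec in specs]):
        ordered.sort(key=lambda row: row[column], reverse=descending)
    return ordered


# Made-up aggregated rows, for illustration only.
rows = [
    {"Region": "West", "Profit_sum": 30.0},
    {"Region": "East", "Profit_sum": 5.0},
    {"Region": "East", "Profit_sum": 12.0},
]

ordered = sort_rows(rows, ["Region:asc", "Profit_sum:desc"])
print(ordered)
```

Here the rows are ordered by Region first, and within the same Region the larger Profit_sum comes first.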

Rounding results

Round float columns to 2 decimal places:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --agg "Profit=mean" \
  --round 2

Round to a different precision, for example 4 decimal places:

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --agg "Profit=mean" \
  --round 4
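
As a rough illustration of what rounding only the float columns means, the sketch below rounds float values and leaves integers and strings untouched. The helper round_floats is hypothetical and the row is made up; how pldatacli selects float columns internally is not shown in this document.

```python
def round_floats(rows: list[dict], digits: int) -> list[dict]:
    """Round float values to `digits` decimal places; leave other types as they are."""
    return [
        {
            name: round(value, digits) if isinstance(value, float) else value
            for name, value in row.items()
        }
        for row in rows
    ]


# Made-up aggregated row, for illustration only.
rows = [{"Region": "West", "Quantity": 3, "Profit_mean": 15.123456}]

two_places = round_floats(rows, 2)
four_places = round_floats(rows, 4)
print(two_places)
print(four_places)
```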

Limiting rows

Head:

pldatacli query SampleSuperstore.csv \
  --head 5

Tail:

pldatacli query SampleSuperstore.csv \
  --tail 10

Full query example

pldatacli query SampleSuperstore.csv \
  --groupby Region \
  --groupby Category \
  --agg "Profit=sum,mean" \
  --sort "Profit_sum:desc" \
  --head 5 \
  --round 2

Schema inspection

Get columns, dtypes, and null counts without processing the full dataset:

pldatacli schema SampleSuperstore.csv

Example output:

LazyFrame Schema
┏━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
┃ Column       ┃ Dtype   ┃ Nulls ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
│ Ship Mode    │ String  │     0 │
│ Segment      │ String  │     0 │
│ Country      │ String  │     0 │
│ City         │ String  │     0 │
│ State        │ String  │     0 │
│ Postal Code  │ Int64   │     0 │
│ Region       │ String  │     0 │
│ Category     │ String  │     0 │
│ Sub-Category │ String  │     0 │
│ Sales        │ Float64 │     0 │
│ Quantity     │ Int64   │     0 │
│ Discount     │ Float64 │     0 │
│ Profit       │ Float64 │     0 │
└──────────────┴─────────┴───────┘
Rows: 9994, Columns: 13

⚡ Tip: Use schema before running queries to quickly inspect columns, types, and missing values.
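
For readers curious about what a schema report like the one above contains, here is a small standard-library sketch that computes column names, a guessed dtype, and null counts for CSV text. It is only an approximation under stated assumptions: the schema function and infer_dtype helper are hypothetical, empty fields are treated as nulls, the dtype labels merely mirror the names in the example output, and Polars performs its own type inference, which may differ.

```python
import csv
import io


def infer_dtype(values: list[str]) -> str:
    """Guess a dtype label from string values, ignoring empty (null) fields."""
    present = [value for value in values if value != ""]
    if not present:
        return "String"
    for label, cast in (("Int64", int), ("Float64", float)):
        try:
            for value in present:
                cast(value)
        except ValueError:
            continue  # at least one value does not fit; try the next type
        return label
    return "String"


def schema(csv_text: str) -> tuple[list[tuple[str, str, int]], int]:
    """Return ([(column, dtype, null_count), ...], row_count) for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    report = [
        (
            column,
            infer_dtype([row[column] for row in rows]),
            sum(1 for row in rows if row[column] == ""),
        )
        for column in columns
    ]
    return report, len(rows)


# Made-up CSV content, for illustration only.
sample = "Region,Sales,Quantity\nWest,261.96,2\nEast,,3\n"

report, row_count = schema(sample)
for column, dtype, nulls in report:
    print(f"{column:<10} {dtype:<8} {nulls}")
print(f"Rows: {row_count}, Columns: {len(report)}")
```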

