pandas-term

pandas-term is a CLI bringing pandas operations to the command line.

Note: pandas-term is still in early, experimental development, and its commands and options may change.

Installation

pipx install pandas-term

Or with pip:

pip install pandas-term

Command Reference

CLI Command       Pandas Function         Description
pd select         df[columns]             Select columns
pd drop           df.drop()               Drop columns
pd rename         df.rename()             Rename columns
pd sort           df.sort_values()        Sort by columns
pd dedup          df.drop_duplicates()    Remove duplicate rows
pd merge          pd.merge()              Merge two dataframes
pd concat         pd.concat()             Concatenate dataframes
pd batch          df.iloc[]               Split dataframe into batches
pd query          df.query()              Filter using query expressions
pd head           df.head()               Get first n rows
pd tail           df.tail()               Get last n rows
pd dropna         df.dropna()             Drop rows with null values
pd describe       df.describe()           Descriptive statistics
pd unique         df[col].unique()        Unique values in column
pd shape          df.shape                Dimensions (rows, columns)
pd columns        df.columns              Column names
pd dtypes         df.dtypes               Column data types
pd value-counts   df.value_counts()       Count unique values
pd groupby        df.groupby().agg()      Group by and aggregate

Usage

All commands accept an input file path (or - for stdin) and an optional -o/--output flag for the output file (default: stdout).

Transform commands

# Select columns (comma-separated)
pd select name,age data.csv

# Drop columns (comma-separated)
pd drop unwanted_column data.csv

# Sort by columns (comma-separated for multiple)
pd sort age data.csv --ascending
pd sort "age,name" data.csv --descending

# Remove duplicate rows
pd dedup data.csv
pd dedup --subset name,email data.csv

# Rename columns
pd rename "name:full_name" data.csv
pd rename "name:full_name,age:years" data.csv

# Merge two dataframes
pd merge left.csv right.csv --on user_id --how inner
pd merge left.csv right.csv --left-on id --right-on user_id --how left

# Concatenate multiple dataframes
pd concat file1.csv file2.csv file3.csv

# Split dataframe into batches
pd batch data.csv --sizes 100 -o "batch_{}.csv"
pd batch data.csv --sizes 1,2,10,50 -o "batch_{}.csv"  # variable sizes, last repeats
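
The "variable sizes, last repeats" behaviour of pd batch can be pictured with a small sketch. This is not the tool's actual code: batch_bounds is a hypothetical helper, and it assumes that once the listed sizes are used up, the final size is reused until no rows remain. Each (start, stop) pair could then be passed to df.iloc[start:stop].

```python
from itertools import chain, repeat


def batch_bounds(n_rows, sizes):
    """Yield (start, stop) row offsets, reusing the last size until rows run out.

    Hypothetical helper illustrating one reading of "last repeats";
    the real pd batch command may behave differently.
    """
    if not sizes or any(size <= 0 for size in sizes):
        raise ValueError("sizes must be a non-empty list of positive integers")
    start = 0
    # Walk the listed sizes first, then keep repeating the final one.
    for size in chain(sizes, repeat(sizes[-1])):
        if start >= n_rows:
            return
        stop = min(start + size, n_rows)
        yield start, stop
        start = stop


# 120 rows split with --sizes 1,2,10,50
bounds = list(batch_bounds(120, [1, 2, 10, 50]))
```

Under that assumption, 120 rows produce batches of 1, 2, 10, 50, 50 and a final short batch of 7 rows.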

Filter commands

# Filter using pandas query expressions
pd query "age > 30 and city == 'NYC'" data.csv

# First N rows
pd head --n 100 data.csv

# Last N rows
pd tail --n 50 data.csv

# Drop rows with null values in any column
pd dropna data.csv

# Drop rows with null values in specific columns
pd dropna --subset column_name data.csv
pd dropna --subset "name,age" data.csv
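
For readers who know pandas, the filter commands above correspond roughly to the calls below, following the Command Reference table. The sample data is made up for illustration, and how pandas-term passes its arguments through to pandas is an assumption.

```python
import pandas as pd

# Made-up sample data with a few missing values.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", None, "Dee"],
        "age": [35, 28, 41, None],
        "city": ["NYC", "NYC", "LA", "NYC"],
    }
)

# Roughly: pd query "age > 30 and city == 'NYC'" data.csv
filtered = df.query("age > 30 and city == 'NYC'")

# Roughly: pd dropna --subset "name,age" data.csv
complete = df.dropna(subset=["name", "age"])
```

Rows with a missing age never satisfy age > 30, so the query keeps only rows where the comparison is actually true.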

Stats commands

# Descriptive statistics
pd describe data.csv

# Unique values in a column
pd unique country data.csv

# Dimensions (rows, columns)
pd shape data.csv

# Column names
pd columns data.csv

# Column data types
pd dtypes data.csv

Aggregate commands

# Count unique values
pd value-counts city data.csv
pd value-counts department data.csv --normalize

# Group by and aggregate (comma-separated for multiple group columns)
pd groupby department data.csv --col salary --agg sum
pd groupby "city,department" data.csv --col age --agg mean
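
As a rough pandas equivalent of the aggregate commands, the sketch below uses made-up data. The table only states that pd groupby maps to df.groupby().agg(); selecting the --col column before aggregating is an assumption about how the flags are combined.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame(
    {
        "department": ["eng", "eng", "ops", "ops", "ops"],
        "salary": [100, 120, 80, 90, 70],
    }
)

# Roughly: pd groupby department data.csv --col salary --agg sum
totals = df.groupby("department")["salary"].agg("sum")

# Roughly: pd value-counts department data.csv --normalize
shares = df["department"].value_counts(normalize=True)
```

With normalize=True, value_counts returns the fraction of rows per value rather than raw counts.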

Piping

All commands support piping through stdin/stdout. When piping, you can omit the input file argument (it defaults to stdin):

cat data.csv | pd head --n 100 | pd select name,age | pd query "age > 30"

# Or chain commands directly
pd sort age data.csv --descending | pd head --n 10 | pd select name,age
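
A pipeline like the second one above reads CSV, applies each step in turn, and writes CSV again. The sketch below shows a comparable chain in pandas with made-up inline data; it illustrates the idea and is not how pandas-term is implemented.

```python
import io

import pandas as pd

# Made-up CSV text standing in for data.csv.
csv_text = "name,age,city\nAnn,35,NYC\nBob,28,LA\nCy,52,NYC\n"
df = pd.read_csv(io.StringIO(csv_text))

# Roughly: pd sort age data.csv --descending | pd head --n 2 | pd select name,age
top = df.sort_values("age", ascending=False).head(2)[["name", "age"]]

# CSV text such as a final command might write to stdout.
out = top.to_csv(index=False)
```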

Output Formats

Stdout

For stdout, use -f/--format to specify the output format (default: csv):

pd head --n 9 data.csv -f json
pd head --n 9 data.csv -f tsv
pd head --n 9 data.csv -f md
pd query "age > 29" data.csv --format json | jq '.[] | .name'

Supported stdout formats: csv, tsv, json, markdown (md)

The --json/-j flag is shorthand for --format json:

pd head --n 9 data.csv --json
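
The jq filter '.[] | .name' in the example above suggests that JSON output is an array with one object per row. A minimal sketch of that shape, assuming it matches pandas' "records" orientation (the exact orientation pandas-term uses is not confirmed here):

```python
import io
import json

import pandas as pd

# Made-up CSV text standing in for data.csv.
df = pd.read_csv(io.StringIO("name,age\nAnn,35\nBob,28\nCy,52\n"))

# One JSON object per row, collected in an array.
records_json = df.query("age > 29").to_json(orient="records")

# Equivalent of jq '.[] | .name': take the name field of each element.
names = [row["name"] for row in json.loads(records_json)]
```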

File

When writing to a file with -o, the format is determined by the file extension:

pd select name,age data.csv -o output.xlsx
pd query "age > 30" data.json -o filtered.parquet

Supported file formats: csv, tsv, xlsx, json, parquet, markdown (md)

For any other extension, use shell redirection:

pd select name,age data.csv -f csv > output.txt
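
Choosing a format from the file extension can be sketched as a simple lookup. The mapping and the format_for helper below are hypothetical, built only from the list of supported file formats above. Treating the extension case-insensitively is an assumption, and what pandas-term does with an unrecognised extension is not specified here.

```python
from pathlib import Path

# Hypothetical lookup built from the supported file formats listed above.
EXTENSION_FORMATS = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".xlsx": "xlsx",
    ".json": "json",
    ".parquet": "parquet",
    ".md": "markdown",
}


def format_for(path):
    """Return the format name for an output path, or None if unrecognised."""
    return EXTENSION_FORMATS.get(Path(path).suffix.lower())


xlsx_format = format_for("output.xlsx")
unknown_format = format_for("output.txt")  # None: fall back to -f with redirection
```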

Development

Requires uv.

Create a virtual environment and install dependencies:

uv sync

Dev commands

Command          Description
make format      Format code with ruff
make lint        Run linting checks (ruff + type checking)
make test        Run pytest tests
make check       Format, lint, and run tests
make coverage    Run tests with coverage report
