Tabular manipulation on the command line

Tabula

Process tabular data on the command line

Overview

Tabula provides a chain-based syntax for data manipulation. Methods are chained with dot notation, as in method1().method2().method3(), so you can select columns, filter rows, transform values, and aggregate results in a single command.
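To make the chain syntax concrete, here is a minimal Python sketch of how such an expression could be split into individual method calls. The helper `split_chain` is hypothetical and is not Tabula's actual parser; it only assumes that a dot separates methods when it sits outside parentheses and quotes.

```python
def split_chain(expr: str) -> list[tuple[str, str]]:
    """Split 'a(x).b(y, z)' into [('a', 'x'), ('b', 'y, z')].

    Hypothetical helper for illustration; not Tabula's actual parser.
    A dot counts as a separator only outside parentheses and quotes.
    """
    calls, depth, quote, start = [], 0, None, 0
    for i, ch in enumerate(expr):
        if quote:
            if ch == quote:      # closing quote
                quote = None
        elif ch in "'\"":        # opening quote
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "." and depth == 0:
            calls.append(expr[start:i])
            start = i + 1
    calls.append(expr[start:])

    result = []
    for call in calls:
        name, _, rest = call.strip().partition("(")
        # Drop the final closing parenthesis to recover the raw argument text.
        result.append((name, rest.rsplit(")", 1)[0].strip()))
    return result


steps = split_chain("where(age > 30).select(name, salary).sortby(salary)")
```

Tracking parenthesis depth keeps a decimal point inside an argument, such as `where(salary > 40000.5)`, from being mistaken for a method separator.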

Installation

Install Tabula using pip:

pip install tabula-cli

Data Selection Methods

select(col1, col2, ...)

Select specific columns from the dataset.

# Select single column
tabula "select(name)" data.csv

# Select multiple columns
tabula "select(name, age, salary)" data.csv

Data Transformation Methods

upper(col)

Convert text in the specified column to uppercase.

tabula "select(name).upper(name)" data.csv

lower(col)

Convert text in the specified column to lowercase.

tabula "select(name).lower(name)" data.csv

strlen(col)

Calculate the length of each string in the specified column.

tabula "select(name).strlen(name)" data.csv

round(col, decimals)

Round numeric values in the specified column to the given number of decimal places.

tabula "select(salary).round(salary, 2)" data.csv
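Each transformation method applies one function to every value in a single column. The Python sketch below illustrates that idea with a hypothetical `map_column` helper and made-up sample values. Whether Tabula replaces the column in place or adds a new one is not stated here; this sketch replaces it in place.

```python
# Made-up sample rows; CSV values arrive as text.
rows = [
    {"name": "Alice", "salary": "50000.456"},
    {"name": "Bob", "salary": "60000.1"},
]


def map_column(rows, col, func):
    """Return new rows with func applied to one column (hypothetical helper)."""
    return [{**row, col: func(row[col])} for row in rows]


upper_rows = map_column(rows, "name", str.upper)    # like upper(name)
length_rows = map_column(rows, "name", len)         # like strlen(name)
rounded_rows = map_column(rows, "salary", lambda v: round(float(v), 2))  # like round(salary, 2)
```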

Filtering Methods

where(condition)

Filter rows based on conditions. Supports comparison operators such as >, >=, < and ==, and the logical operators & (AND) and | (OR).

# Simple condition
tabula "where(age > 30)" data.csv

# Multiple conditions with AND
tabula "where(age > 25 & salary >= 50000)" data.csv

# Multiple conditions with OR
tabula "where(department == 'IT' | department == 'HR')" data.csv

# Complex conditions with parentheses
tabula "where((age > 30 & department == 'IT') | salary < 40000)" data.csv
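As a rough guide to what these conditions select, the following Python sketch evaluates the same three predicates against the sample rows from the Complete Example Workflow section. It uses plain Python `and`/`or` and does not reproduce Tabula's own expression parsing.

```python
rows = [
    {"name": "Alice", "age": 25, "salary": 50000, "department": "HR"},
    {"name": "Bob", "age": 30, "salary": 60000, "department": "IT"},
    {"name": "Charlie", "age": 35, "salary": 70000, "department": "Finance"},
    {"name": "David", "age": 40, "salary": 80000, "department": "IT"},
]

# age > 25 & salary >= 50000
and_match = [r["name"] for r in rows if r["age"] > 25 and r["salary"] >= 50000]

# department == 'IT' | department == 'HR'
or_match = [r["name"] for r in rows
            if r["department"] == "IT" or r["department"] == "HR"]

# (age > 30 & department == 'IT') | salary < 40000
complex_match = [r["name"] for r in rows
                 if (r["age"] > 30 and r["department"] == "IT") or r["salary"] < 40000]
```

Note that the comparisons are strict: Alice (age 25) fails `age > 25`, and Bob (age 30) fails `age > 30`.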

Data Limiting Methods

head(n)

Return the first n rows (default: 5).

tabula "head(10)" data.csv

tail(n)

Return the last n rows (default: 5).

tabula "tail(3)" data.csv
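In Python terms, head and tail behave like list slices. The sketch below is an analogy only, with hypothetical `head` and `tail` functions that use the default of 5 documented above.

```python
names = ["Alice", "Bob", "Charlie", "David"]


def head(rows, n=5):
    """First n rows; returns everything when n exceeds the row count."""
    return rows[:n]


def tail(rows, n=5):
    """Last n rows; guards n == 0 because rows[-0:] would return all rows."""
    return rows[-n:] if n > 0 else []


first_two = head(names, 2)
last_three = tail(names, 3)
```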

Sorting Methods

sortby(col, descending=False)

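Sort rows by the specified column. The order is ascending by default; pass True as the second argument to sort in descending order.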

# Ascending sort
tabula "sortby(age)" data.csv

# Descending sort
tabula "sortby(salary, True)" data.csv
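The two sort orders can be pictured with Python's built-in `sorted`. This is an analogy rather than Tabula's implementation. The sketch converts values to integers first, because values read from a CSV file are text and text sorts character by character.

```python
# Values as they would appear when read from a CSV file (all text).
rows = [
    {"name": "Alice", "age": "25", "salary": "50000"},
    {"name": "Bob", "age": "30", "salary": "60000"},
    {"name": "Charlie", "age": "35", "salary": "70000"},
    {"name": "David", "age": "40", "salary": "80000"},
]

# Like sortby(age): ascending.
by_age = sorted(rows, key=lambda r: int(r["age"]))

# Like sortby(salary, True): descending.
by_salary_desc = sorted(rows, key=lambda r: int(r["salary"]), reverse=True)

# Why the int() conversion matters: as text, "100000" sorts before "50000".
text_order = sorted(["50000", "100000"])
```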

Aggregation Methods (Terminal)

count()

Count the number of rows.

tabula "count()" data.csv
tabula "where(age > 30).count()" data.csv

min(col), max(col), sum(col)

Calculate minimum, maximum, or sum of a column.

tabula "min(age)" data.csv
tabula "max(salary)" data.csv
tabula "sum(salary)" data.csv

mean(col), median(col), mode(col)

Calculate statistical measures.

tabula "mean(salary)" data.csv
tabula "median(age)" data.csv
tabula "mode(department)" data.csv
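For reference, the same three measures can be computed with Python's standard `statistics` module on the sample data from the Complete Example Workflow section. This only shows what the measures mean; Tabula's output formatting may differ.

```python
import statistics

salaries = [50000, 60000, 70000, 80000]
ages = [25, 30, 35, 40]
departments = ["HR", "IT", "Finance", "IT"]

mean_salary = statistics.mean(salaries)        # arithmetic average
median_age = statistics.median(ages)           # midpoint of the two middle values
mode_department = statistics.mode(departments) # most frequent value
```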

std(col), var(col)

Calculate standard deviation and variance.

tabula "std(salary)" data.csv
tabula "var(age)" data.csv
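Standard deviation and variance come in two flavors: population (divide by n) and sample (divide by n - 1). This description does not say which one Tabula reports, so the Python sketch below computes both for the sample salaries; compare against Tabula's output to see which applies.

```python
import statistics

salaries = [50000, 60000, 70000, 80000]

population_var = statistics.pvariance(salaries)  # sum of squared deviations / n
sample_var = statistics.variance(salaries)       # sum of squared deviations / (n - 1)
population_std = statistics.pstdev(salaries)     # square root of population_var
sample_std = statistics.stdev(salaries)          # square root of sample_var
```

On a small dataset like this one the two results differ noticeably, so it is worth checking before relying on the numbers.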

first(col), last(col)

Get first or last value from a column.

tabula "first(name)" data.csv
tabula "last(name)" data.csv

Unique Value Methods

uniq(col)

Get unique values from a column.

tabula "uniq(department)" data.csv

uniqc(col)

Count the occurrences of each unique value (equivalent to a group-by and count).

tabula "uniqc(department)" data.csv
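The difference between uniq and uniqc can be sketched in Python with the department column from the sample data. The ordering shown (first appearance) is an assumption of this sketch; Tabula's output order is not specified here.

```python
from collections import Counter

departments = ["HR", "IT", "Finance", "IT"]

# Like uniq(department): each distinct value once, in order of first appearance.
unique_values = list(dict.fromkeys(departments))

# Like uniqc(department): each distinct value with its number of occurrences.
value_counts = Counter(departments)
```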

String Methods

strjoin(col, separator)

Join all values in a column with a separator.

tabula "strjoin(name, ', ')" data.csv

Utility Methods

columns()

List all column names.

tabula "columns()" data.csv

Complete Example Workflow

# Sample data.csv:
# name,age,salary,department
# Alice,25,50000,HR
# Bob,30,60000,IT
# Charlie,35,70000,Finance
# David,40,80000,IT

# Complex analysis: Find IT employees over 30, show their names and salaries, sorted by salary
tabula "where(department == 'IT' & age > 30).select(name, salary).sortby(salary)" data.csv

# Output (Bob is exactly 30, so age > 30 excludes him):
# name,salary
# David,80000
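The same filter, select, and sort steps can be traced in plain Python to check which rows survive. This is an illustrative sketch using only the standard library, not a description of how Tabula works internally. Because the comparison `age > 30` is strict, Bob (age 30) is filtered out and only David remains.

```python
import csv
import io

SAMPLE = """\
name,age,salary,department
Alice,25,50000,HR
Bob,30,60000,IT
Charlie,35,70000,Finance
David,40,80000,IT
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# where(department == 'IT' & age > 30)
filtered = [r for r in rows if r["department"] == "IT" and int(r["age"]) > 30]

# select(name, salary)
projected = [{"name": r["name"], "salary": int(r["salary"])} for r in filtered]

# sortby(salary)
result = sorted(projected, key=lambda r: r["salary"])
```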

Method Chaining Rules

  1. Terminal Methods: Methods like count(), sum(), min(), max() must be the last in the chain
  2. Column Selection: Use select() before applying column-specific operations
  3. Filtering: where() conditions support parentheses for complex logic
  4. String Operations: Methods like upper(), lower(), strlen() work on text columns
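Rule 1 can be expressed as a simple check over the method names in a chain. The Python sketch below is hypothetical: `check_chain` is not part of Tabula, and the set of terminal methods is taken from the Aggregation Methods (Terminal) section above.

```python
# Methods listed under "Aggregation Methods (Terminal)".
TERMINAL = {
    "count", "min", "max", "sum", "mean", "median",
    "mode", "std", "var", "first", "last",
}


def check_chain(methods):
    """Return an error message, or None if every terminal method comes last."""
    last_index = len(methods) - 1
    for position, name in enumerate(methods):
        if name in TERMINAL and position != last_index:
            return f"{name}() is terminal and must be last in the chain"
    return None


ok = check_chain(["where", "count"])     # valid: count() is last
bad = check_chain(["count", "select"])   # invalid: count() is followed by select()
```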

Output Formats

Use the -o flag (shown below in its long form, --outtype) to specify the output format:

  • --outtype polars: Default table format
  • --outtype csv: CSV format
  • --outtype tsv: Tab-separated values

tabula "select(name, age)"
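The CSV and TSV formats differ only in the field delimiter. The Python sketch below renders the same two rows both ways with the standard `csv` module; the `render` helper is hypothetical and says nothing about how Tabula produces its output.

```python
import csv
import io

rows = [
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 30},
]


def render(rows, delimiter):
    """Serialize rows as delimited text with a header line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0]),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = render(rows, ",")    # comma-separated
tsv_text = render(rows, "\t")   # tab-separated
```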
