Excel CLI Toolkit

Command-line toolkit for Excel data manipulation and analysis.

Python 3.14+ License: MIT

Overview

Excel CLI Toolkit (xl) is a command-line interface for data wrangling and analysis on Excel files. Designed for both human users and AI systems, it provides fast, predictable operations that require no scripts or programming.

Features

  • Filter & Search: Filter rows by conditions, search for values
  • Sort & Group: Sort by columns, group and aggregate data
  • Transform: Apply transformations, clean data, deduplicate
  • Multi-format: Support for XLSX, CSV, JSON, and Parquet
  • Functional Programming: Built with Result/Maybe types for robust error handling
  • AI-Friendly: Simple, composable commands suited to AI automation
  • Fast: Efficient processing of large files

Installation

Using pip

pip install excel-toolkit-cwd

Using uv (recommended)

uv pip install excel-toolkit-cwd

Development installation

git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit
uv pip install -e ".[dev]"

With Parquet support

pip install "excel-toolkit-cwd[parquet]"

Quick Start

Basic filtering

# Filter rows where Amount > 1000
xl filter sales.xlsx --where "Amount > 1000" --output filtered.xlsx

# Filter by multiple conditions
xl filter data.xlsx --where "Region == 'North' and Price > 100" -o result.xlsx
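
To make the meaning of the two conditions above concrete, here is a minimal plain-Python sketch of the same row selection. It is an illustration only, not the toolkit's implementation: the `filter_rows` helper and the sample rows are hypothetical, and how `xl` actually parses `--where` expressions is not shown here.

```python
# Hypothetical sample data standing in for the rows of a worksheet.
rows = [
    {"Region": "North", "Price": 150, "Amount": 1200},
    {"Region": "South", "Price": 90, "Amount": 800},
    {"Region": "North", "Price": 80, "Amount": 2500},
]


def filter_rows(rows, predicate):
    """Return the rows for which predicate(row) is true, in original order."""
    return [row for row in rows if predicate(row)]


# Same intent as: --where "Amount > 1000"
high_value = filter_rows(rows, lambda r: r["Amount"] > 1000)

# Same intent as: --where "Region == 'North' and Price > 100"
north_pricey = filter_rows(
    rows, lambda r: r["Region"] == "North" and r["Price"] > 100
)

print(len(high_value), len(north_pricey))
```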

Sorting and aggregation

# Sort by column
xl sort data.xlsx --by "Date" --descending --output sorted.xlsx

# Group and aggregate
xl group sales.xlsx --by "Region" --aggregate "Amount:sum" --output grouped.xlsx
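
As a rough picture of what the sort and the `Amount:sum` aggregation compute, the sketch below does the same two steps on a list of dictionaries. The helper names `sort_rows` and `group_sum` and the sample data are assumptions made for illustration; they are not part of the toolkit.

```python
from collections import defaultdict

# Hypothetical sample data; dates are ISO strings, so they sort correctly as text.
sales = [
    {"Region": "North", "Date": "2024-01-05", "Amount": 1200},
    {"Region": "South", "Date": "2024-02-10", "Amount": 800},
    {"Region": "North", "Date": "2024-03-01", "Amount": 300},
]


def sort_rows(rows, by, descending=False):
    """Return a new list of rows ordered by the given column."""
    return sorted(rows, key=lambda r: r[by], reverse=descending)


def group_sum(rows, by, column):
    """Sum `column` for each distinct value of the `by` column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[column]
    return dict(totals)


newest_first = sort_rows(sales, "Date", descending=True)
totals = group_sum(sales, "Region", "Amount")

print(newest_first[0]["Date"], totals)
```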

Data cleaning

# Remove duplicates
xl dedupe data.xlsx --by "Email" --output unique.xlsx

# Clean whitespace and standardize
xl clean data.xlsx --trim --lowercase --columns "Name,Email" --output clean.xlsx
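
The sketch below shows one plausible reading of these two operations in plain Python. It assumes that cleaning applies only to the named columns and that deduplication keeps the first row seen for each key; neither detail is confirmed by this README, and `clean_rows` and `dedupe_rows` are hypothetical helpers.

```python
def clean_rows(rows, columns):
    """Strip surrounding whitespace and lowercase string values in `columns`."""
    return [
        {
            key: value.strip().lower()
            if key in columns and isinstance(value, str)
            else value
            for key, value in row.items()
        }
        for row in rows
    ]


def dedupe_rows(rows, by):
    """Keep the first row seen for each value of the `by` column (an assumption)."""
    seen = set()
    unique = []
    for row in rows:
        key = row[by]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data with inconsistent spacing and case.
contacts = [
    {"Name": "  Ann Lee ", "Email": " Ann@Example.com "},
    {"Name": "ann lee", "Email": "ann@example.com"},
    {"Name": "Bob Roy", "Email": "bob@example.com"},
]

cleaned = clean_rows(contacts, columns={"Name", "Email"})
unique = dedupe_rows(cleaned, by="Email")

print(len(unique))
```

Cleaning before deduplicating matters in this sketch: the first two rows only compare equal on `Email` once whitespace and case have been normalized.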

File conversion

# Convert Excel to CSV
xl convert data.xlsx --output data.csv

# Convert CSV to Excel
xl convert data.csv --output data.xlsx

Pipeline operations

# Chain operations
xl filter sales.xlsx --where "Amount > 1000" | \
  xl sort --by "Date" --descending | \
  xl group --by "Region" --aggregate "Amount:sum" \
  --output final.xlsx

Documentation

For detailed documentation, see the docs/ directory.

Development

Setup development environment

# Clone repository
git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit

# Install dependencies
uv sync --all-extras

# Install pre-commit hooks
pre-commit install

Running tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=excel_toolkit

# Run specific test file
uv run pytest tests/unit/test_filtering.py

Code quality

# Linting
uv run ruff check .

# Formatting
uv run ruff format .

# Type checking
uv run mypy excel_toolkit/

Command Reference

Core Commands

  • xl filter - Filter rows based on conditions
  • xl select - Select specific columns
  • xl sort - Sort by column(s)
  • xl group - Group and aggregate data
  • xl join - Join multiple files
  • xl clean - Clean data (trim, case, etc.)
  • xl dedupe - Remove duplicates
  • xl transform - Apply transformations
  • xl convert - Convert between formats
  • xl merge - Merge multiple files
  • xl stats - Calculate statistics
  • xl info - Display file information
  • xl validate - Validate data

Getting help

# General help
xl --help

# Command-specific help
xl filter --help

Examples

Data Analysis Workflow

# Extract sales data for Q1, filter high-value orders, group by region
xl filter sales.xlsx --where "Date >= '2024-01-01' and Date <= '2024-03-31'" | \
  xl filter --where "Amount > 1000" | \
  xl group --by "Region" --aggregate "Amount:sum,Orders:count" \
  --output q1_high_value_by_region.xlsx

Data Cleaning Pipeline

# Clean messy CSV file
xl clean messy_data.csv \
  --trim \
  --lowercase \
  --columns "email,name" \
  --output cleaned.csv

# Remove duplicates and validate
xl dedupe cleaned.csv --by "email" --output unique.csv
xl validate unique.csv --columns "email:email,age:int:0-120" --output final.csv
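
The rule string `email:email,age:int:0-120` is not explained in this README. The sketch below shows one possible interpretation, assuming each comma-separated item has the form `column:type[:min-max]`. The parser, the simplified email pattern, and the helper names are all hypothetical and may differ from what `xl validate` actually does.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def parse_rules(spec):
    """Parse 'column:type[:min-max]' items into {column: (type, bounds)}."""
    rules = {}
    for item in spec.split(","):
        column, kind, *rest = item.split(":")
        bounds = tuple(int(n) for n in rest[0].split("-")) if rest else None
        rules[column] = (kind, bounds)
    return rules


def check_value(value, kind, bounds):
    """Return True if the string `value` satisfies the rule."""
    if kind == "email":
        return bool(EMAIL_RE.match(value))
    if kind == "int":
        try:
            number = int(value)
        except ValueError:
            return False
        return bounds is None or bounds[0] <= number <= bounds[1]
    raise ValueError(f"unknown rule type: {kind}")


def invalid_columns(row, rules):
    """List the columns of `row` whose values break their rule."""
    return [
        column
        for column, (kind, bounds) in rules.items()
        if not check_value(row[column], kind, bounds)
    ]


rules = parse_rules("email:email,age:int:0-120")
print(invalid_columns({"email": "ann@example.com", "age": "130"}, rules))
```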

Architecture

Built with a functional programming approach:

  • Result types: Explicit error handling without exceptions
  • Maybe types: Safe handling of optional values
  • Immutable configuration: Predictable behavior
  • Composable operations: Chain commands efficiently

For more details, see FUNCTIONAL_ANALYSIS.md.
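
For readers unfamiliar with Result types, here is a minimal, generic sketch of the pattern in Python. It is not the toolkit's code: the `Ok`, `Err`, and `and_then` names and the two example functions are assumptions chosen for illustration, and the project may use a library or a different design.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    """A successful outcome carrying a value."""
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    """A failed outcome carrying an error description."""
    error: E


Result = Union[Ok[T], Err[E]]


def and_then(result, fn):
    """Apply fn to the value of an Ok; pass an Err through untouched."""
    return fn(result.value) if isinstance(result, Ok) else result


def parse_amount(text: str) -> Result[float, str]:
    """Convert text to a float, returning Err instead of raising."""
    try:
        return Ok(float(text))
    except ValueError:
        return Err(f"not a number: {text!r}")


def require_positive(amount: float) -> Result[float, str]:
    """Accept only amounts greater than zero."""
    if amount > 0:
        return Ok(amount)
    return Err(f"amount must be positive: {amount}")


good = and_then(parse_amount("1200.50"), require_positive)
bad = and_then(parse_amount("abc"), require_positive)

print(good, bad)
```

The caller inspects the returned value rather than catching exceptions, and the first failure short-circuits the rest of the chain. A Maybe type follows the same shape, with an empty case in place of `Err`.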

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with:
