Excel CLI Toolkit

Command-line toolkit for Excel data manipulation and analysis.

Python 3.14+ License: MIT

Overview

Excel CLI Toolkit (xl) is a command-line interface for data wrangling and analysis on Excel files. Designed for both human users and AI systems, it provides fast, predictable operations that require no scripting or programming.

Features

  • Filter & Search: Filter rows by conditions, search for values
  • Sort & Group: Sort by columns, group and aggregate data
  • Transform: Apply transformations, clean data, deduplicate
  • Multi-format: Support for XLSX, CSV, JSON, and Parquet
  • Functional Programming: Built with Result/Maybe types for robust error handling
  • AI-Friendly: Simple, composable commands well suited to AI automation
  • Fast: Efficient processing of large files

Installation

Using pip

pip install excel-toolkit-cwd

Using uv (recommended)

uv pip install excel-toolkit-cwd

Development installation

git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit
uv pip install -e ".[dev]"

With Parquet support

pip install "excel-toolkit-cwd[parquet]"

Quick Start

Basic filtering

# Filter rows where Amount > 1000
xl filter sales.xlsx --where "Amount > 1000" --output filtered.xlsx

# Filter by multiple conditions
xl filter data.xlsx --where "Region == 'North' and Price > 100" -o result.xlsx
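
For readers who want to reason about what a --where condition selects, the following is a minimal Python sketch of the same two filters applied to in-memory rows. It is an illustration only: the README does not describe how xl evaluates conditions, and the sample rows and the filter_rows helper are hypothetical.

```python
# Hypothetical sample rows; not taken from any real workbook.
rows = [
    {"Region": "North", "Price": 120, "Amount": 1500},
    {"Region": "South", "Price": 80, "Amount": 2500},
    {"Region": "North", "Price": 90, "Amount": 400},
]


def filter_rows(rows, predicate):
    """Return the rows for which predicate(row) is true, preserving order."""
    return [row for row in rows if predicate(row)]


# Counterpart of: --where "Amount > 1000"
high_value = filter_rows(rows, lambda r: r["Amount"] > 1000)

# Counterpart of: --where "Region == 'North' and Price > 100"
north_premium = filter_rows(
    rows, lambda r: r["Region"] == "North" and r["Price"] > 100
)
```

With these rows, high_value keeps the first two rows and north_premium keeps only the first.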

Sorting and aggregation

# Sort by column
xl sort data.xlsx --by "Date" --descending --output sorted.xlsx

# Group and aggregate
xl group sales.xlsx --by "Region" --aggregate "Amount:sum" --output grouped.xlsx
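
The "Column:function" form passed to --aggregate can be read as "apply this function to this column within each group". The sketch below shows one way such a spec might be interpreted, using only the Python standard library. The AGGREGATORS table, the helper names, and the sample data are assumptions made for illustration; the README itself only shows sum and count.

```python
from collections import defaultdict

# Hypothetical set of aggregate names; the README only demonstrates sum and count.
AGGREGATORS = {"sum": sum, "count": len, "min": min, "max": max}


def parse_aggregate(spec):
    """Split a spec such as "Amount:sum" into (column, function)."""
    column, func_name = spec.split(":", 1)
    return column, AGGREGATORS[func_name]


def group_aggregate(rows, by, spec):
    """Group rows by one column and reduce another column per group."""
    column, func = parse_aggregate(spec)
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[by]].append(row[column])
    return {key: func(values) for key, values in buckets.items()}


# Hypothetical sample data.
sales = [
    {"Date": "2024-01-05", "Region": "North", "Amount": 1200},
    {"Date": "2024-02-10", "Region": "South", "Amount": 800},
    {"Date": "2024-03-01", "Region": "North", "Amount": 300},
]

# Counterpart of: --by "Date" --descending (ISO dates sort correctly as text)
by_date_desc = sorted(sales, key=lambda r: r["Date"], reverse=True)

# Counterpart of: --by "Region" --aggregate "Amount:sum"
totals = group_aggregate(sales, "Region", "Amount:sum")
```

Here totals maps "North" to 1500 and "South" to 800.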

Data cleaning

# Remove duplicates
xl dedupe data.xlsx --by "Email" --output unique.xlsx

# Clean whitespace and standardize
xl clean data.xlsx --trim --lowercase --columns "Name,Email" --output clean.xlsx
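
The two cleaning commands interact: values that differ only in case or surrounding whitespace count as distinct until they are normalized. The following Python sketch illustrates that effect. It assumes that deduplication keeps the first row seen for each key, which the README does not state, and the helper names and sample contacts are hypothetical.

```python
def clean(rows, columns, trim=True, lowercase=True):
    """Return copies of rows with the given string columns normalized."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for column in columns:
            value = new_row[column]
            if isinstance(value, str):
                if trim:
                    value = value.strip()
                if lowercase:
                    value = value.lower()
                new_row[column] = value
        cleaned.append(new_row)
    return cleaned


def dedupe(rows, by):
    """Keep the first row seen for each value of `by` (assumed policy)."""
    seen = set()
    unique = []
    for row in rows:
        key = row[by]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data.
contacts = [
    {"Name": "  Ada Lovelace ", "Email": "ADA@example.com"},
    {"Name": "ada lovelace", "Email": " ada@example.com "},
    {"Name": "Alan Turing", "Email": "alan@example.com"},
]

raw_unique = dedupe(contacts, by="Email")
clean_unique = dedupe(clean(contacts, ["Name", "Email"]), by="Email")
```

Deduplicating the raw rows keeps all three, while cleaning first collapses the two Ada entries into one, so running clean before dedupe is usually the safer order.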

File conversion

# Convert Excel to CSV
xl convert data.xlsx --output data.csv

# Convert CSV to Excel
xl convert data.csv --output data.xlsx
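
One point worth keeping in mind when converting between formats is that CSV carries no type information, so numbers come back as text unless the tool infers types. The sketch below illustrates this with a CSV to JSON round trip using only the Python standard library. It is not how xl convert is implemented; Excel files need a third-party reader, so the example sticks to two of the text formats the README lists, and both function names are hypothetical.

```python
import csv
import io
import json


def csv_text_to_json(csv_text):
    """Parse CSV text into a JSON array of records (all values stay strings)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


def json_to_csv_text(json_text):
    """Write a JSON array of flat records back out as CSV text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = "Region,Amount\nNorth,1500\nSouth,800\n"
as_json = csv_text_to_json(csv_text)
round_tripped = json_to_csv_text(as_json)
```

The round trip reproduces the original text, but in the JSON form Amount is the string "1500" rather than a number.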

Pipeline operations

# Chain operations
xl filter sales.xlsx --where "Amount > 1000" | \
  xl sort --by "Date" --descending | \
  xl group --by "Region" --aggregate "Amount:sum" \
  --output final.xlsx
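
Conceptually, a pipeline like the one above is function composition: each stage takes rows and hands its result to the next. The Python sketch below models that idea with plain functions. It says nothing about the format xl actually streams between processes, which the README does not specify, and every name in it is hypothetical.

```python
from functools import reduce


def pipeline(*steps):
    """Compose steps left to right, like commands joined with `|`."""
    return lambda rows: reduce(lambda acc, step: step(acc), steps, rows)


def where(predicate):
    return lambda rows: [r for r in rows if predicate(r)]


def sort_by(column, descending=False):
    return lambda rows: sorted(rows, key=lambda r: r[column], reverse=descending)


def sum_by(group_column, value_column):
    def step(rows):
        totals = {}
        for r in rows:
            totals[r[group_column]] = totals.get(r[group_column], 0) + r[value_column]
        return totals

    return step


# Hypothetical sample data.
sales = [
    {"Date": "2024-01-05", "Region": "North", "Amount": 1200},
    {"Date": "2024-02-10", "Region": "South", "Amount": 800},
    {"Date": "2024-03-01", "Region": "North", "Amount": 2300},
    {"Date": "2024-03-15", "Region": "South", "Amount": 1100},
]

run = pipeline(
    where(lambda r: r["Amount"] > 1000),
    sort_by("Date", descending=True),
    sum_by("Region", "Amount"),
)
result = run(sales)
```

With this data, result is {"South": 1100, "North": 3500}: the 800 order is filtered out before the totals are computed.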

Documentation

For detailed documentation, see the docs/ directory.

Development

Setup development environment

# Clone repository
git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit

# Install dependencies
uv sync --all-extras

# Install pre-commit hooks
pre-commit install

Running tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=excel_toolkit

# Run specific test file
uv run pytest tests/unit/test_filtering.py

Code quality

# Linting
uv run ruff check .

# Formatting
uv run ruff format .

# Type checking
uv run mypy excel_toolkit/

Command Reference

Core Commands

  • xl filter - Filter rows based on conditions
  • xl select - Select specific columns
  • xl sort - Sort by column(s)
  • xl group - Group and aggregate data
  • xl join - Join multiple files
  • xl clean - Clean data (trim, case, etc.)
  • xl dedupe - Remove duplicates
  • xl transform - Apply transformations
  • xl convert - Convert between formats
  • xl merge - Merge multiple files
  • xl stats - Calculate statistics
  • xl info - Display file information
  • xl validate - Validate data

Getting help

# General help
xl --help

# Command-specific help
xl filter --help
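
Because every command is a single process invocation, xl can also be driven from a script or an automation agent. The following is a minimal sketch of doing so from Python with the standard subprocess module. It uses only flags that appear in this README (--where, --output, --help); the two helper functions are hypothetical, and the call is skipped when no xl executable is on the PATH.

```python
import shutil
import subprocess


def build_filter_command(input_path, where, output_path):
    """Build the argument list for an `xl filter` call as shown in Quick Start."""
    return ["xl", "filter", input_path, "--where", where, "--output", output_path]


def run_if_available(argv):
    """Run argv and return the CompletedProcess, or None if the program is missing."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, capture_output=True, text=True, check=False)


command = build_filter_command("sales.xlsx", "Amount > 1000", "filtered.xlsx")

# Returns None on machines where xl is not installed.
help_result = run_if_available(["xl", "--help"])
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems with conditions such as "Region == 'North' and Price > 100".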

Examples

Data Analysis Workflow

# Extract sales data for Q1, filter high-value orders, group by region
xl filter sales.xlsx --where "Date >= '2024-01-01' and Date <= '2024-03-31'" | \
  xl filter --where "Amount > 1000" | \
  xl group --by "Region" --aggregate "Amount:sum,Orders:count" \
  --output q1_high_value_by_region.xlsx
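
To make the boundaries of this workflow concrete, here is a small Python sketch of the same three steps on in-memory rows. It assumes both date bounds are inclusive and that "Orders:count" counts rows per group; neither is confirmed by the README, and the sample orders and the output key names are hypothetical.

```python
from datetime import date


def in_range(row, start, end):
    """True when the row's ISO-formatted Date falls within [start, end]."""
    return start <= date.fromisoformat(row["Date"]) <= end


# Hypothetical sample data, including rows just outside the quarter.
orders = [
    {"Date": "2023-12-30", "Region": "North", "Amount": 5000},
    {"Date": "2024-01-15", "Region": "North", "Amount": 1800},
    {"Date": "2024-02-02", "Region": "North", "Amount": 400},
    {"Date": "2024-03-31", "Region": "South", "Amount": 2600},
    {"Date": "2024-04-01", "Region": "South", "Amount": 3000},
]

# Step 1: keep Q1 2024 only.
q1 = [r for r in orders if in_range(r, date(2024, 1, 1), date(2024, 3, 31))]

# Step 2: keep high-value orders.
high_value = [r for r in q1 if r["Amount"] > 1000]

# Step 3: total amount and number of orders per region.
summary = {}
for r in high_value:
    entry = summary.setdefault(r["Region"], {"Amount_sum": 0, "Orders_count": 0})
    entry["Amount_sum"] += r["Amount"]
    entry["Orders_count"] += 1
```

Only the 2024-01-15 and 2024-03-31 orders survive both filters, so each region ends up with one order in the summary.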

Data Cleaning Pipeline

# Clean messy CSV file
xl clean messy_data.csv \
  --trim \
  --lowercase \
  --columns "email,name" \
  --output cleaned.csv

# Remove duplicates and validate
xl dedupe cleaned.csv --by "email" --output unique.csv
xl validate unique.csv --columns "email:email,age:int:0-120" --output final.csv
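
The rule string passed to --columns above appears to follow a "column:type" or "column:type:min-max" pattern. The sketch below shows one plausible reading of that syntax in Python. The grammar, the deliberately loose email pattern, and all function names are assumptions for illustration, not the tool's documented behavior.

```python
import re

# Deliberately loose check: something@something.something, no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def parse_rules(spec):
    """Parse "email:email,age:int:0-120" into {column: (kind, bounds)}."""
    rules = {}
    for part in spec.split(","):
        column, kind, *rest = part.split(":")
        bounds = None
        if rest:
            low, high = rest[0].split("-", 1)  # non-negative bounds only
            bounds = (int(low), int(high))
        rules[column] = (kind, bounds)
    return rules


def is_valid(value, kind, bounds):
    """Check one cell value (as text) against one rule."""
    if kind == "email":
        return bool(EMAIL_PATTERN.match(value))
    if kind == "int":
        try:
            number = int(value)
        except ValueError:
            return False
        return bounds is None or bounds[0] <= number <= bounds[1]
    raise ValueError(f"unknown rule kind: {kind}")


def validate_row(row, rules):
    """Return the names of the columns in row that fail their rule."""
    return [
        column
        for column, (kind, bounds) in rules.items()
        if not is_valid(row[column], kind, bounds)
    ]


rules = parse_rules("email:email,age:int:0-120")
ok_errors = validate_row({"email": "ada@example.com", "age": "36"}, rules)
bad_errors = validate_row({"email": "not-an-email", "age": "150"}, rules)
```

The first row passes both rules, while the second fails on both email and age.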

Architecture

Built with a functional programming approach:

  • Result types: Explicit error handling without exceptions
  • Maybe types: Safe handling of optional values
  • Immutable configuration: Predictable behavior
  • Composable operations: Chain commands efficiently

For more details, see FUNCTIONAL_ANALYSIS.md.
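
As a rough illustration of the Result style listed above, the following Python sketch models success and failure as immutable values and chains two steps without raising exceptions. It is a generic example of the pattern, not code from this project: Ok, Err, and_then, and the two sample steps are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    """A successful outcome carrying a value."""

    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    """A failed outcome carrying an error description."""

    error: E


Result = Union[Ok[T], Err[E]]


def and_then(result, func):
    """Apply func to the value of an Ok; pass an Err through unchanged."""
    return func(result.value) if isinstance(result, Ok) else result


def parse_amount(text: str) -> Result[float, str]:
    try:
        return Ok(float(text))
    except ValueError:
        return Err(f"not a number: {text!r}")


def require_positive(amount: float) -> Result[float, str]:
    return Ok(amount) if amount > 0 else Err(f"must be positive: {amount}")


good = and_then(parse_amount("1250.50"), require_positive)
bad = and_then(parse_amount("n/a"), require_positive)
```

Here good is Ok(1250.5) and bad is the Err produced by parse_amount, so the second step never runs on invalid input. A Maybe type follows the same shape with an empty case in place of Err.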

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with:
