Excel CLI Toolkit

Command-line toolkit for Excel data manipulation and analysis.

Python 3.14+ | License: MIT

Overview

Excel CLI Toolkit (xl) is a command-line interface for data wrangling and analysis on Excel files. Designed for both human users and AI systems, it provides fast, predictable operations without requiring scripts or programming.

Features

  • Filter & Search: Filter rows by conditions, search for values
  • Sort & Group: Sort by columns, group and aggregate data
  • Transform: Apply transformations, clean data, deduplicate
  • Multi-format: Support for XLSX, CSV, JSON, and Parquet
  • Functional Programming: Built with Result/Maybe types for robust error handling
  • AI-Friendly: Simple, composable commands perfect for AI automation
  • Fast: Efficient processing of large files

Installation

Using pip

pip install excel-toolkit-cwd

Using uv (recommended)

uv pip install excel-toolkit-cwd

Development installation

git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit
uv pip install -e ".[dev]"

With Parquet support

pip install "excel-toolkit-cwd[parquet]"

Quick Start

Basic filtering

# Filter rows where Amount > 1000
xl filter sales.xlsx --where "Amount > 1000" --output filtered.xlsx

# Filter by multiple conditions
xl filter data.xlsx --where "Region == 'North' and Price > 100" -o result.xlsx
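
To make the row-level meaning of the two conditions above concrete, here is a minimal pure-Python sketch. It is not the toolkit's implementation: the `filter_rows` helper and the sample rows are hypothetical, and the lambdas stand in for the `--where` expressions.

```python
# Illustrative only: filter_rows and the sample data are assumptions,
# not part of the xl codebase.
rows = [
    {"Region": "North", "Price": 150, "Amount": 1200},
    {"Region": "South", "Price": 90, "Amount": 800},
    {"Region": "North", "Price": 80, "Amount": 2500},
]


def filter_rows(rows, predicate):
    """Return only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


# Counterpart of: --where "Amount > 1000"
high_value = filter_rows(rows, lambda r: r["Amount"] > 1000)

# Counterpart of: --where "Region == 'North' and Price > 100"
north_premium = filter_rows(
    rows, lambda r: r["Region"] == "North" and r["Price"] > 100
)
```

Both conditions in the second filter must hold for a row to be kept, so only the first sample row survives it.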

Sorting and aggregation

# Sort by column
xl sort data.xlsx --by "Date" --descending --output sorted.xlsx

# Group and aggregate
xl group sales.xlsx --by "Region" --aggregate "Amount:sum" --output grouped.xlsx
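
The `column:function` notation passed to `--aggregate` can be read as a list of (column, function) pairs. The sketch below shows one way such a spec could be parsed and applied in plain Python. The helper names, the `Amount_sum` style of output column naming, and the restriction to `sum` and `count` are assumptions made for illustration, not documented behavior of xl.

```python
from collections import defaultdict

# Only the two functions that appear in this README; anything else is
# outside the scope of this sketch.
AGGREGATORS = {"sum": sum, "count": len}


def parse_aggregate(spec):
    """Split 'Amount:sum,Orders:count' into [('Amount', 'sum'), ('Orders', 'count')]."""
    pairs = []
    for item in spec.split(","):
        column, func = item.strip().split(":")
        pairs.append((column, func))
    return pairs


def group_aggregate(rows, by, spec):
    """Group rows by one column and apply each requested aggregate per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row)

    result = []
    for key, members in groups.items():
        out = {by: key}
        for column, func in parse_aggregate(spec):
            values = [member[column] for member in members]
            # Hypothetical output naming: <column>_<function>
            out[f"{column}_{func}"] = AGGREGATORS[func](values)
        result.append(out)
    return result


sales = [
    {"Region": "North", "Amount": 1200, "Orders": "A-1"},
    {"Region": "South", "Amount": 800, "Orders": "A-2"},
    {"Region": "North", "Amount": 2500, "Orders": "A-3"},
]
grouped = group_aggregate(sales, "Region", "Amount:sum,Orders:count")
```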

Data cleaning

# Remove duplicates
xl dedupe data.xlsx --by "Email" --output unique.xlsx

# Clean whitespace and standardize
xl clean data.xlsx --trim --lowercase --columns "Name,Email" --output clean.xlsx
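
Cleaning before deduplicating matters because values such as " Alice@Example.com" and "alice@example.com" only compare equal after trimming and lowercasing. The following sketch illustrates that interaction in plain Python. The `clean` and `dedupe` functions are hypothetical stand-ins, and keeping the first occurrence of each key is an assumption about what `xl dedupe` does.

```python
def clean(rows, columns, trim=True, lowercase=True):
    """Return new rows with the selected string columns trimmed and/or lowercased."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for column in columns:
            value = new_row[column]
            if isinstance(value, str):
                if trim:
                    value = value.strip()
                if lowercase:
                    value = value.lower()
            new_row[column] = value
        cleaned.append(new_row)
    return cleaned


def dedupe(rows, by):
    """Keep the first row seen for each distinct value of the key column."""
    seen = set()
    unique = []
    for row in rows:
        key = row[by]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


contacts = [
    {"Name": "  Alice ", "Email": " Alice@Example.com"},
    {"Name": "alice", "Email": "alice@example.com "},
    {"Name": "Bob", "Email": "bob@example.com"},
]
cleaned_contacts = clean(contacts, columns=["Name", "Email"])
unique_contacts = dedupe(cleaned_contacts, by="Email")
```

Running `dedupe` on the raw `contacts` list would keep all three rows, since the two Alice emails differ in case and whitespace.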

File conversion

# Convert Excel to CSV
xl convert data.xlsx --output data.csv

# Convert CSV to Excel
xl convert data.csv --output data.xlsx

Pipeline operations

# Chain operations
xl filter sales.xlsx --where "Amount > 1000" | \
  xl sort --by "Date" --descending | \
  xl group --by "Region" --aggregate "Amount:sum" \
  --output final.xlsx
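
The chained commands above pass the output of each stage to the next. The same left-to-right composition can be sketched in Python with ordinary functions. This is only an analogy for how the pipe reads; the `pipeline`, `where_amount_over`, and `sort_by` helpers are hypothetical and say nothing about how xl streams data between processes.

```python
from functools import reduce


def pipeline(*stages):
    """Compose stages left to right, the way a shell pipe chains commands."""
    return lambda rows: reduce(lambda acc, stage: stage(acc), stages, rows)


def where_amount_over(threshold):
    """Stage that keeps rows whose Amount exceeds the threshold."""
    return lambda rows: [r for r in rows if r["Amount"] > threshold]


def sort_by(column, descending=False):
    """Stage that sorts rows by one column."""
    return lambda rows: sorted(rows, key=lambda r: r[column], reverse=descending)


sales = [
    {"Date": "2024-01-05", "Amount": 500},
    {"Date": "2024-02-10", "Amount": 1500},
    {"Date": "2024-03-01", "Amount": 3000},
]

run = pipeline(where_amount_over(1000), sort_by("Date", descending=True))
result = run(sales)
```

Each stage takes rows and returns rows, which is what lets the stages be reordered or extended without changing the others.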

Documentation

For detailed documentation, see the docs/ directory.

Development

Setup development environment

# Clone repository
git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit

# Install dependencies
uv sync --all-extras

# Install pre-commit hooks
pre-commit install

Running tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=excel_toolkit

# Run specific test file
uv run pytest tests/unit/test_filtering.py

Code quality

# Linting
uv run ruff check .

# Formatting
uv run ruff format .

# Type checking
uv run mypy excel_toolkit/

Command Reference

Core Commands

  • xl filter - Filter rows based on conditions
  • xl select - Select specific columns
  • xl sort - Sort by column(s)
  • xl group - Group and aggregate data
  • xl join - Join multiple files
  • xl clean - Clean data (trim, case, etc.)
  • xl dedupe - Remove duplicates
  • xl transform - Apply transformations
  • xl convert - Convert between formats
  • xl merge - Merge multiple files
  • xl stats - Calculate statistics
  • xl info - Display file information
  • xl validate - Validate data

Getting help

# General help
xl --help

# Command-specific help
xl filter --help

Examples

Data Analysis Workflow

# Extract sales data for Q1, filter high-value orders, group by region
xl filter sales.xlsx --where "Date >= '2024-01-01' and Date <= '2024-03-31'" | \
  xl filter --where "Amount > 1000" | \
  xl group --by "Region" --aggregate "Amount:sum,Orders:count" \
  --output q1_high_value_by_region.xlsx

Data Cleaning Pipeline

# Clean messy CSV file
xl clean messy_data.csv \
  --trim \
  --lowercase \
  --columns "email,name" \
  --output cleaned.csv

# Remove duplicates and validate
xl dedupe cleaned.csv --by "email" --output unique.csv
xl validate unique.csv --columns "email:email,age:int:0-120" --output final.csv
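
The `--columns "email:email,age:int:0-120"` argument reads as a list of `column:type` rules with an optional `min-max` range. The sketch below shows one plausible interpretation in Python. The `Rule` type, the `parse_rules` and `is_valid` helpers, the simplified email pattern, and the treatment of the range as inclusive are all assumptions for illustration; the actual grammar accepted by `xl validate` may differ.

```python
import re
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """One validation rule (hypothetical structure)."""
    column: str
    kind: str
    minimum: Optional[int] = None
    maximum: Optional[int] = None


def parse_rules(spec):
    """Parse 'email:email,age:int:0-120' into a list of Rule values."""
    rules = []
    for item in spec.split(","):
        parts = item.strip().split(":")
        column, kind = parts[0], parts[1]
        if len(parts) == 3:
            # Assumes non-negative bounds written as 'low-high'.
            low, high = parts[2].split("-")
            rules.append(Rule(column, kind, int(low), int(high)))
        else:
            rules.append(Rule(column, kind))
    return rules


# Deliberately simple pattern: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid(rule, value):
    """Check a single value against a single rule."""
    if rule.kind == "email":
        return isinstance(value, str) and EMAIL_PATTERN.match(value) is not None
    if rule.kind == "int":
        try:
            number = int(value)
        except (TypeError, ValueError):
            return False
        if rule.minimum is not None and number < rule.minimum:
            return False
        if rule.maximum is not None and number > rule.maximum:
            return False
        return True
    return False  # unknown rule kinds are rejected in this sketch


rules = parse_rules("email:email,age:int:0-120")
email_rule, age_rule = rules
```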

Architecture

Built with a functional programming approach:

  • Result types: Explicit error handling without exceptions
  • Maybe types: Safe handling of optional values
  • Immutable configuration: Predictable behavior
  • Composable operations: Chain commands efficiently

For more details, see FUNCTIONAL_ANALYSIS.md.
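
As a rough illustration of the Result idea listed above, the sketch below models success and failure as values that can be chained without raising exceptions past the function boundary. The `Ok` and `Err` classes, the `bind` method, and the example functions are hypothetical; the project's own types are described in FUNCTIONAL_ANALYSIS.md and may look different.

```python
class Ok:
    """Successful result wrapping a value."""

    def __init__(self, value):
        self.value = value

    def bind(self, func):
        # Continue the chain with the wrapped value.
        return func(self.value)

    def __repr__(self):
        return f"Ok({self.value!r})"


class Err:
    """Failed result wrapping an error message."""

    def __init__(self, error):
        self.error = error

    def bind(self, func):
        # Short-circuit: later steps are skipped once a failure occurs.
        return self

    def __repr__(self):
        return f"Err({self.error!r})"


def parse_amount(text):
    """Convert text to a float, returning Err instead of raising."""
    try:
        return Ok(float(text))
    except ValueError:
        return Err(f"not a number: {text!r}")


def require_positive(amount):
    """Reject amounts that are zero or negative."""
    if amount > 0:
        return Ok(amount)
    return Err(f"amount must be positive: {amount}")


good = parse_amount("1200").bind(require_positive)
bad_text = parse_amount("abc").bind(require_positive)
bad_value = parse_amount("-5").bind(require_positive)
```

Because `Err.bind` returns itself, the first failure in a chain is the one the caller sees, which keeps error reporting explicit at every step.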

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with:
