Excel CLI Toolkit
Command-line toolkit for Excel data manipulation and analysis.
Overview
Excel CLI Toolkit (xl) is a command-line interface for data wrangling and analysis on Excel files. Designed for both human users and AI systems, it provides fast, predictable operations without requiring scripts or programming.
Features
- Filter & Search: Filter rows by conditions, search for values
- Sort & Group: Sort by columns, group and aggregate data
- Transform: Apply transformations, clean data, deduplicate
- Multi-format: Support for XLSX, CSV, JSON, and Parquet
- Functional Programming: Built with Result/Maybe types for robust error handling
- AI-Friendly: Simple, composable commands perfect for AI automation
- Fast: Efficient processing of large files
Installation
Using pip
pip install excel-toolkit-cwd
Using uv (recommended)
uv pip install excel-toolkit-cwd
Development installation
git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit
uv pip install -e ".[dev]"
With Parquet support
pip install "excel-toolkit-cwd[parquet]"
Quick Start
Basic filtering
# Filter rows where Amount > 1000
xl filter sales.xlsx --where "Amount > 1000" --output filtered.xlsx
# Filter by multiple conditions
xl filter data.xlsx --where "Region == 'North' and Price > 100" -o result.xlsx
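To make the meaning of the two `--where` expressions concrete, here is a minimal Python sketch of row filtering using only the standard library. It is not the toolkit's implementation: the sample rows and the `filter_rows` helper are hypothetical, and how `xl` parses and evaluates conditions is not described in this README.

```python
# Hypothetical sample data standing in for the rows of a spreadsheet.
rows = [
    {"Region": "North", "Price": 120, "Amount": 1500},
    {"Region": "South", "Price": 300, "Amount": 900},
    {"Region": "North", "Price": 80, "Amount": 2000},
]


def filter_rows(rows, predicate):
    """Return the rows for which predicate(row) is true, in original order."""
    return [row for row in rows if predicate(row)]


# Equivalent in spirit to: --where "Amount > 1000"
high_value = filter_rows(rows, lambda r: r["Amount"] > 1000)

# Equivalent in spirit to: --where "Region == 'North' and Price > 100"
north_expensive = filter_rows(
    rows, lambda r: r["Region"] == "North" and r["Price"] > 100
)

print(len(high_value), len(north_expensive))
```

Both conditions in the second filter must hold for a row to be kept, so the low-priced North row is dropped even though its region matches.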
Sorting and aggregation
# Sort by column
xl sort data.xlsx --by "Date" --descending --output sorted.xlsx
# Group and aggregate
xl group sales.xlsx --by "Region" --aggregate "Amount:sum" --output grouped.xlsx
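As a rough illustration of what sorting and a single `Amount:sum` aggregation compute, the sketch below uses plain Python. The `sort_rows` and `group_sum` helpers and the sample data are assumptions made for this example, not part of the toolkit.

```python
from collections import defaultdict

# Hypothetical sample data; ISO date strings sort correctly as plain text.
sales = [
    {"Date": "2024-01-15", "Region": "North", "Amount": 1200.0},
    {"Date": "2024-03-02", "Region": "South", "Amount": 800.0},
    {"Date": "2024-02-10", "Region": "North", "Amount": 300.0},
]


def sort_rows(rows, by, descending=False):
    """Return a new list sorted by one column; the input is left unchanged."""
    return sorted(rows, key=lambda row: row[by], reverse=descending)


def group_sum(rows, by, column):
    """Sum `column` within each distinct value of `by`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[by]] += row[column]
    return dict(totals)


# Mirrors: xl sort ... --by "Date" --descending
newest_first = sort_rows(sales, by="Date", descending=True)

# Mirrors: xl group ... --by "Region" --aggregate "Amount:sum"
amount_by_region = group_sum(sales, by="Region", column="Amount")
```

Grouping collapses the three input rows into one total per region, which is why the grouped output has fewer rows than the input.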
Data cleaning
# Remove duplicates
xl dedupe data.xlsx --by "Email" --output unique.xlsx
# Clean whitespace and standardize
xl clean data.xlsx --trim --lowercase --columns "Name,Email" --output clean.xlsx
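The order of cleaning and deduplication matters, which a small Python sketch can show. The `clean` and `dedupe` helpers below are hypothetical stand-ins, and keeping the first occurrence of each duplicate is an assumption; the README does not say which occurrence `xl dedupe` keeps.

```python
def clean(rows, columns, trim=True, lowercase=True):
    """Return copies of the rows with the given text columns normalized."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's data is not mutated
        for column in columns:
            value = new_row.get(column)
            if isinstance(value, str):
                if trim:
                    value = value.strip()
                if lowercase:
                    value = value.lower()
                new_row[column] = value
        cleaned.append(new_row)
    return cleaned


def dedupe(rows, by):
    """Keep the first row seen for each distinct value of the `by` column."""
    seen = set()
    unique = []
    for row in rows:
        key = row[by]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data with the same address written two ways.
contacts = [
    {"Name": " Ada ", "Email": "Ada@Example.com"},
    {"Name": "ada", "Email": " ada@example.com"},
    {"Name": "Grace", "Email": "grace@example.com"},
]

raw_unique = dedupe(contacts, by="Email")
clean_unique = dedupe(clean(contacts, ["Name", "Email"]), by="Email")
```

Deduplicating the raw rows keeps all three, because the two spellings of the address differ in case and whitespace. Cleaning first reduces them to the same key, so only two rows remain.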
File conversion
# Convert Excel to CSV
xl convert data.xlsx --output data.csv
# Convert CSV to Excel
xl convert data.csv --output data.xlsx
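Format conversion generally means reading a file into neutral row records and writing those records back out in another format. The sketch below shows that idea with CSV and JSON, two of the formats listed under Features, because both are covered by the Python standard library; reading XLSX would need a third-party library and is left out. The helper names are assumptions for illustration only.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def records_to_csv(records, fieldnames):
    """Serialize row dicts back to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = "Name,Amount\nAda,1500\nGrace,900\n"

records = csv_to_records(csv_text)   # CSV -> row records
as_json = json.dumps(records)        # row records -> JSON text
round_trip = records_to_csv(json.loads(as_json), fieldnames=["Name", "Amount"])
```

CSV carries no type information, so every value parsed here is a string. Any converter that writes typed formats has to decide how to infer column types; how `xl convert` does this is not covered in this README.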
Pipeline operations
# Chain operations
xl filter sales.xlsx --where "Amount > 1000" | \
xl sort --by "Date" --descending | \
xl group --by "Region" --aggregate "Amount:sum" \
--output final.xlsx
Documentation
For detailed documentation, see the docs/ directory:
- FEATURES.md - Complete feature list
- PROJECT_STRUCTURE.md - Architecture overview
- CONTRIBUTING.md - Contributing guidelines
Development
Setup development environment
# Clone repository
git clone https://github.com/yourusername/excel-toolkit.git
cd excel-toolkit
# Install dependencies
uv sync --all-extras
# Install pre-commit hooks
pre-commit install
Running tests
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=excel_toolkit
# Run specific test file
uv run pytest tests/unit/test_filtering.py
Code quality
# Linting
uv run ruff check .
# Formatting
uv run ruff format .
# Type checking
uv run mypy excel_toolkit/
Command Reference
Core Commands
- xl filter - Filter rows based on conditions
- xl select - Select specific columns
- xl sort - Sort by column(s)
- xl group - Group and aggregate data
- xl join - Join multiple files
- xl clean - Clean data (trim, case, etc.)
- xl dedupe - Remove duplicates
- xl transform - Apply transformations
- xl convert - Convert between formats
- xl merge - Merge multiple files
- xl stats - Calculate statistics
- xl info - Display file information
- xl validate - Validate data
Getting help
# General help
xl --help
# Command-specific help
xl filter --help
Examples
Data Analysis Workflow
# Extract sales data for Q1, filter high-value orders, group by region
xl filter sales.xlsx --where "Date >= '2024-01-01' and Date <= '2024-03-31'" | \
xl filter --where "Amount > 1000" | \
xl group --by "Region" --aggregate "Amount:sum,Orders:count" \
--output q1_high_value_by_region.xlsx
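The workflow above combines a date-range filter, a value filter, and a multi-column aggregation. The Python sketch below walks through the same three steps on hypothetical data. The way it parses the `Amount:sum,Orders:count` string and the `Amount_sum` style of output names are assumptions made for illustration; the toolkit's actual parsing and output column names are not documented here.

```python
from datetime import date

# Aggregation functions assumed for this sketch; the names mirror the
# "sum" and "count" used in the command above.
AGGREGATORS = {"sum": sum, "count": len}


def parse_aggregate_spec(spec):
    """Turn "Amount:sum,Orders:count" into [("Amount", "sum"), ("Orders", "count")]."""
    pairs = []
    for item in spec.split(","):
        column, func = item.strip().split(":")
        if func not in AGGREGATORS:
            raise ValueError(f"unknown aggregation: {func}")
        pairs.append((column, func))
    return pairs


def group_aggregate(rows, by, spec):
    """Group rows by one column and apply each (column, function) pair."""
    pairs = parse_aggregate_spec(spec)
    groups = {}
    for row in rows:
        groups.setdefault(row[by], []).append(row)
    return {
        key: {
            f"{column}_{func}": AGGREGATORS[func]([m[column] for m in members])
            for column, func in pairs
        }
        for key, members in groups.items()
    }


# Hypothetical order data.
orders = [
    {"Date": date(2024, 1, 20), "Region": "North", "Amount": 1500, "Orders": "A-1"},
    {"Date": date(2024, 2, 11), "Region": "North", "Amount": 2500, "Orders": "A-2"},
    {"Date": date(2024, 3, 5), "Region": "South", "Amount": 400, "Orders": "B-1"},
    {"Date": date(2024, 4, 1), "Region": "South", "Amount": 9000, "Orders": "B-2"},
]

q1 = [r for r in orders if date(2024, 1, 1) <= r["Date"] <= date(2024, 3, 31)]
high_value = [r for r in q1 if r["Amount"] > 1000]
summary = group_aggregate(high_value, by="Region", spec="Amount:sum,Orders:count")
```

In this data the South region disappears from the summary: its only Q1 order is below the threshold, and its large order falls outside the quarter. Regions with no surviving rows produce no group at all rather than a zero total.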
Data Cleaning Pipeline
# Clean messy CSV file
xl clean messy_data.csv \
--trim \
--lowercase \
--columns "email,name" \
--output cleaned.csv
# Remove duplicates and validate
xl dedupe cleaned.csv --by "email" --output unique.csv
xl validate unique.csv --columns "email:email,age:int:0-120" --output final.csv
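The rule string passed to `--columns` reads as `column:type` with an optional `:min-max` range. The Python sketch below shows one plausible way to interpret such a string. This reading of the syntax, the loose email pattern, and all function names are assumptions, not the toolkit's documented behavior.

```python
import re

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def parse_rules(spec):
    """Parse "email:email,age:int:0-120" into a list of rule dicts."""
    rules = []
    for item in spec.split(","):
        parts = item.strip().split(":")
        rule = {"column": parts[0], "type": parts[1], "range": None}
        if len(parts) == 3:
            # Note: splitting on "-" means negative bounds are not supported.
            low, high = parts[2].split("-")
            rule["range"] = (int(low), int(high))
        rules.append(rule)
    return rules


def check_value(value, rule):
    """Return an error message for an invalid value, or None if it passes."""
    if rule["type"] == "email":
        if not EMAIL_PATTERN.match(str(value)):
            return f"{rule['column']}: not an email address: {value!r}"
    elif rule["type"] == "int":
        try:
            number = int(value)
        except (TypeError, ValueError):
            return f"{rule['column']}: not an integer: {value!r}"
        if rule["range"] is not None:
            low, high = rule["range"]
            if not low <= number <= high:
                return f"{rule['column']}: {number} outside {low}-{high}"
    return None


def validate(rows, spec):
    """Collect (row_index, message) pairs for every failed rule."""
    rules = parse_rules(spec)
    errors = []
    for index, row in enumerate(rows):
        for rule in rules:
            message = check_value(row.get(rule["column"]), rule)
            if message is not None:
                errors.append((index, message))
    return errors


# Hypothetical rows as they would come out of a CSV file (all strings).
people = [
    {"email": "ada@example.com", "age": "36"},
    {"email": "not-an-email", "age": "150"},
]
errors = validate(people, "email:email,age:int:0-120")
```

Collecting every failure, instead of stopping at the first, lets one pass report all problems in a file. Here the second row fails both rules and the first row passes.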
Architecture
Built with a functional programming approach:
- Result types: Explicit error handling without exceptions
- Maybe types: Safe handling of optional values
- Immutable configuration: Predictable behavior
- Composable operations: Chain commands efficiently
For more details, see FUNCTIONAL_ANALYSIS.md.
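For readers unfamiliar with Result and Maybe types, the Python sketch below shows the general pattern: failures are returned as values instead of raised, and absent values are represented explicitly. The `Ok`, `Err`, `parse_amount`, `find_column`, and `total` names are hypothetical and do not come from the toolkit's source. `Optional` is used as Python's closest built-in analogue of a Maybe type.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)  # frozen makes instances immutable
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


Result = Union[Ok[T], Err[E]]


def parse_amount(text: str) -> "Result[float, str]":
    """Parse a number, returning Err instead of raising on bad input."""
    try:
        return Ok(float(text))
    except ValueError:
        return Err(f"not a number: {text!r}")


def find_column(header: list[str], name: str) -> Optional[int]:
    """Maybe-style lookup: the index of a column, or None when absent."""
    return header.index(name) if name in header else None


def total(cells: list[str]) -> "Result[float, str]":
    """Sum parsed cells, stopping at the first error."""
    running = 0.0
    for cell in cells:
        parsed = parse_amount(cell)
        if isinstance(parsed, Err):
            return parsed  # propagate the failure as a value
        running += parsed.value
    return Ok(running)


good = total(["1", "2.5"])
bad = total(["1", "x"])
```

Because `total` returns either `Ok` or `Err`, a caller has to inspect the result before using the number, which makes the error path visible at every call site.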
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built with: