ScheduleTools

Professional spreadsheet wrangling utilities for parsing, splitting, and expanding schedule data.

Requires Python 3.8+. License: MIT.

Features

  • Flexible Parsing: Parse schedule data from various formats with configurable date/time formats and block detection
  • Smart Splitting: Split CSV data into multiple files based on grouping criteria with optional filtering
  • Column Expansion: Transform data to match specific output formats with configurable mappings
  • Dual Interface: Use as a Python library for programmatic access or as a CLI tool for file operations
  • Professional Design: Clean API, comprehensive error handling, and type hints

Installation

pip install scheduletools

For development installation:

git clone https://github.com/Khlick/scheduletools.git
cd scheduletools
pip install -e ".[dev]"

Usage

Programmatic Usage

from scheduletools import ScheduleParser, CSVSplitter, ScheduleExpander
import pandas as pd

# Parse schedule with default settings
parser = ScheduleParser("schedule.txt", reference_date="2025-07-21")
parsed_data = parser.parse()

# Parse with custom configuration
custom_config = {
    "Format": {
        "Date": "%m/%d/%Y",
        "Time": "%I:%M %p"
    },
    "Block Detection": {
        "start_marker": "Date",
        "skip_meta_rows": True
    },
    "Missing Values": {
        "Omit": True,
        "Replacement": "TBD"
    }
}

parser = ScheduleParser(
    "schedule.txt", 
    reference_date="2025-07-21",
    config=custom_config
)
parsed_data = parser.parse()

# Split by team
splitter = CSVSplitter(parsed_data, "Team")
team_schedules = splitter.split()

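# Expand with template (same keys as the ScheduleExpander configuration documented below)
expansion_template = {
    "Required": ["Date", "Time", "Team", "Location", "Notes"],
    "defaults": {"Location": "Main Arena", "Notes": ""}
}
expander = ScheduleExpander(team_schedules["16U"], expansion_template)
expanded_data = expander.expand()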

As a CLI Tool

# Parse a schedule file with default block marker
schtool parse schedule.txt -o parsed_schedule.csv

# Parse with custom block marker
schtool parse schedule.txt --block-marker "Day" -o parsed_schedule.csv

# Split by team
schtool split parsed_schedule.csv -g Team -o team_schedules/

# Expand with template
schtool expand team_schedules/Team_A.csv template.json -o final_schedule.csv

# Complete workflow
schtool process schedule.txt -o output/ -t template.json

Documentation

ScheduleParser

Parse schedule data from various formats into structured DataFrames.

from scheduletools import ScheduleParser

# Basic usage with default block marker ("Date")
parser = ScheduleParser("schedule.txt")
df = parser.parse()

# With custom block marker
parser = ScheduleParser("schedule.txt", block_start_marker="Day")
df = parser.parse()

# With custom configuration
parser = ScheduleParser(
    "schedule.txt",
    config_path="config.json",
    reference_date="2025-09-02",
    block_start_marker="Day"
)
df = parser.parse()

Configuration Format:

{
    "Format": {
        "Date": "%m/%d/%Y",
        "Time": "%I:%M %p",
        "Duration": "H:MM"
    },
    "Block Detection": {
        "start_marker": "Date",
        "skip_meta_rows": true,
        "meta_patterns": ["ice", "time", "header", "day", "week", "note", "info"]
    },
    "Missing Values": {
        "Omit": true,
        "Replacement": "missing"
    },
    "Split": {
        "Skip": false,
        "Separator": "/"
    }
}

Block Detection: The parser uses a configurable block marker to identify where schedule blocks begin. By default, it looks for "Date" in the first column of each row. You can customize this behavior:

  • start_marker: Text that indicates the start of a block column (default: "Date")
  • skip_meta_rows: Whether to skip rows containing meta-information (default: True)
  • meta_patterns: List of patterns to identify meta-information rows
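
To make these options concrete, here is a minimal sketch of marker-based block detection on rows that have already been split into cells. It is not the ScheduleTools implementation: is_meta_row and find_block_starts are hypothetical helpers, and treating a row as meta-information when its first cell is empty or contains a pattern (case-insensitive) is an assumption.

```python
from typing import List, Sequence


def is_meta_row(row: Sequence[str], meta_patterns: Sequence[str]) -> bool:
    """Assumed rule: a row is meta-information if its first cell is empty
    or contains any of the patterns (case-insensitive substring match)."""
    first = row[0].strip().lower() if row else ""
    return first == "" or any(p.lower() in first for p in meta_patterns)


def find_block_starts(rows: Sequence[Sequence[str]], start_marker: str = "Date") -> List[int]:
    """Return the indices of rows whose first cell equals the block marker."""
    return [i for i, row in enumerate(rows) if row and row[0].strip() == start_marker]


rows = [
    ["Ice Times", "", ""],
    ["Date", "Rink A", "Rink B"],
    ["07/21/2025", "16U", "18U"],
    ["Note: subject to change", "", ""],
    ["Date", "Rink A", "Rink B"],
    ["07/22/2025", "14U", "16U"],
]

starts = find_block_starts(rows)
# Meta rows are checked only among rows that do not start a block.
meta = [i for i, r in enumerate(rows) if i not in starts and is_meta_row(r, ["ice", "note"])]
print(starts, meta)
```

With skip_meta_rows enabled, rows such as "Ice Times" and "Note: subject to change" would be dropped, leaving only the marker rows and the data rows beneath them.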

CSVSplitter

Split CSV data into multiple DataFrames based on grouping criteria.

from scheduletools import CSVSplitter

# Split by single column
splitter = CSVSplitter("data.csv", "Team")
teams = splitter.split()

# Split by multiple columns with filtering
splitter = CSVSplitter(
    "data.csv", 
    ["Week", "Team"],
    include_values=["Week_1", "Week_2"],
    exclude_values=["Team_C"]
)
filtered_groups = splitter.split()
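
The grouping and filtering idea can be sketched with the standard library alone. This is an illustration, not CSVSplitter's code: split_rows is a hypothetical helper, and the filter semantics (drop a row if any grouping value is excluded; when an include list is given, keep a row only if at least one grouping value is in it) are an assumption about how include_values and exclude_values behave.

```python
import csv
import io
from collections import defaultdict


def split_rows(rows, group_by, include_values=None, exclude_values=None):
    """Group dict rows by one or more columns, with optional value filters."""
    keys = [group_by] if isinstance(group_by, str) else list(group_by)
    groups = defaultdict(list)
    for row in rows:
        values = [row[k] for k in keys]
        if exclude_values and any(v in exclude_values for v in values):
            continue
        if include_values and not any(v in include_values for v in values):
            continue
        # Single-column splits are keyed by the value, multi-column by a tuple.
        key = values[0] if len(keys) == 1 else tuple(values)
        groups[key].append(row)
    return dict(groups)


data = io.StringIO(
    "Week,Team,Time\n"
    "Week_1,Team_A,6:00 PM\n"
    "Week_1,Team_C,7:00 PM\n"
    "Week_2,Team_A,6:00 PM\n"
    "Week_3,Team_B,8:00 PM\n"
)
rows = list(csv.DictReader(data))

groups = split_rows(
    rows,
    ["Week", "Team"],
    include_values=["Week_1", "Week_2"],
    exclude_values=["Team_C"],
)
by_team = split_rows(rows, "Team")
print(sorted(groups), sorted(by_team))
```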

ScheduleExpander

Expand schedule data to include required columns with mappings and defaults.

from scheduletools import ScheduleExpander

# Expand with configuration
config = {
    "Required": ["Date", "Time", "Team", "Location", "Notes"],
    "defaults": {
        "Location": "Main Arena",
        "Notes": ""
    },
    "Mapping": {
        "Start Time": "Time",
        "Team Name": "Team"
    }
}

expander = ScheduleExpander("input.csv", config)
expanded_df = expander.expand()
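
As a rough picture of what such a configuration describes, the sketch below applies it to a single row represented as a dict. expand_row is a hypothetical helper, not part of ScheduleTools, and it assumes that "Mapping" reads as input column name to output column name and that required columns with no value and no default become empty strings.

```python
def expand_row(row, config):
    """Rename mapped columns, then emit exactly the required columns in order."""
    mapping = config.get("Mapping", {})
    defaults = config.get("defaults", {})
    renamed = {mapping.get(col, col): value for col, value in row.items()}
    return {col: renamed.get(col, defaults.get(col, "")) for col in config["Required"]}


config = {
    "Required": ["Date", "Time", "Team", "Location", "Notes"],
    "defaults": {"Location": "Main Arena", "Notes": ""},
    "Mapping": {"Start Time": "Time", "Team Name": "Team"},
}

row = {"Date": "07/21/2025", "Start Time": "6:00 PM", "Team Name": "16U"}
expanded = expand_row(row, config)
print(expanded)
```

Here "Start Time" and "Team Name" are renamed to "Time" and "Team", while "Location" and "Notes" are filled from the defaults.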

Configuration

ScheduleParser can be configured either with a config object (the config argument) or with a JSON file (the config_path argument). Configuration options include:

Format Settings

  • Date: Date format string (default: "%m/%d/%Y")
  • Time: Time format string (default: "%I:%M %p")
  • Duration: Duration format (default: "H:MM")
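
The default Date and Time values are standard strptime format codes, so they can be checked against sample data with the datetime module before running the parser. The "H:MM" duration notation is not a strptime code; parse_duration below is a hypothetical helper that assumes it means hours and minutes separated by a colon.

```python
from datetime import datetime, timedelta

# Default formats from the settings above.
date_value = datetime.strptime("07/21/2025", "%m/%d/%Y").date()
time_value = datetime.strptime("6:30 PM", "%I:%M %p").time()


def parse_duration(text: str) -> timedelta:
    """Assumed reading of "H:MM": hours, a colon, then minutes."""
    hours, minutes = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))


print(date_value, time_value, parse_duration("1:30"))
```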

Block Detection

  • start_marker: Text marker for block identification (default: "Date")
  • skip_meta_rows: Whether to skip metadata rows (default: True)
  • meta_patterns: Patterns to identify metadata rows

Missing Values

  • Omit: Whether to omit missing values (default: True)
  • Replacement: Value to use for missing entries (default: "missing")
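
One plausible reading of how these two options interact is sketched below: with Omit enabled, entries that have a missing value are dropped; with Omit disabled, the missing value is filled with Replacement. This is an assumption for illustration only, and handle_missing is a hypothetical helper, not a ScheduleTools function.

```python
def handle_missing(entries, omit=True, replacement="missing"):
    """Drop or fill entries that contain a missing (None or empty) value."""
    result = []
    for entry in entries:
        has_gap = any(v is None or v == "" for v in entry.values())
        if not has_gap:
            result.append(dict(entry))
        elif not omit:
            result.append(
                {k: (replacement if v is None or v == "" else v) for k, v in entry.items()}
            )
    return result


entries = [
    {"Team": "16U", "Time": "6:00 PM"},
    {"Team": "", "Time": "7:00 PM"},
]

omitted = handle_missing(entries)
filled = handle_missing(entries, omit=False, replacement="TBD")
print(omitted, filled)
```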

Split Settings

  • Skip: Whether to skip team splitting (default: False)
  • Separator: Character to split team names (default: "/")
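
For example, a cell such as "16U/18U" would name two teams sharing one slot. The sketch below shows that behavior under the assumption that splitting is a plain string split on Separator; split_teams is a hypothetical helper.

```python
def split_teams(cell: str, separator: str = "/", skip: bool = False):
    """Split a combined team cell into individual team names."""
    if skip:
        return [cell]
    return [name.strip() for name in cell.split(separator) if name.strip()]


shared = split_teams("16U/18U")
single = split_teams("14U")
unsplit = split_teams("16U/18U", skip=True)
print(shared, single, unsplit)
```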

Example Configuration

{
    "Format": {
        "Date": "%m/%d/%Y",
        "Time": "%I:%M %p",
        "Duration": "H:MM"
    },
    "Block Detection": {
        "start_marker": "Date",
        "skip_meta_rows": true,
        "meta_patterns": ["ice", "time", "header", "day", "week", "note", "info"]
    },
    "Missing Values": {
        "Omit": true,
        "Replacement": "TBD"
    },
    "Split": {
        "Skip": false,
        "Separator": "/"
    }
}
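
A configuration file like this can be loaded with the json module and layered over the documented defaults before it is passed to the parser. The sketch below is a hypothetical pre-processing step, not something ScheduleTools is known to require: merge_config is an invented helper, DEFAULTS lists only the options whose defaults are stated above, and whether the library fills in omitted sections on its own is not documented here.

```python
import copy
import json

# Defaults as documented in the sections above (meta_patterns has no stated default).
DEFAULTS = {
    "Format": {"Date": "%m/%d/%Y", "Time": "%I:%M %p", "Duration": "H:MM"},
    "Block Detection": {"start_marker": "Date", "skip_meta_rows": True},
    "Missing Values": {"Omit": True, "Replacement": "missing"},
    "Split": {"Skip": False, "Separator": "/"},
}


def merge_config(user_config, defaults=DEFAULTS):
    """Overlay user options on the defaults, one section at a time."""
    merged = copy.deepcopy(defaults)
    for section, options in user_config.items():
        merged.setdefault(section, {}).update(options)
    return merged


user_config = json.loads('{"Missing Values": {"Replacement": "TBD"}}')
config = merge_config(user_config)
print(json.dumps(config, indent=4))
```

The merged dictionary could then be passed as config=config when constructing ScheduleParser.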

CLI Commands

schtool parse

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

0.2.0

  • Enhanced Configuration System: Added support for passing config objects directly to ScheduleParser
  • Improved Block Detection: Fixed block boundary detection logic for more reliable parsing
  • Better Error Handling: Enhanced error messages and exception handling for configuration files
  • Meta Row Detection: Improved handling of empty strings and meta-information rows
  • Complete Workflow Support: Fixed end-to-end workflow testing and validation
  • Documentation Updates: Added comprehensive configuration documentation and examples

0.1.0

  • Initial release
  • Core parsing, splitting, and expansion functionality
  • CLI interface with comprehensive commands
  • Professional API design with type hints
  • Comprehensive error handling
  • Configurable block detection with custom markers
