Skip to main content

A Python package for reading data from the Inputs, Assumptions and Scenarios Report (IASR) Microsoft Excel workbook published by the Australian Energy Market Operator for use in their Integrated System Plan modelling.

Project description

AEMO Integrated System Plan Assumptions Workbook Parser

PyPI version Continuous Integration and Deployment codecov pre-commit.ci status UV

A Python package for reading data from the Inputs, Assumptions and Scenarios Report (IASR) Microsoft Excel workbook published by the Australian Energy Market Operator for use in their Integrated System Plan modelling.

Table of contents

Installation

pip install isp-workbook-parser

How the package works

  1. Load a workbook using Parser (see examples below).
    • While we do not include workbooks with the package distribution, you can find the versions for which table configurations are written within workbooks/<version>.
  2. Table configuration files for data tables are located in src/config/<version>
    • These specify the name, location, columns and data range of tables to be extracted from a particular workbook version. Optionally, rows to skip and not read in (e.g. where AEMO has formatted a row with a strike through to indicate that the data is no longer being used) and columns with merged rows can also be specified and handled.
    • These are included with the package distributions.
  3. Parser loads the MS Excel workbook and, by default, will check if the version of the workbook is supported by seeing if configuration files are included in the package for that version.
  4. If they are, Parser can use these configuration files to parse the data tables and save them as CSVs.

[!NOTE] This package makes some opinionated decisions when processing tables. For example, multiple header row tables are reduced to a single header, data in merged cells is inferred from surrounding cells, and notes and footnotes are dropped (amonst other ways in which the data is sanitised). For more detail, refer to the docstring and code in read_table.py and sanitisers.py.

Table configurations

Table configuration file attributes
  • name: the table name
  • sheet_name: the sheet where the table is located
    • N.B. there may be spaces at the end of sheet names in the workbook
  • header_rows: this specifies the Excel row(s) with table column names
    • A single row of table column names (e.g. 6)
    • Or a list of row numbers for the table header sorted in ascending order (e.g. [6, 7, 8])
  • end_row: the last row of table data
  • column_range: the Excel column range of the table in alphabetical/Excel format, e.g. "B:F"
  • skip_rows: optional, Excel row(s) in the table that should not be read in
    • A single row (e.g. 15)
    • Or a list of rows (e.g. [15, 16])
  • columns_with_merged_rows: optional, Excel column(s) with merged rows
    • A single column in alphabetical format (e.g. "B"),
    • Or a list of columns in alphabetical format (e.g. ["B", "D"])
  • forward_fill_values: optional, specifies whether table values should be forward filled
    • Default True to handle merged cells in tables
    • Should be set to False where there are empty columns

Adding table configuration files to this package

Refer to the contributing instructions for details on how to contribute table configuration (YAML) files to this repository and package.

Examples

Bulk export

Export all the data tables the package has a config file for to CSV files.

from isp_workbook_parser import Parser

workbook = Parser("<path/to/workbook>/2024-isp-inputs-and-assumptions-workbook.xlsx")

workbook.save_tables('<path/to/output directory>')

List tables with configuration files

Return a dictionary of table names, with lists of tables names stored under a key which is their sheet name in the workbook. For a given workbook version, this only returns tables the package has a configuration file for.

from isp_workbook_parser import Parser

workbook = Parser("<path/to/workbook>/2024-isp-inputs-and-assumptions-workbook.xlsx")

names = workbook.get_table_names()

names['Build limits']

Get table as DataFrame

Get a single table as a pandas DataFrame.

from isp_workbook_parser import Parser

workbook = Parser("<path/to/workbook>/2024-isp-inputs-and-assumptions-workbook.xlsx")

table = workbook.get_table("retirement_costs")

Get table with custom configuration

Get a table by directly providing the table config.

from isp_workbook_parser import Parser, TableConfig

workbook = Parser("<path/to/workbook>/2024-isp-inputs-and-assumptions-workbook.xlsx")

table_config = TableConfig(
  name="table_name",
  sheet_name="sheet_name",
  header_rows=5,
  end_row=21,
  column_range="B:J",
)

workbook.get_table_from_config(table_config)

Contributing

Interested in contributing to the source code or adding table configurations? Check out the contributing instructions, which also includes steps to install isp-workbook-parser for development.

Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

isp-workbook-parser was created as a part of the OpenISP project. It is licensed under the terms of GNU GPL-3.0-or-later licences.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

isp_workbook_parser-2.5.0.tar.gz (53.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

isp_workbook_parser-2.5.0-py3-none-any.whl (86.4 kB view details)

Uploaded Python 3

File details

Details for the file isp_workbook_parser-2.5.0.tar.gz.

File metadata

  • Download URL: isp_workbook_parser-2.5.0.tar.gz
  • Upload date:
  • Size: 53.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for isp_workbook_parser-2.5.0.tar.gz
Algorithm Hash digest
SHA256 958e5fa1115a1c5dee1a8d2f14a825f5a86133d3c0bd46cc4a465007ac954d8b
MD5 0a086399ece9ad43bc21787d7e18cd3f
BLAKE2b-256 994a33590f8f545b468020f28a2a2bb8071d5979e9ffa37dc2ed57b1502f50c6

See more details on using hashes here.

Provenance

The following attestation bundles were made for isp_workbook_parser-2.5.0.tar.gz:

Publisher: cicd.yml on Open-ISP/isp-workbook-parser

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file isp_workbook_parser-2.5.0-py3-none-any.whl.

File metadata

File hashes

Hashes for isp_workbook_parser-2.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d03b788ad305f93fb04f74924985e9325ba099498b99f24339c429dfd662df7c
MD5 2b2b95da2214339d9556934db709330a
BLAKE2b-256 02ebbb3b6c8831f9e2dca80bc4b4e5163fa7ae1500c547d8794391064ba0a0be

See more details on using hashes here.

Provenance

The following attestation bundles were made for isp_workbook_parser-2.5.0-py3-none-any.whl:

Publisher: cicd.yml on Open-ISP/isp-workbook-parser

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page