
A data analysis package based on modelling and manipulation of mathematical step functions. Strongly aligned with pandas.


The staircase package enables data analysis through mathematical step functions. Step functions can be used to represent continuous time series - think changes in state over time, queue size over time, utilisation over time, success rates over time etc.

The package is built upon numpy and pandas, with a deliberate, stylistic alignment to the latter in order to integrate seamlessly into the pandas ecosystem.

The staircase package makes converting raw, temporal data into time series easy and readable. It also offers a rich variety of arithmetic, relational, logical and statistical operations for analysis, along with functions for univariate analysis, aggregations and compatibility with datetimes.

New in 2022: staircase now provides support for pandas extension arrays and a Series accessor.

An example

In this example, we consider data corresponding to site views for a website in October 2021. The start and end times have been logged for each session, in addition to one of three country codes (AU, UK, US). These times are recorded as pandas.Timestamp values, and any time which falls outside of October is logged as NaT.

>>> data
                       start                   end   country
0                        NaT   2021-10-01 00:00:50        AU
1                        NaT   2021-10-01 00:07:45        AU
2                        NaT   2021-10-01 00:05:58        AU
3                        NaT   2021-10-01 00:08:48        AU
4                        NaT   2021-10-01 00:05:26        AU
...                      ...                   ...       ...
425728   2021-10-31 23:57:16                   NaT        US
425729   2021-10-31 23:57:25                   NaT        US
425730   2021-10-31 23:58:59                   NaT        US
425731   2021-10-31 23:59:45                   NaT        US
425732   2021-10-31 23:59:59                   NaT        US

Note that the number of users viewing the site over time can be modelled as a step function. The value of the function increases by 1 every time a user arrives at the site, and decreases by 1 every time a user leaves the site. This step function can be thought of as the sum of three step functions: AU users + UK users + US users. Creating a step function for AU users, for example, is simple. To achieve it we use the Stairs class, which represents a step function:

>>> import staircase as sc

>>> views_AU = sc.Stairs(data.query("country == 'AU'"), "start", "end")
>>> views_AU
<staircase.Stairs, id=1609972469384>

We can visualise the function with the plot method:

>>> views_AU.plot()

[Figure: AU views example]
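
The "+1 on arrival, -1 on departure" idea described above can be sketched in plain Python. This is only an illustration of the concept, not how staircase is implemented; the function name `concurrent_sessions`, the use of `None` in place of NaT, and the sample sessions are all assumptions made for the example.

```python
from datetime import datetime


def concurrent_sessions(sessions):
    """Return (initial_value, steps) for a list of (start, end) pairs.

    None stands in for NaT (an assumption for this sketch): a missing start
    means the session was already open before logging began, a missing end
    means it was still open when logging stopped.
    steps is a sorted list of (time, value from that time onwards).
    """
    initial = 0
    deltas = {}
    for start, end in sessions:
        if start is None:
            initial += 1  # already on the site before the window
        else:
            deltas[start] = deltas.get(start, 0) + 1  # arrival
        if end is not None:
            deltas[end] = deltas.get(end, 0) - 1  # departure

    steps = []
    value = initial
    for t in sorted(deltas):
        value += deltas[t]
        steps.append((t, value))
    return initial, steps


# Hypothetical sessions, in the same shape as the start/end columns above.
sessions = [
    (None, datetime(2021, 10, 1, 0, 0, 50)),
    (datetime(2021, 10, 1, 0, 0, 10), datetime(2021, 10, 1, 0, 0, 30)),
    (datetime(2021, 10, 1, 0, 0, 20), None),
]
initial, steps = concurrent_sessions(sessions)
for t, value in steps:
    print(t, value)
```

The function starts at the number of sessions already open, and every later value is the running total of arrivals minus departures.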

Rather than creating a separate variable for each country, we can create a pandas.Series to hold a step function for each country. We can even give this Series a "Stairs" type.

>>> import pandas as pd
>>> october = (pd.Timestamp("2021-10"), pd.Timestamp("2021-11"))
>>> series_stepfunctions = (
...     data.groupby("country")
...     .apply(sc.Stairs, "start", "end")
...     .apply(sc.Stairs.clip, october)  # set step functions to be undefined outside of October
...     .astype("Stairs")
... )
>>> series_stepfunctions
country
AU    <staircase.Stairs, id=2516367680328>
UK    <staircase.Stairs, id=2516362550664>
US    <staircase.Stairs, id=2516363585928>
dtype: Stairs
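
The earlier observation that total views is the sum AU users + UK users + US users can also be sketched in plain Python. The representation used here, a pair of an initial value and a mapping from time to change in value, is an assumption chosen for the illustration and is not staircase's internal data structure; the helper names and the numeric times are likewise hypothetical.

```python
from collections import Counter


def add_step_functions(*functions):
    """Sum step functions given as (initial_value, {time: change}) pairs."""
    total_initial = 0
    total_changes = Counter()
    for initial, changes in functions:
        total_initial += initial
        total_changes.update(changes)  # adds the changes at matching times
    # Drop times where the combined change cancels out to zero.
    merged = {t: c for t, c in sorted(total_changes.items()) if c != 0}
    return total_initial, merged


def value_at(function, t):
    """Evaluate a step function at t, counting a change at t itself."""
    initial, changes = function
    return initial + sum(c for time, c in changes.items() if time <= t)


# Hypothetical per-country step functions on a numeric time axis.
au = (1, {2: 1, 5: -1})
uk = (0, {1: 1, 5: -1})
us = (0, {5: 1, 7: -1})

total = add_step_functions(au, uk, us)
print(total)
print(value_at(total, 3))
```

Adding step functions amounts to adding their initial values and merging their changes, so the sum is itself a step function.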

The plotting backend for staircase is provided by matplotlib.

>>> import matplotlib.pyplot as plt
>>> _, ax = plt.subplots(figsize=(15,4))
>>> series_stepfunctions.sc.plot(ax, alpha=0.7)
>>> ax.legend()

[Figure: all views example]

Now plotting step functions is useful, but the real fun starts when we go beyond this:

[Figure: staircase analysis examples]
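
As a hint of what going beyond plotting can involve, the sketch below computes the integral and the time-weighted mean of a step function over a window, in plain Python. It illustrates the idea only and is not staircase's implementation; the function name `integral_and_mean`, the list-of-steps layout, and the sample values (times in seconds) are assumptions.

```python
def integral_and_mean(initial, steps, lower, upper):
    """Integrate a step function over [lower, upper); return (integral, mean).

    steps is a sorted list of (time, value from that time onwards) and
    initial is the value before the first step.
    """
    area = 0.0
    current_time, current_value = lower, initial
    for t, v in steps:
        if t <= lower:
            current_value = v  # step took effect before the window opens
            continue
        if t >= upper:
            break  # remaining steps fall outside the window
        area += current_value * (t - current_time)
        current_time, current_value = t, v
    area += current_value * (upper - current_time)  # last piece of the window
    return area, area / (upper - lower)


# Hypothetical step function: value 1 initially, then changes at 10, 20, 30, 50.
steps = [(10, 2), (20, 3), (30, 2), (50, 1)]
area, mean = integral_and_mean(1, steps, 0, 60)
print(area, mean)
```

Because the function is constant between steps, the integral is a sum of rectangle areas, and the mean is that integral divided by the window length.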

Installation

staircase can be installed from PyPI:

python -m pip install staircase

or with conda:

conda install -c conda-forge staircase

Documentation

The complete guide to using staircase can be found at staircase.dev

Contributing

There are many ways in which contributions can be made - the first and foremost being using staircase and giving feedback.

Bug reports, feature requests and ideas can be submitted via the GitHub issue tracker.

Additionally, bug fixes, enhancements, and improvements to the code and documentation are appreciated and can be submitted via pull requests. Take a look at the current issues and, if there is one you would like to work on, please leave a comment to that effect.

See this beginner's guide to contributing, or pandas' guide to contributing, to learn more about the process.

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository. It is highly recommended to use staircase 2.*, for both performance and additional features.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

The seeds of staircase were sown at the Hunter Valley Coal Chain Coordinator, where it finds strong application in analysing simulated data. Thanks for the support!
