incline: Estimate Trend at a Particular Point in a Noisy Time Series

Trends in time series are valuable. If the cost of a product rises suddenly, it likely indicates a sudden shortfall in supply or a sudden rise in demand. If the cost of claims filed by a patient rises sharply, it may suggest rapidly worsening health. But how do we estimate the trend at a particular time in a noisy time series? One approach is to smooth the time series with one of the many available methods, such as local polynomials or GAMs, and then estimate the derivative(s) of the smoothed function at the chosen point in time.

The package provides a couple of ways of approximating the underlying function for the time series:

  • fitting a local higher order polynomial via Savitzky-Golay over a window of choice

  • fitting a smoothing spline

The package estimates the first and second derivatives at any given time using either of those methods. Beyond these smarter methods, the package also provides a naive estimator of slope: the average of the change when you move one step forward and the change when you move one step backward (step = observed time units). Users can also calculate the average or maximum slope over a time window (over observed time steps).
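To make the naive estimator concrete, here is a minimal sketch using NumPy. It is not the package's implementation: the function name naive_slope is hypothetical, and returning NaN at the two endpoints (where there is no neighbor on one side) is an assumption made for illustration.

```python
import numpy as np


def naive_slope(values):
    """Average of the one-step-forward and one-step-backward changes.

    Assumes unit spacing between observations. The first and last points
    lack a neighbor on one side, so they are returned as NaN (an assumption).
    """
    y = np.asarray(values, dtype=float)
    slope = np.full(y.shape, np.nan)
    forward = y[2:] - y[1:-1]    # change when moving one step forward
    backward = y[1:-1] - y[:-2]  # change when moving one step backward
    slope[1:-1] = (forward + backward) / 2.0
    return slope


series = [1.0, 2.0, 4.0, 7.0, 11.0]
slopes = naive_slope(series)
```

Because the estimate at each point depends on only its two neighbors, noise in either neighbor passes straight into the slope, which is one reason the naive and smoothed estimates can disagree.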

The difference between naive estimates and estimates based on smoothed time series can be substantial. In the example we provide, the correlation between the two is -0.47.

Clarification

Sometimes we want to know what the "trend" was over a particular time window. But what that means is not 100% clear. For a synopsis of the issues, see here.

Underlying Machinery

Savitzky-Golay

Filter the time series using local polynomials and get an estimate of the derivative in one shot. For more information, see the Python documentation and Wikipedia.
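As a rough illustration of getting the derivative in one shot, the sketch below calls scipy.signal.savgol_filter directly. Whether incline uses this exact call internally is an assumption; the window length of 15 and polynomial order of 3 mirror the defaults listed for sgolay_trend, and the synthetic data is made up for the example.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic noisy series on unit time steps (illustrative data only).
t = np.arange(50, dtype=float)
rng = np.random.default_rng(0)
y = 0.5 * t**2 + rng.normal(scale=2.0, size=t.size)

# deriv=0 returns the smoothed values; deriv=1 returns the first derivative
# of the locally fitted polynomial. delta is the spacing between samples.
smoothed = savgol_filter(y, window_length=15, polyorder=3)
first_derivative = savgol_filter(y, window_length=15, polyorder=3, deriv=1, delta=1.0)
```

The underlying function here is 0.5 * t**2, so the true first derivative is t; comparing first_derivative against t gives a feel for how much noise survives the smoothing.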

Univariate Splines

Find more details here
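A smoothing spline and its derivatives can be sketched with scipy.interpolate.UnivariateSpline. This is an illustration under assumptions rather than the package's code: the data is synthetic and the smoothing factor s=0.5 is an arbitrary value chosen for the example.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic noisy series on unit time steps (illustrative data only).
t = np.arange(30, dtype=float)
rng = np.random.default_rng(0)
y = np.sin(t / 5.0) + rng.normal(scale=0.05, size=t.size)

# k=3 fits cubic splines; s bounds the sum of squared residuals, so larger
# values of s give a smoother fit with fewer knots.
spline = UnivariateSpline(t, y, k=3, s=0.5)

smoothed = spline(t)
first_derivative = spline.derivative(n=1)(t)
second_derivative = spline.derivative(n=2)(t)
```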

Assumption

Silly as it is, for now, we assume that the time series is a) complete, and b) increases with unit time intervals.
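One way to check this assumption before estimating trends is sketched below with pandas. The helper is_complete_and_evenly_spaced is hypothetical and not part of incline, and it reads "unit time intervals" as "all consecutive index steps are equal", which is an interpretation rather than something the package defines.

```python
import pandas as pd


def is_complete_and_evenly_spaced(series: pd.Series) -> bool:
    """Return True if there are no missing values and all index steps are equal."""
    if series.isna().any():
        return False
    # Differences between consecutive index entries; the first is NaN/NaT.
    steps = series.index.to_series().diff().dropna()
    return steps.nunique() <= 1


daily = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.date_range("2012-01-01", periods=4, freq="D"),
)
gappy = daily.drop(daily.index[1])  # remove one day to create a gap

daily_ok = is_complete_and_evenly_spaced(daily)
gappy_ok = is_complete_and_evenly_spaced(gappy)
```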

API

The package wraps the functions for local smoothing and derivative estimation behind a standardized interface. We use this standard interface to estimate the trend at a particular set of points in parallel for thousands of time series.

The package incline exposes 4 functions:

  1. naive_trend:

    Input:

    Functionality:

    • estimates the derivative at a location by averaging the change when you move one unit to the right and the change when you move one unit to the left.

    Output:

    dataframe with 6 columns (the smoothed_value column contains only None): datetime, function_order (value of the polynomial order), smoothed_value, derivative_method, derivative_order, derivative_value.

  2. spline_trend:

    Input:

    • df: pandas dataFrame time series object
    • function_order: spline order (default is 3, i.e., cubic splines). Knot placement is determined by the smoothing factor s.
    • derivative_order: (0, 1, 2, ... with default as 1)
    • s: smoothing factor, the total unnormalized global cost that we are willing to bear. Larger values give smoother estimates. See the documentation for details.

    Functionality:

    Fits the time series with splines of order function_order, then calculates the derivative of order derivative_order from the smoothed function.

    Output:

    dataframe with 6 columns: datetime, function_order (value of the polynomial order), smoothed_value, derivative_method, derivative_order, derivative_value.

    Sample row: 2012-01-01, "spline", 2, 1, 0

  3. sgolay_trend:

    Input:

    • df: pandas dataFrame time series object
    • window_size: default is 15
    • function_order: polynomial order (default is 3)
    • derivative_order: (0, 1, 2, ... with default as 1)

    Functionality:

    Smooths the time series with Savitzky-Golay using polynomials of order function_order, then calculates the derivative of order derivative_order from the smoothed function.

    Output:

    dataframe with 6 columns: datetime, function_order (value of the polynomial order), smoothed_value, derivative_method, derivative_order, derivative_value.

    Sample row: 2012-01-01, "savitzky-golay", 2, 1, 0

  4. trending:

    Input:

    • df_list: list of outputs (dataframes) from sgolay_trend or spline_trend, each with a new column called 'id' that identifies the time series
    • derivative_order: (1 or 2)
    • k: number of latest time periods to consider.
    • max_or_avg: "max" or "avg"

    Functionality:

    For each item in the list, calculates either the max or the average (depending on max_or_avg) of the derivative of order derivative_order over the last k time periods. It then orders the results from max to min.

    For instance, for derivative_order = 1, max_or_avg = "max", k = 3, for each item in the list, the function will take the max of the last 3 rows of the dataframe that hold the 1st derivative.

    So each item in the list produces one number (max or avg.). We then produce a new dataframe with 2 columns: id, max_or_avg

    Output:

    Dataframe with 2 columns: id, max_or_avg
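The ranking step described for trending can be sketched in pandas as follows. This is an illustration of the described behavior, not the package's code: the function name rank_by_recent_derivative is hypothetical, the input frames are made-up data, and only the column names (datetime, derivative_order, derivative_value, id) come from the description above.

```python
import pandas as pd


def rank_by_recent_derivative(df_list, derivative_order=1, k=3, max_or_avg="max"):
    """Summarize the last k derivative values per series and rank the series."""
    rows = []
    for df in df_list:
        # Keep only rows for the requested derivative order, oldest first.
        subset = df[df["derivative_order"] == derivative_order]
        recent = subset.sort_values("datetime").tail(k)["derivative_value"]
        stat = recent.max() if max_or_avg == "max" else recent.mean()
        rows.append({"id": df["id"].iloc[0], "max_or_avg": stat})
    out = pd.DataFrame(rows, columns=["id", "max_or_avg"])
    # Order from the largest summary value to the smallest.
    return out.sort_values("max_or_avg", ascending=False).reset_index(drop=True)


dates = pd.date_range("2012-01-01", periods=4, freq="D")
series_a = pd.DataFrame({
    "datetime": dates,
    "derivative_order": 1,
    "derivative_value": [0.1, 0.5, 0.2, 0.3],
    "id": "a",
})
series_b = pd.DataFrame({
    "datetime": dates,
    "derivative_order": 1,
    "derivative_value": [0.9, 0.1, 0.1, 0.4],
    "id": "b",
})

ranked = rank_by_recent_derivative([series_a, series_b], derivative_order=1, k=3)
```

With k = 3, the early spike of 0.9 in series "b" falls outside the window, so "a" (max 0.5) ranks above "b" (max 0.4).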

Installation

pip install incline

Usage

from incline import spline_trend

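# time_series: pandas dataFrame holding the time series (see API above)
locpol = spline_trend(time_series, function_order=3, derivative_order=1)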

Examples

Please look at this notebook for an example of using incline with stock market data.

License

The package is released under the MIT License.

Authors

Suriyan Laohaprapanon and Gaurav Sood

Additional Reading

We don't provide this in the package, but you could also approximate the function using:

  1. Penalized cubic splines using GAMs via pyGAM. For more information, see these lecture notes

  2. Or, nonparametrically

And here's a paper on Derivative Estimation with Local Polynomial Fitting
