Lowess Grouped

Apply groupwise lowess smoothing to a dataframe.

Smooth data for each category using the lowess (aka loess) algorithm. You can use this code for all forms of data that should be smoothed independently by group:

Figure 1: Smoothed temperature data for each region

Usage

Install the package (Python 3.8 or higher):

pip install lowess-grouped

Import the package and call the function lowess_grouped with your dataframe df. Use the parameter frac to control the strength of the smoothing:

from lowess_grouped.lowess_grouped import lowess_grouped

df_smoothed = lowess_grouped(df, 
                             x_name="year", 
                             y_name="temperature_anomaly",
                             group_name="region_name", 
                             frac=0.05)
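
To build some intuition for frac: in lowess it is the fraction of the data points that contribute to each local fit, so larger values give a smoother curve. The following is a simplified, standard-library-only sketch of that idea (tricube-weighted local linear fits, without the robustness iterations of full lowess). It is not the package's implementation, and the names local_linear_smooth and roughness are hypothetical:

```python
def local_linear_smooth(xs, ys, frac):
    """Simplified lowess-style smoother (illustration only, not the package's code)."""
    n = len(xs)
    # Number of neighbours that take part in each local fit.
    k = min(n, max(2, int(round(frac * n))))
    smoothed = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the window width.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        # Tricube weights: 1 at x0, falling to 0 at the window edge.
        ws = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        # Weighted least-squares line through the neighbourhood.
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed


def roughness(values):
    """Sum of absolute jumps between consecutive values."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


xs = list(range(10))
zigzag = [i % 2 for i in xs]  # 0, 1, 0, 1, ...

light = local_linear_smooth(xs, zigzag, frac=0.2)
heavy = local_linear_smooth(xs, zigzag, frac=0.8)
print(roughness(light), roughness(heavy))
```

With frac=0.2 only the nearest points influence each estimate and the zigzag survives, while frac=0.8 flattens it toward its mean.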

For a detailed example, refer to the notebook temperature-example.ipynb.

Testcases

Tests are defined in the folder tests. To run them manually, follow these steps:

  1. Download the source code from GitHub.

  2. Install the package locally by executing the following command in the project folder:

    pip install -e .
    

    You might need to upgrade your version of pip for this to work:

    pip install --upgrade pip
    
  3. Run the tests:

    python ./tests/test_lowess_grouped.py -v
    

Motivation

Smoothing data can greatly improve the interpretability of visualizations. One commonly used method is lowess, also known as loess. It is closely related to, and sometimes confused with, the Savitzky–Golay filter.

However, the built-in lowess function in Statsmodels (a popular statistics package) treats all of the data it is given as a single series. This can lead to undesirable results when you need independent smoothing for multiple groups (e.g., temperature data by region).

This package was developed to address this limitation and provide some convenience, like getting a dataframe with column names back, instead of unnamed numpy arrays. Internally it still uses Statsmodels.
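
As a rough illustration of the idea, groupwise smoothing can be done by hand with pandas and the Statsmodels lowess function. This is a minimal sketch, not the package's actual code: the helper smooth_by_group, the "_smooth" column suffix, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from statsmodels.nonparametric.smoothers_lowess import lowess


def smooth_by_group(df, x_name, y_name, group_name, frac):
    """Hypothetical helper: run statsmodels' lowess separately for each group."""
    parts = []
    for _, group in df.groupby(group_name, sort=False):
        group = group.sort_values(x_name).copy()
        # return_sorted=False returns only the fitted y values,
        # in the same order as the input rows.
        group[y_name + "_smooth"] = lowess(
            group[y_name].to_numpy(),
            group[x_name].to_numpy(),
            frac=frac,
            return_sorted=False,
        )
        parts.append(group)
    return pd.concat(parts, ignore_index=True)


# Synthetic data: two regions at very different levels.
years = np.arange(1950, 2000)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "year": np.tile(years, 2),
    "region_name": np.repeat(["north", "south"], len(years)),
    "temperature_anomaly": np.concatenate([
        5.0 + rng.normal(0, 0.3, len(years)),
        -5.0 + rng.normal(0, 0.3, len(years)),
    ]),
})

df_smoothed = smooth_by_group(
    df, "year", "temperature_anomaly", "region_name", frac=0.3
)
print(df_smoothed.groupby("region_name")["temperature_anomaly_smooth"].mean())
```

Because each region is fitted on its own rows, the smoothed curve for one region is not pulled toward the values of the other, which is what would happen if both were passed to lowess as one series.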

Attribution

This project builds upon the lowess function from statsmodels. The temperature data used in the example notebook and testcases is from Berkeley Earth and is licensed under Creative Commons BY-NC 4.0 International.
