Skip to main content

Run-length encoding for data analysis in Python

Project description

python-rle

Run-length encoding (wikipedia link) for data analysis in Python. Install with pip install python-rle. No dependencies required other than tqdm for visualizing a progress bar.

Usage

Encode any iterable (tuples, lists, pd.Series etc.) with rle.encode.

# rle.encode(iterable) returns (values, counts)
>>> import rle 
>>> rle.encode((10, 10, 10, 20, 20, 20, 30, 30, 30))
([10, 20, 30], [3, 3, 3])

Decode (values, counts) back into a sequence with rle.decode.

>>> rle.decode([10, 20, 30], [3, 3, 3])
[10, 10, 10, 20, 20, 20, 30, 30, 30]

Set progress_bar == True for long sequences :

progress_bar_anim

Motivation

Base R contains a simple rle function that "computes the lengths and values of runs of equal values in a vector", as described by its docstring. I found it useful for calculating streaks in collected data, and is especially wonderful for compiling and summarizing categorical data that describes status over time. Hence this little utility.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

python-rle-0.0.3.tar.gz (4.2 kB view hashes)

Uploaded Source

Built Distribution

python_rle-0.0.3-py3-none-any.whl (6.1 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page