Skip to main content

Scalable time series processing

Project description

Time Series as a Service

#! pip install autotimeseries
import os
import time

import pandas as pd
from autotimeseries.core import AutoTS
from dotenv import load_dotenv

load_dotenv()
True
autotimeseries = AutoTS(bucket_name=os.environ['BUCKET_NAME'],
                        api_id=os.environ['API_ID'], 
                        api_key=os.environ['API_KEY'],
                        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'], 
                        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'])

M5 example

filename_target = autotimeseries.upload_to_s3('../data/target.parquet')
filename_static = autotimeseries.upload_to_s3('../data/static.parquet')
filename_temporal = autotimeseries.upload_to_s3('../data/temporal.parquet')
filename_calendar_holidays = autotimeseries.upload_to_s3('../data/calendar-holidays.txt')
print(filename_target, filename_static, filename_temporal, filename_calendar_holidays, sep='\n')
target.parquet
static.parquet
temporal.parquet
calendar-holidays.txt

Each time series of the uploaded datasets is defined by the column item_id. Meanwhile the time column is defined by timestamp and the target column by demand. We need to pass this arguments to each call.

columns = dict(unique_id_column='item_id',
               ds_column='timestamp',
               y_column='demand')

Computing calendar features

response_calendar = autotimeseries.calendartsfeatures(filename=filename_temporal,
                                                      country='USA',
                                                      events=filename_calendar_holidays,
                                                      **columns)
response_calendar
{'id_job': '7781f21f-2d89-4709-b0f5-15ad26a74c44',
 'dest_url': 's3://nixtla-user-test/calendar-features.csv',
 'status': 200,
 'message': 'Check job status at GET /jobs/{job_id}'}

To check the status of your job use the method get_status:

autotimeseries.get_status(response_calendar['id_job'])
{'status': 'InProgress', 'processing_time_seconds': 1}

If you want to use the result of your job, you have to wait until its status is Completed.

autotimeseries.check_progress_logs(response_calendar['id_job'])
File written...
Merging finished...
Writing file...





'Completed'

Make forecasts

We will use the calendar features created by the previous process, so we pass calendar-features.parquet as filename_temporal argument.

response_forecast = autotimeseries.tsforecast(filename_target=filename_target,
                                              freq='D',
                                              horizon=28, 
                                              filename_static=filename_static,
                                              filename_temporal='calendar-features.parquet',
                                              objective='tweedie',
                                              metric='rmse',
                                              n_estimators=170,
                                              **columns)
response_forecast
{'id_job': '08b5e4d2-293f-426d-98c4-0ef642a21a1f',
 'dest_url': 's3://nixtla-user-test/forecasts.csv',
 'status': 200,
 'message': 'Check job status at GET /jobs/{job_id}'}
autotimeseries.get_status(response_forecast['id_job'])
{'status': 'InProgress', 'processing_time_seconds': 0}
autotimeseries.check_progress_logs(response_forecast['id_job'])
Starting preprocessing
Done
Processing static features
Reading file...
File read.
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.
Done
[160]#011train's rmse: 2.43104#011valid's rmse: 2.40545
[140]#011train's rmse: 2.4413#011valid's rmse: 2.41412
[120]#011train's rmse: 2.45143#011valid's rmse: 2.4238
[100]#011train's rmse: 2.4601#011valid's rmse: 2.4318
[80]#011train's rmse: 2.47825#011valid's rmse: 2.44776
[60]#011train's rmse: 2.50007#011valid's rmse: 2.46812
[40]#011train's rmse: 2.56329#011valid's rmse: 2.53034
[20]#011train's rmse: 3.07992#011valid's rmse: 3.05776
[LightGBM] [Warning] bagging_fraction is set=1.0, subsample=1.0 will be ignored. Current value: bagging_fraction=1.0
[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20
[LightGBM] [Warning] bagging_freq is set=0, subsample_freq=0 will be ignored. Current value: bagging_freq=0
Starting training
Done
Starting preprocessing
Done
Done
Processing temporal features
Processing static features
Reading file...
File read.





'Completed'

Download forecasts from S3

autotimeseries.download_from_s3(filename='forecasts.csv', filename_output='../data/forecasts.csv')
[==================================================]

Evaluate performance

response_benchmarks = autotimeseries.tsbenchmarks(filename='forecasts.csv',
                                                  dataset='M5')
response_benchmarks
{'id_job': 'fe3f76da-8bbe-45c8-b91b-c9a424449785',
 'dest_url': 's3://nixtla-user-test/benchmarks.csv',
 'status': 200,
 'message': 'Check job status at GET /jobs/{job_id}'}
autotimeseries.get_status(response_benchmarks['id_job'])
{'status': 'InProgress', 'processing_time_seconds': 0}
autotimeseries.check_progress_logs(response_benchmarks['id_job'])
Successfully decompressed data/m5/datasets/m5.zip
Reading file...
File readed.
Computing metrics...
Successfully downloaded m5.zip, 50219189, bytes.
Decompressing zip file...





'Completed'

Download evaluation

DataFrame results

autotimeseries.download_from_s3(filename='benchmarks.csv', filename_output='../data/benchmarks.csv')
[==================================================]
pd.read_csv('../data/benchmarks.csv')
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
model WRMSSE
0 YJ_STU_1st 0.520
1 Matthias_2nd 0.528
2 mf_3rd 0.536
3 YourModel 0.574
4 Rp_50th 0.576
5 AmazonF 0.789
6 sNaive 0.847
7 MLP 0.977
8 RF 1.010
9 Naive 1.752

Plot results

autotimeseries.download_from_s3(filename='WRMSSE.pdf', filename_output='../data/WRMSSE.pdf')
[==================================================]
class PDF(object):
    def __init__(self, pdf, size=(200,200)):
        self.pdf = pdf
        self.size = size

    def _repr_html_(self):
        return '<iframe src={0} width={1[0]} height={1[1]}></iframe>'.format(self.pdf, self.size)

    def _repr_latex_(self):
        return r'\includegraphics[width=1.0\textwidth]{{{0}}}'.format(self.pdf)
PDF('../data/WRMSSE.pdf', size = (600, 400))
<iframe src=../data/WRMSSE.pdf width=600 height=400></iframe>

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

autotimeseries-0.0.4.tar.gz (17.4 kB view hashes)

Uploaded Source

Built Distribution

autotimeseries-0.0.4-py3-none-any.whl (12.4 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page