Scalable time series features computation

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language
- Python :: 3.7
- Python :: 3.8

Project description

FastTSFeatures

Time-series feature extraction as a service. FastTSFeatures is an SDK for computing static, temporal and calendar features of time series.

The package serves as a wrapper for tsfresh and tsfeatures. Since we take care of the whole infrastructure, feature extraction becomes as easy as running a line in your Python notebooks or calling an API.

Why?

We built FastTSFeatures because we wanted an easy and fast way to extract time series features without having to think about infrastructure and deployment. Now we want to see if other data scientists find it useful too.

Available Features (More than 600)

Static Features

40+ static features: https://github.com/Nixtla/tsfeatures

Temporal Features

600+ temporal features: https://github.com/blue-yonder/tsfresh/
10 temporal features (lags, mean lags, std_lags), currently supported only for daily data

Calendar Features

Distance in minutes to holidays
Calendar features for 83 countries: https://github.com/dr-prodigy/python-holidays

API

For API documentation, visit [PENDING]

Install

pip install fasttsfeatures

How to use

You can use FastTSFeatures either with a completely public S3 bucket or by uploading a file to your own S3 bucket, which we provide.

Data Format

Currently we only support .csv files. These files must include at least 3 columns: a unique_id, a datestamp and a value.
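
As an illustration of this layout, the sketch below builds a small CSV with the three required columns using only the standard library and checks its header. The column names `unique_id`, `ds` and `y`, the sample rows and the helper `has_required_columns` are assumptions made for this example; they are not part of the SDK.

```python
import csv
import io

# Hypothetical column names; use whatever names your data has and
# pass them to the SDK when you run a job.
REQUIRED_COLUMNS = ("unique_id", "ds", "y")


def has_required_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return True if the CSV header contains every required column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return all(column in header for column in required)


# Made-up sample data: two series with two daily observations each.
rows = [
    ("store_1", "2021-09-01", 10.0),
    ("store_1", "2021-09-02", 12.5),
    ("store_2", "2021-09-01", 7.0),
    ("store_2", "2021-09-02", 6.5),
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(REQUIRED_COLUMNS)
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Writing `csv_text` to a file such as `train.csv` would give a file in the shape described above.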

1. Request free trial

Request a free trial by sending an email to fede.garza.ramirez@gmail.com to get your API_KEY, API_ID and private URI.

2. Run fasttsfeatures on a private S3 Bucket

If you don't want other people to potentially have access to your data, you can run fasttsfeatures on a private S3 bucket. To do so, upload your data to a private S3 bucket that we provide for you; you can do this from Python.

2.1 Upload to S3 from python

You will need the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY that we provided.

Import and instantiate TSFeatures, passing your API_ID, API_KEY, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

import os

from fasttsfeatures.core import TSFeatures

# Credentials are read from environment variables.
tsfeatures = TSFeatures(api_id=os.environ['API_ID'],
                        api_key=os.environ['API_KEY'],
                        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
                        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'])
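
Because the constructor call reads four environment variables, a missing one surfaces as a bare `KeyError`. The sketch below is one possible pre-flight check; the helper `missing_credentials` is hypothetical and not part of fasttsfeatures.

```python
import os

# Variable names taken from the constructor call above.
REQUIRED_VARS = (
    "API_ID",
    "API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_credentials(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_credentials()` before instantiating the client returns an empty list when everything is set, and otherwise the names you still need to export.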

Upload your local file by passing its path and the bucket name (provided by Nixtla in the file we sent by email).

s3_uri = tsfeatures.upload_to_s3('../train.csv', 'PROVIDED BUCKET NAME')

Run the features extraction process

You can run temporal, static or calendar features on the data you uploaded. To run the process, specify:

s3_uri: S3 URI we provided in the file we sent by email.
freq: integer, where {'Hourly': 24, 'Daily': 7, 'Monthly': 12, 'Quarterly': 4, 'Weekly': 1, 'Yearly': 1}
unique_id_column: name of the unique id column
ds_column: name of the datestamp column
y_column: name of the target column

For calendar features, you have to specify the country using its ISO code instead of freq.
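
To avoid hard-coding the integer, the frequency table listed for `freq` can be kept in a small lookup. This is a sketch only: the helper `freq_for` and the lower-cased keys are assumptions made for this example, not part of the SDK.

```python
# Integer `freq` value per data frequency, copied from the list above.
FREQ_BY_NAME = {
    "hourly": 24,
    "daily": 7,
    "monthly": 12,
    "quarterly": 4,
    "weekly": 1,
    "yearly": 1,
}


def freq_for(name):
    """Return the integer freq for a frequency name, ignoring case and spaces."""
    key = name.strip().lower()
    if key not in FREQ_BY_NAME:
        raise ValueError(f"Unsupported frequency: {name!r}")
    return FREQ_BY_NAME[key]
```

For daily data, `freq_for("Daily")` gives the value to pass as `freq`.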

# Run temporal features
response_tmp_ft = tsfeatures.calculate_temporal_features_from_s3_uri(
    s3_uri="PRIVATE S3 URI HERE",
    freq=7,
    unique_id_column="NAME OF ID COLUMN",
    ds_column="NAME OF DATESTAMP COLUMN",
    y_column="NAME OF TARGET COLUMN")

# Run static features
response_static_ft = tsfeatures.calculate_static_features_from_s3_uri(
    s3_uri="PRIVATE S3 URI HERE",
    freq=7,
    unique_id_column="NAME OF ID COLUMN",
    ds_column="NAME OF DATESTAMP COLUMN",
    y_column="NAME OF TARGET COLUMN")

# Run calendar features
response_cal_ft = tsfeatures.calculate_calendar_features_from_s3_uri(
    s3_uri="PRIVATE S3 URI HERE",
    country="ISO",
    unique_id_column="NAME OF ID COLUMN",
    ds_column="NAME OF DATESTAMP COLUMN",
    y_column="NAME OF TARGET COLUMN")

To see the response for a submitted job, display the returned object:

response_tmp_ft

	status	body	id_job	message
0	200	"s3://nixtla-user-test/features/features.csv"	f7bdb6dc-dcdb-4d87-87e8-b5428e4c98db	Check job status at GET /tsfeatures/jobs/{job_id}

Monitor the process with the following code. Once it's done, access your bucket to download the generated features.

job_id = response_tmp_ft['id_job'].item()

display(tsfeatures.get_status(job_id))

	status	processing_time_seconds
0	InProgress	3
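
Rather than re-running the status check by hand, the job can be polled in a loop. The sketch below assumes that any status other than "InProgress" means the job has finished; the other status strings are not documented here, so treat that as an assumption. `wait_for_job` and its `read_status` callable are hypothetical helpers, not part of the SDK.

```python
import time


def wait_for_job(read_status, job_id, poll_seconds=5.0, timeout_seconds=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll read_status(job_id) until it stops returning "InProgress".

    read_status is any callable that returns the job status as a string.
    Raises TimeoutError if the job is still in progress at the deadline.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = read_status(job_id)
        # Assumption: every status other than "InProgress" is terminal.
        if status != "InProgress":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job {job_id} still in progress after {timeout_seconds} s")
        sleep(poll_seconds)
```

With the client from the earlier steps, `read_status` could be `lambda jid: tsfeatures.get_status(jid)['status'].item()`, assuming `get_status` returns a one-row table like the one shown above.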

Once the process is done, you will find a file for each process you ran in the URI we provided.
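
To fetch a generated file, most S3 tools want the bucket and the object key separately. The sketch below splits an S3 URI with the standard library; `split_s3_uri` is a hypothetical helper, and the example URI is the one from the response shown earlier.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/file' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path starts with a slash that is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri("s3://nixtla-user-test/features/features.csv")
```

The resulting pair could then be handed to an S3 client, for example boto3's `download_file(bucket, key, local_path)`, using the AWS credentials provided to you.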

ToDos

Optimizing writing and reading speed with Parquet files
Making temporal features available for different granularities
Filling zeros (for data where 0 values are not reported, e.g. retail data)
Empirical benchmarking of model improvement
NaN handling
Checking data integrity before upload
Informative error messages
Informative status
Optional parameter y in calendar

Release history

Version	Released
0.0.10	Sep 22, 2021
0.0.9	Sep 17, 2021
0.0.8	Sep 17, 2021
0.0.7 (this version)	Sep 17, 2021
0.0.6	Sep 16, 2021
0.0.5	Sep 16, 2021
0.0.4	Sep 15, 2021
0.0.3	Sep 15, 2021
0.0.2	Sep 15, 2021
0.0.1	Sep 13, 2021

Download files

Source Distribution

fasttsfeatures-0.0.7.tar.gz (13.7 kB)

Uploaded Sep 17, 2021 Source

Built Distribution

fasttsfeatures-0.0.7-py3-none-any.whl (10.9 kB)

Uploaded Sep 17, 2021 Python 3

Hashes for fasttsfeatures-0.0.7.tar.gz
Algorithm	Hash digest
SHA256	`688c69c3b85e9417aa9b8ce429522ba5ad8cbf8c8854ee5f7ba84ee21cb794ba`
MD5	`f154ffb543d97c2a20b1086a7deac51f`
BLAKE2b-256	`893471f6588622ac4fc99a2034b89fc40df5e7d6e7ab2260e4c1bcdcfe4181ec`

Hashes for fasttsfeatures-0.0.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cad0b32f72e403d42bebee206a783c524be6668ec9c81190e6ebf5396bc523cf`
MD5	`5419fe7a822850386ec15a361253b330`
BLAKE2b-256	`eda64b006a493e4709f110059cfd1c5be9ab325706020509bc176073f7e1668d`