Skip to main content

Deep learning, Machine learning and Statistical learning for humans.

Project description

Perlib

This repository hosts the development of the Perlib library.

About Perlib

Perlib is a framework written in Python where you can use deep and machine learning algorithms.

Perlib is

Feature to use many deep or machine learning models easily Feature to easily generate estimates in a single line with default parameters Understanding data with simple analyzes with a single line Feature to automatically preprocess data in a single line Feature to easily create artificial neural networks Feature to manually pre-process data, extract analysis or create models with detailed parameters, produce tests and predictions

Usage

The core data structures are layers and models. For quick results with default parameters To set up more detailed operations and structures, you should use the Perflib functional API, which allows you to create arbitrary layers or write models completely from scratch via subclassing.

Install

pip install perlib
from perlib.forecaster import *

This is how you can use sample datasets.

from perlib import datasets # or  from perlib.datasets import *
import pandas as pd
dataset = datasets.load_airpassengers()
data = pd.DataFrame(dataset)
data.index = pd.date_range(start="2022-01-01",periods=len(data),freq="d")

To read your own dataset;

import perlib
pr = perlib.dataPrepration()
data = pr.read_data("./datasets/winequality-white.csv",delimiter=";")
model_selector = perlib.ModelSelection(data_l)
model_selector.select_model()
Dimension Reduction Analysis: Size reduction is not recommended.

Selected Models:
1. ('BILSTM', 'Data symmetric, BILSTM is available.')
2. ('CONVLSTM', 'There are spatial and temporal patterns, CONVLSTM can be used.')
3. ('SARIMA', 'No time series data found, SARIMA is not recommended.')
4. ('PROPHET', 'No time series data found, PROPHET is not recommended.')
5. ('LSTM', 'Insufficient dataset size, LSTM is not recommended.')
6. ('LSTNET', 'Insufficient dataset size, LSTNet is not recommended.')
7. ('TCN', 'No temporal dependencies, TCN is not recommended.')
8. ('XGBoost', 'No irregular and heterogeneous data, XGBoost is not recommended.')

The easiest way to get quick results is with the 'get_result' function. You can choice modelname ; "RNN", "LSTM", "BILSTM", "CONVLSTM", "TCN", "LSTNET", "ARIMA" ,"SARIMA" or all machine learning algorithms

forecast,evaluate = get_result(dataFrame=data,
                    y="Values",
                    modelName="Lstnet",
                    dateColumn=False,
                    process=False,
                    forecastNumber=24,
                    metric=["mape","mae","mse"],
                    epoch=2,
                    forecastingStartDate=2022-03-06
                    )
Parameters created
The model training process has been started.
Epoch 1/2
500/500 [==============================] - 14s 23ms/step - loss: 0.2693 - val_loss: 0.0397
Epoch 2/2
500/500 [==============================] - 12s 24ms/step - loss: 0.0500 - val_loss: 0.0092
Model training process completed

The model is being saved
1/1 [==============================] - 0s 240ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 10ms/step
1/1 [==============================] - 0s 16ms/step
              Values   Predicts
Date                            
2022-03-07         71  79.437263
2022-03-14         84  84.282906
2022-03-21         90  88.096298
2022-03-28         87  82.875603
MAPE: 3.576822717339706
forecast

            Predicts   Actual
Date                            
2022-03-07         71  79.437263
2022-03-08         84  84.282906
2022-03-09         90  88.096298
2022-03-10         87  82.875603
evaluate

{'mean_absolute_percentage_error': 3.576822717339706,
 'mean_absolute_error': 14.02137889193878,
 'mean_squared_error': 3485.26570064559}

he Time Series module helps to create many basic models without using much code and helps to understand which models work better without any parameter adjustments.

from perlib.piplines.dpipline import Timeseries
pipline = Timeseries(dataFrame=data,
                       y="Values",
                       dateColumn=False,
                       process=False,
                       epoch=1,
                       forecastingStartDate="2022-03-06",
                       forecastNumber= 24,
                       models="all",
                       metrics=["mape","mae","mse"]
                       )
predictions = pipline.fit()

            mean_absolute_percentage_error | mean_absolute_error  | mean_squared_error
LSTNET                              14.05  |                67.70 |  5990.35
LSTM                                7.03   |                38.28 |  2250.69
BILSTM                              13.21  |                68.22 |  6661.60
CONVLSTM                            9.62   |                48.06 |  2773.69
TCN                                 12.03  |                65.44 |  6423.10
RNN                                 11.53  |                59.33 |  4793.62
ARIMA                               50.18  |                261.14|  74654.48
SARIMA                              10.48  |                51.25 |  3238.20

With the 'summarize' function you can see quick and simple analysis results.

summarize(dataFrame=data)

With the 'auto' function under 'preprocess', you can prepare the data using general preprocessing.

preprocess.auto(dataFrame=data)

12-2022 15:04:36.22    - DEBUG - Conversion to DATETIME succeeded for feature "Date"
27-12-2022 15:04:36.23 - INFO - Completed conversion of DATETIME features in 0.0097 seconds
27-12-2022 15:04:36.23 - INFO - Started encoding categorical features... Method: "AUTO"
27-12-2022 15:04:36.23 - DEBUG - Skipped encoding for DATETIME feature "Date"
27-12-2022 15:04:36.23 - INFO - Completed encoding of categorical features in 0.001252 seconds
27-12-2022 15:04:36.23 - INFO - Started feature type conversion...
27-12-2022 15:04:36.23 - DEBUG - Conversion to type INT succeeded for feature "Salecount"
27-12-2022 15:04:36.24 - DEBUG - Conversion to type INT succeeded for feature "Day"
27-12-2022 15:04:36.24 - DEBUG - Conversion to type INT succeeded for feature "Month"
27-12-2022 15:04:36.24 - DEBUG - Conversion to type INT succeeded for feature "Year"
27-12-2022 15:04:36.24 - INFO - Completed feature type conversion for 4 feature(s) in 0.00796 seconds
27-12-2022 15:04:36.24 - INFO - Started validation of input parameters...
27-12-2022 15:04:36.24 - INFO - Completed validation of input parameters
27-12-2022 15:04:36.24 - INFO - AutoProcess process completed in 0.034259 seconds

If you want to build it yourself;

from perlib.core.models.dmodels import models
from perlib.core.train import dTrain
from perlib.core.tester import dTester

You can use many features by calling the 'dataPrepration' function for data preparation operations.

data = dataPrepration.read_data(path="./dataset/Veriler/ayakkabı_haftalık.xlsx")
data = dataPrepration.trainingFordate_range(dataFrame=data,dt1="2013-01-01",dt2="2022-01-01")

You can use the 'preprocess' function for data preprocessing.

data = preprocess.missing_num(dataFrame=data)
data = preprocess.find_outliers(dataFrame=data)
data = preprocess.encode_cat(dataFrame=data)
data = preprocess.dublicates(dataFrame=data,mode="auto")

You should create an architecture like below.

layers = {
            "unit":[150,100], 
            "activation":["tanh","tanh"],
            "dropout"  :[0.2,0.2]
         }

You can set each parameter below it by calling the 'req_info' object.

from perlib.forecaster import req_info,dmodels
from perlib.core.train import dTrain
from perlib.core.tester import dTester

#layers = {
#        "CNNFilters":100,
#        "CNNKernel":6,
#        "GRUUnits":50,
#        "skip" : 25,
#        "highway" : 1
#          }

req_info.layers = None
req_info.modelname = "lstm"
req_info.epoch  =  30
#req_info.learning_rate = 0.001
req_info.loss  = "mse"
req_info.lookback = 30
req_info.optimizer = "adam"
req_info.targetCol = "Values"
req_info.forecastingStartDate = "2022-01-06 15:00:00"
req_info.period = "daily"
req_info.forecastNumber = 30
req_info.scaler = "standard"
s = dmodels(req_info)

It will be prepared after importing it into models.

s = models(req_info)

After sending the dataframe and the prepared architecture to the dTrain, you can start the training process by calling the .fit() function.

train = dTrain(dataFrame=data,object=s)
train.fit()

After the training is completed, you can see the results by giving the dataFrame,object,path,metric parameters to 'dTester'.

t = dTester(dataFrame=data,object=s,path="Data-Lstm-2022-12-14-19-56-28.h5",metric=["mape","mae"])
t.forecast()
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 19ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 21ms/step

t.evaluate()

MAPE: 3.35
from perlib.core.models.smodels import models as armodels
from perlib.core.train import  sTrain
from perlib.core.tester import sTester
aR_info.modelname = "sarima"
aR_info.forcastingStartDate = "2022-6-10"
ar = armodels(aR_info)
#train = sTrain(dataFrame=data,object=ar)
res = train.fit()
r = sTester(dataFrame=data,object=ar,path="Data-sarima-2022-12-30-23-49-03.pkl")
r.forecast()
r.evaluate()
from perlib.core.models.mmodels import models
from perlib.core.train import mTrain
m_info.testsize = .01
m_info.y        = "quality"
m_info.modelname= "SVR"
m_info.auto  = False
m = models(m_info)
train = mTrain(dataFrame=data,object=m)
preds, evaluate = train.predict()
# If you want to make any other data predictions you can use the train.tester
# func after train.predict. You can make predictions with
predicts = train.tester(path="Data-SVR-2023-01-08-09-50-37.pkl", testData=data.iloc[:,1:][-20:])

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

perlib-3.0.4.tar.gz (136.1 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page