AutoML, Forecasting, NLP, Image Classification, Feature Engineering, Model Evaluation, Model Interpretation, Fast Processing.

Project description

Version: 0.0.1. Build: passing. License: MPL 2.0. PRs welcome.

Installation

pip install git+https://github.com/AdrianAntico/RetroFit.git#egg=retrofit

Feature Engineering

Feature Engineering - Some of the feature engineering functions in this package are not available anywhere else. I believe feature engineering is your best bet for improving model performance, so the functions cover all feature types: numeric, categorical, text, and date data. They are all designed to generate features for both training and scoring pipelines, and they are built to run fast with low memory utilization. The package uses datatable or polars (the user chooses) for all feature engineering and data wrangling functions, which means you'll only need to reach for big data tools if absolutely necessary.

Machine Learning

Machine Learning Training - Coming soon

Machine Learning Scoring - Coming soon

Machine Learning Evaluation - Coming soon

Machine Learning Interpretation - Coming soon

retrofit and RemixAutoML Blogs

Python retrofit and R RemixAutoML Blogs

Sales Funnel Forecasting with ML using RemixAutoML

The Most Feature Rich ML Forecasting Methods Available

AutoML Frameworks in R & Python

AI for Small to Medium Size Businesses: A Management Take On The Challenges...

Why Machine Learning is more Practical than Econometrics in the Real World

Build Thousands of Automated Demand Forecasts in 15 Minutes Using AutoCatBoostCARMA in R

Automate Your KPI Forecasts With Only 1 Line of R Code Using AutoTS

Feature Engineering

Feature Engineering: Date Feature Engineering

AutoCalendarVariables()

Code Example

# Example: generate calendar variables from a date column
import datatable as dt
import retrofit
from retrofit import TimeSeriesFeatures as ts

# Example data can be created with the R package RemixAutoML (function FakeDataGenerator);
# replace the path below with the location of your own file
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoCalendarVariables(
  data=data,
  ArgsList=None,
  DateColumnNames='CalendarDateColumn',
  CalendarVariables=['wday','mday','wom','month','quarter','year'],
  Processing='datatable',
  InputFrame='datatable',
  OutputFrame='datatable')

# Check the resulting column names
print(data.names)

Function Description

AutoCalendarVariables() Automatically generate calendar variables from your date columns using datatable.
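
To make the calendar variable names concrete, here is a minimal sketch in plain Python of what each one could mean for a single date. It is not retrofit's implementation: the helper name `calendar_variables`, the Monday = 1 weekday numbering, and the week-of-month definition (7-day blocks counted from the 1st) are all assumptions, and AutoCalendarVariables() may use different conventions.

```python
from datetime import date

def calendar_variables(d: date) -> dict:
    """Illustrative only: derive common calendar features from one date."""
    return {
        "wday": d.isoweekday(),             # assumed convention: 1 = Monday ... 7 = Sunday
        "mday": d.day,                      # day of month
        "wom": (d.day - 1) // 7 + 1,        # assumed: 7-day blocks counted from the 1st
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,  # months 1-3 map to quarter 1, and so on
        "year": d.year,
    }

features = calendar_variables(date(2021, 3, 17))
print(features)
```

Each key matches one entry of the CalendarVariables list in the example above, so the output of the real function can be compared against this sketch on a few known dates.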

Feature Engineering: Numeric Feature Engineering

Coming Soon

Feature Engineering: Categorical Feature Engineering

Coming Soon

Feature Engineering: Cross-Row Operations

AutoLags()

Code Example

# Example: generate lags with AutoLags
import datatable as dt
import retrofit
from retrofit import TimeSeriesFeatures as ts

# Example data can be created with the R package RemixAutoML (function FakeDataGenerator)

## No Group Example, Multiple Periods:
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoLags(data=data, LagPeriods=[1,3,5,7], LagColumnNames='Leads', DateColumnName='CalendarDateColumn', ByVariables=None, ImputeValue=-1, Sort=True)
print(data.names)

## Group Example, Multiple Periods and LagColumnNames:
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoLags(data=data, LagPeriods=[1,3,5], LagColumnNames=['Leads','XREGS1'], DateColumnName='CalendarDateColumn', ByVariables=['MarketingSegments', 'MarketingSegments2', 'MarketingSegments3', 'Label'], ImputeValue=-1, Sort=True)
print(data.names)

## No Group Example, Single Period:
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoLags(data=data, LagPeriods=1, LagColumnNames='Leads', DateColumnName='CalendarDateColumn', ByVariables=None, ImputeValue=-1, Sort=True)
print(data.names)

Function Description

AutoLags() Automatically generate any number of lags, for any number of columns, by any number of By-Variables, using datatable.
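
The idea of lagging by By-Variables can be sketched in plain Python as follows. This is a conceptual illustration, not how retrofit computes lags: the helper `add_lags`, the `<column>_lag<k>` output naming, and the toy rows are hypothetical. Rows are sorted by date, split into groups, and each lag looks back k rows within the same group, falling back to the impute value when there is not enough history.

```python
from collections import defaultdict

def add_lags(rows, value_col, date_col, by_cols=None, periods=(1,), impute=-1):
    """Illustrative only: return copies of the rows with lag columns added."""
    by_cols = by_cols or []
    groups = defaultdict(list)
    # Sort by date first so each group's rows end up in time order
    for row in sorted(rows, key=lambda r: r[date_col]):
        groups[tuple(row[c] for c in by_cols)].append(row)
    out = []
    for members in groups.values():
        for i, row in enumerate(members):
            new_row = dict(row)
            for k in periods:
                # Look back k rows within the group, or impute if history is too short
                new_row[f"{value_col}_lag{k}"] = members[i - k][value_col] if i >= k else impute
            out.append(new_row)
    return out

# Toy data (hypothetical); ISO date strings sort chronologically
rows = [
    {"Segment": "A", "Date": "2021-01-01", "Leads": 10},
    {"Segment": "A", "Date": "2021-01-02", "Leads": 12},
    {"Segment": "B", "Date": "2021-01-01", "Leads": 7},
    {"Segment": "A", "Date": "2021-01-03", "Leads": 15},
    {"Segment": "B", "Date": "2021-01-02", "Leads": 9},
]
lagged = add_lags(rows, "Leads", "Date", by_cols=["Segment"], periods=[1, 2])
for r in lagged:
    print(r)
```

Passing `by_cols=None` treats the whole table as a single group, which corresponds to the ByVariables=None examples above.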

AutoRollStats()

Code Example

# Example: generate rolling statistics with AutoRollStats
import datatable as dt
import retrofit
from retrofit import TimeSeriesFeatures as ts

## No Group Example:
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoRollStats(data=data, RollColumnNames='Leads', DateColumnName='CalendarDateColumn', ByVariables=None, MovingAvg_Periods=[3,5,7], MovingSD_Periods=[3,5,7], MovingMin_Periods=[3,5,7], MovingMax_Periods=[3,5,7], ImputeValue=-1, Sort=True)
print(data.names)

## Group Example, Multiple Periods and RollColumnNames:
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoRollStats(data=data, RollColumnNames=['Leads','XREGS1'], DateColumnName='CalendarDateColumn', ByVariables=['MarketingSegments', 'MarketingSegments2', 'MarketingSegments3', 'Label'], MovingAvg_Periods=[3,5,7], MovingSD_Periods=[3,5,7], MovingMin_Periods=[3,5,7], MovingMax_Periods=[3,5,7], ImputeValue=-1, Sort=True)
print(data.names)

Function Description

AutoRollStats() Automatically generate any number of moving averages, moving standard deviations, moving minimums, and moving maximums from any number of source columns, by any number of By-Variables, using datatable.
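
As a rough illustration of what a moving statistic is, the sketch below computes trailing-window values over a small, already sorted list using only the standard library. It rests on several assumptions that may not match AutoRollStats(): the window includes the current row, incomplete windows receive the impute value, and the standard deviation is the sample version. The helper name `rolling_stats` is hypothetical.

```python
from statistics import mean, stdev

def rolling_stats(values, window, impute=-1):
    """Illustrative only: trailing-window mean, sd, min and max for a sorted series."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough history for a full window yet
            out.append({"mean": impute, "sd": impute, "min": impute, "max": impute})
            continue
        w = values[i + 1 - window : i + 1]  # the last `window` values, current row included
        out.append({
            "mean": mean(w),
            "sd": stdev(w) if window > 1 else impute,  # sample sd needs at least 2 points
            "min": min(w),
            "max": max(w),
        })
    return out

stats = rolling_stats([1, 2, 3, 4, 5], window=3)
for row in stats:
    print(row)
```

To mimic By-Variables, the same function could be applied separately to each group's values after sorting by date.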

AutoDiff()

Code Example

# Example: generate differences with AutoDiff
import datatable as dt
import retrofit
from retrofit import TimeSeriesFeatures as ts

## Group Example:
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoDiff(data=data, DateColumnName='CalendarDateColumn', ByVariables=['MarketingSegments', 'MarketingSegments2', 'MarketingSegments3', 'Label'], DiffNumericVariables='Leads', DiffDateVariables='CalendarDateColumn', DiffGroupVariables=None, NLag1=0, NLag2=1, Sort=True, InputFrame='datatable', OutputFrame='datatable')
print(data.names)

## No Group Example:
data = dt.fread("C:/Users/Bizon/Documents/GitHub/BenchmarkData.csv")
data = ts.AutoDiff(data=data, DateColumnName='CalendarDateColumn', ByVariables=None, DiffNumericVariables='Leads', DiffDateVariables='CalendarDateColumn', DiffGroupVariables=None, NLag1=0, NLag2=1, Sort=True, InputFrame='datatable', OutputFrame='datatable')
print(data.names)

Function Description

AutoDiff() Automatically generate any number of differences from any number of source columns, for numeric, character, and date columns, by any number of By-Variables, using datatable.
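
One plausible reading of the NLag1 and NLag2 arguments is that each output value is the series lagged by NLag1 minus the series lagged by NLag2, so NLag1=0 and NLag2=1 would give the ordinary first difference. That interpretation is an assumption, not something stated above, and the helper `lag_difference` below is a hypothetical plain-Python sketch rather than retrofit's code. It works for numbers and for dates, where the difference is a time gap.

```python
from datetime import date

def lag_difference(values, n_lag1=0, n_lag2=1, impute=None):
    """Illustrative only: values[i - n_lag1] - values[i - n_lag2] for a sorted series."""
    out = []
    for i in range(len(values)):
        if i - n_lag1 < 0 or i - n_lag2 < 0:
            out.append(impute)  # not enough history for one of the two lags
        else:
            out.append(values[i - n_lag1] - values[i - n_lag2])
    return out

# Numeric column: first differences
leads = [10, 12, 15, 11]
lead_diffs = lag_difference(leads)
print(lead_diffs)

# Date column: gaps between consecutive dates, expressed in days
dates = [date(2021, 1, 1), date(2021, 1, 4), date(2021, 1, 5)]
gaps = [d.days if d is not None else None for d in lag_difference(dates)]
print(gaps)
```

Under this reading, setting `n_lag1=1` and `n_lag2=2` would produce the difference between the previous two rows instead of between the current and previous row.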

Feature Engineering: Data Set Feature Engineering

Coming Soon

Feature Engineering: Model-Based Feature Engineering

Coming Soon

Machine Learning Training

Coming Soon

Machine Learning Scoring

Coming Soon

Machine Learning Evaluation

Coming Soon

Machine Learning Interpretation

Coming Soon
