timemachines

Time series models represented as pure functions with the skater convention.
Timemachines standardizes, and tests the efficacy of, combinations of time series approaches and hyper-optimization of the same. The project exposes, in a simple way, optimizers from scipy, ax-platform, hyperopt, optuna, platypus, pymoo and pySOT, each with various strategy and parameter variations. It also presents time series models from pydlm, flux, pmdarima and others in a simple format. Combinations of model and hyper-optimization strategy are tested out of sample on live data, and assigned Elo ratings.
Time series models as pure functions that suggest state machines
Here, a time series model:
- takes the form of a pure function with a skater signature,
- that is a recipe for a state machine,
- where the intent is that the caller, not the callee, carries the state from one invocation to the next, and
- with the further, somewhat unusual convention that variables known in advance (a) and the full set of model hyper-parameters (r) are both squished down into their respective scalar arguments.
The penultimate convention is for generality, and also eyes lambda-based deployments. The last convention imposes a consistent hyper-parameter space at design time. This step may seem unnatural, but it facilitates comparison of models and hyper-parameter optimizers in different settings. It is workable, we hope, with some space-filling curve conventions.
This isn't put forward as the right way to write time series packages, more a way of exposing their functionality for comparisons. If you are interested in design thoughts for time series, maybe participate in this thread.
The skater signature
Most time series packages use a complex combination of methods and data to represent a time series model, its fitting, and forecasting usage. But in this package a "model" is merely a function in the mathematical sense.
```python
x, s, w = f(   y:Union[float,[float]],   # Contemporaneously observed data,
                                         # ... including exogenous variables in y[1:], if any.
               s=None,                   # Prior state
               k:float=1,                # Number of steps ahead to forecast. Typically integer.
               a:float=None,             # Variable(s) known in advance, or conditioning
               t:float=None,             # Time of observation (epoch seconds)
               e:float=None,             # Non-binding maximal computation time ("e for expiry"), in seconds
               r:float=None )            # Hyper-parameters ("r" stands for hype(r)-pa(r)amete(r)s in R^n)
```
The function returns:

```python
x,   # A point estimate, or anchor point, or theo
s,   # Posterior state, intended for safe keeping by the caller until the next invocation
w    # Everything else (e.g. confidence intervals) not needed for the next invocation
```
(Yes, one might quibble with the purity given that the state s can be modified in place, but that's sensible in Python.)
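To make the signature concrete, here is a toy skater (not part of the package, and the name running_mean_skater is illustrative) whose point estimate is the running mean of the target:

```python
from typing import Any, Optional, Tuple, Union

def running_mean_skater(y: Union[float, list], s: Optional[dict] = None, k: float = 1,
                        a: float = None, t: float = None, e: float = None,
                        r: float = None) -> Tuple[Optional[float], Any, Any]:
    """Toy skater: the point estimate is the running mean of the target y[0]."""
    if s is None:
        s = {'count': 0, 'total': 0.0}        # First invocation: callee initializes state
    if y is None:
        return None, s, None                  # Offline "fitting" convention: acknowledge with x=None
    y0 = y[0] if isinstance(y, (list, tuple)) else y   # Target is y[0] when y is a vector
    s['count'] += 1
    s['total'] += y0
    x = s['total'] / s['count']               # Same point estimate for any horizon k
    return x, s, None

# The caller, not the callee, carries the state forward:
x, s = None, None
for y in [1.0, 2.0, 3.0]:
    x, s, _ = running_mean_skater(y, s)
```

After the loop, x is 2.0, the mean of the three observations.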
Skating forward
```python
def posteriors(f, ys):
    """Run a skater f over a sequence of observations, collecting the point estimates."""
    s = None
    xs = list()
    for y in ys:
        x, s, _ = f(y, s)
        xs.append(x)
    return xs
```
Conventions:

State
- The caller, not the callee, persists state from one invocation to the next.
- The caller passes s=None the first time, and the callee initializes state.
- State can be mutable for efficiency (e.g. it might be a long buffer), or not.
- State should, ideally, be JSON-friendly. Maybe use .tolist() on arrays.
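For instance, JSON-friendly state survives the kind of round trip a serverless deployment would impose (the dict layout below is purely illustrative):

```python
import json

# Hypothetical skater state: plain ints, floats and lists only
# (use .tolist() on numpy arrays before storing them here)
s = {'count': 3, 'total': 6.0, 'buffer': [1.0, 2.0, 3.0]}

blob = json.dumps(s)            # Persist between invocations (e.g. in a key-value store)
s_restored = json.loads(blob)   # Hand the state back to the skater on the next call
assert s_restored == s          # Nothing was lost in the round trip
```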

Observations:
- If y is a vector, the target is the first element y[0].
- The elements y[1:] are contemporaneous exogenous variables, not known in advance.
- Missing data is passed as np.nan, never None.

Fitting:
- If y=None is passed, it is a suggestion to the callee to perform fitting, should that be necessary. In this case the e argument takes on a slightly different interpretation, and should be larger than usual.
- The callee should return x=None, as acknowledgement that it has recognized the "offline" convention.
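A caller-side sketch of this offline convention, using a hypothetical stand-in skater (both function names are illustrative):

```python
def dummy_skater(y, s=None, k=1, a=None, t=None, e=None, r=None):
    """Hypothetical stand-in for a real skater, honoring the offline-fitting convention."""
    if s is None:
        s = {'fitted': False}
    if y is None:
        s['fitted'] = True      # Pretend to (re)fit; a real skater would respect the budget e
        return None, s, None
    return 0.0, s, None

def fit_if_needed(f, s, e_fit=60.0):
    """Caller-side helper: y=None suggests fitting, and e becomes a (larger) fitting budget."""
    x, s, _ = f(None, s=s, e=e_fit)
    assert x is None            # The callee acknowledges the offline convention with x=None
    return s
```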

Variables known in advance, or conditioning variables:
- Passed as a scalar argument a in (0,1).
- Rationale: make it easier to design general purpose conditional prediction algorithms.
- Examples: business day indicator; size of trade; joystick button up.

Hyper-parameter space:
- A float r in (0,1).
- This package provides some conventions for expanding to R^n using space-filling curves, so that the callee's (hyper-)parameter optimization can still exploit geometry, if it wants to.
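One way such an expansion can work (a sketch only; the package's actual convention lives in its demo scripts) is to de-interleave the binary digits of r, in the style of a Morton or Z-order curve:

```python
def from_space(r: float, n: int = 2, bits: int = 16) -> list:
    """Unfold a scalar r in (0,1) into n coordinates in (0,1) by de-interleaving
    its binary digits (Z-order style). Illustrative; not the package's exact scheme."""
    m = int(r * 2 ** (n * bits))             # Fixed-point representation of r
    coords = [0] * n
    for b in range(n * bits):
        coords[b % n] |= ((m >> b) & 1) << (b // n)   # Send bit b of m to coordinate b % n
    return [c / 2 ** bits for c in coords]
```

For example, from_space(0.75, n=2) gives [0.5, 0.5], since the two leading bits of r land in the leading bits of the two coordinates.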

Ordering of parameters in the space-filling curve:
- The most important variables should be listed first, as they vary more slowly.
- See the picture below, or the video.
Space-filling conventions for a and r
The script demo_balanced_log_scale.py illustrates the quasi-logarithmic parameter mapping from r in (0,1) to R.
The script demo_param_ordering.py illustrates the mapping from r in (0,1) to R^n. Observe why the most important parameter should be listed first: it will vary more smoothly as we vary r.
FAQ:
Question 1. Why not have the model persist the state?
Answer 1. Go ahead:
```python
class Predictor:

    def __init__(self, f):
        self.f = f
        self.s = None

    def __call__(self, y, k=1, a=None, t=None, e=None):
        x, self.s, _ = self.f(y=y, s=self.s, k=k, a=a, t=t, e=e)
        return x
```
or write a decorator. However:
- We have lambda patterns in mind.
- The caller has more control in the functional setup (e.g. for multiple conditional forecasts).
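The decorator alternative mentioned above might look like this (a sketch; stateful and last_value are illustrative names, not package functions):

```python
import functools

def stateful(f):
    """Wrap a skater f so the wrapper, rather than the caller, persists state."""
    box = {'s': None}                       # Closure cell holding the evolving state
    @functools.wraps(f)
    def wrapped(y, k=1, a=None, t=None, e=None, r=None):
        x, box['s'], _ = f(y=y, s=box['s'], k=k, a=a, t=t, e=e, r=r)
        return x
    return wrapped

@stateful
def last_value(y, s=None, k=1, a=None, t=None, e=None, r=None):
    # Hypothetical skater: predicts the last observed value
    return y, s, None
```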
Question 2. Why do it in this bare-bones manner, with squished parameter spaces?
Answer 2. The intent is to produce lambda-friendly models, but also:
- Comparison, combination and search across models, made possible by
- A reasonable way to map the most important hyper-parameter choices (we hope),
- Which imposes some geometric discipline on the hyper-parameter space (e.g. most important first), and
- Enables search across packages which have entirely different conventions and hyper-parameter spaces.
Observe that this package wraps some partial functionality of some time series prediction libraries. Those libraries could not be further removed from the above in that they:
- Use pandas DataFrames
- Bundle data with prediction logic
- Rely on column naming conventions
- Require 10-20 lines of setup code before a prediction can be made
- Require tracing into the code to infer intent
- Use conventions such as '5min' which not everyone agrees on
This package should not be viewed as an attempt to wrap most of the functionality of these packages. If you have patterns in mind that match them, and you are confident of their performance, you are best served to use them directly.
Scope and limitations
The simple interface is not well suited to problems where exogenous data comes and goes. You might consider a dictionary interface instead, as with the river package. It is also not well suited to fixed-horizon forecasting if the data isn't sampled very regularly, nor to prediction of multiple time series whose sampling occurs irregularly. Ordinal values can be kludged into the parameter space and action argument, but purely categorical values, not so much. And finally, if you don't like the idea of hyper-parameters lying in R^n, or don't see any obvious embedding, this might not be for you.
Yes, we're keen to receive PRs
If you'd like to contribute to this standardizing and benchmarking effort, here are some ideas:
- See the list of popular time series packages ranked by download popularity.
- Think about the most important hyper-parameters.
- Consider "warming up" the mapping (0,1) -> hyper-params by testing on real data. There is a tutorial on retrieving live data, or use the real data package, if that's simpler.
- The comparison of hyper-parameter optimization packages might also be helpful.
If you are the maintainer of a time series package, we'd love your feedback. And if you take the time to submit a PR here, do yourself a favor and also enable "supporting" on your repo.
Deployment
Some of these models are used as intermediate steps in the creation of distributional forecasts, at microprediction.org.