Automated weekly sequence-model workflow (LSTM / Transformer) for customer transaction prediction.

autoseqmodels

Automated weekly sequence-model workflow for customer transaction prediction. Provides an end-to-end pipeline from raw transaction tables to trained LSTM / Transformer models, with column-type inference, encoding strategy proposal, per-customer sequence construction, training, tuning (Optuna), and holdout evaluation.

Installation

pip install autoseqmodels

Or from a local clone:

pip install -e .

Expected input format

build_transaction_panel auto-detects the date column and reconciles merge-key dtypes, but it does not infer any other column roles. Detection degrades silently if the inputs don't match these expectations:

  • tx_df (raw transactions, one row per purchase) must contain a customer-identifier column (the name you pass as merge_on, e.g. "Id") and exactly one date-typed column (either datetime64 dtype, or a string column whose name contains "date" and parses as a date). Any other columns are ignored at panel-building time but flow through to the panel and are typed in step 3.
  • cov_df (optional covariate calendar) must contain the same merge_on column, exactly one date-typed column (the per-week observation date), and one row per (customer, week). Merge-key dtypes are coerced automatically across the two tables, so an int64 Id on one side and a string Id on the other is fine.
  • The downstream build_transaction_sequences step expects the panel to carry a transaction_count column (added by build_transaction_panel) plus the customer and date columns; everything else is treated as a feature.
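
The expectations above can be checked before any panel is built. The following is a minimal pre-flight sketch in plain Python, assuming the raw transactions are available as a list of dictionaries. `find_date_columns` and `check_tx_rows` are hypothetical helpers, not part of autoseqmodels, and the date rule only approximates the one described above (ISO-formatted strings are assumed).

```python
from datetime import date, datetime


def _parses_as_date(value):
    """True if value is an ISO-8601 date or datetime string."""
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def find_date_columns(rows):
    """Return the columns that look date-typed (hypothetical helper)."""
    found = []
    for col in rows[0]:
        values = [row[col] for row in rows]
        if all(isinstance(v, (date, datetime)) for v in values):
            found.append(col)
        elif "date" in col.lower() and all(_parses_as_date(v) for v in values):
            found.append(col)
    return found


def check_tx_rows(rows, merge_on):
    """Raise ValueError unless rows carry merge_on and exactly one date column."""
    if not rows:
        raise ValueError("no rows")
    if merge_on not in rows[0]:
        raise ValueError(f"missing customer column {merge_on!r}")
    date_cols = find_date_columns(rows)
    if len(date_cols) != 1:
        raise ValueError(f"expected exactly one date column, found {date_cols}")
    return date_cols[0]


rows = [
    {"Id": 1, "Date": "1999-01-04", "Price": 19.9},
    {"Id": 2, "Date": "1999-01-11", "Price": 5.0},
]
date_col = check_tx_rows(rows, merge_on="Id")
```

Failing loudly here is the point of the sketch: a table with zero or two date-like columns is rejected up front instead of degrading silently later.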

Workflow

Steps 1-7 prepare the data and are identical for both training paths. Step 8 has two variants — pick one:

  • 8a trains a single model with hyperparameters you set yourself.
  • 8b runs an Optuna search, then refits with the best trial.

Steps 1-7 — data preparation (shared)

from autoseqmodels import (
    loader, inspection, encoders, sequence_builder,
)

# 1. Load data (CSV / Excel / RData); the file names are placeholders
tx_df  = loader.load_table("transactions.csv")
cov_df = loader.load_table("covariates.csv")   # optional covariate calendar

# 2. Aggregate raw transactions to a (customer, week) panel.
#    Pick ONE of the two modes:
#      a) With a covariate calendar: pass tx_df + cov_df
panel = loader.build_transaction_panel(tx_df, cov_df, merge_on="Id")
#      b) Transactions only (no covariates): omit cov_df
# panel = loader.build_transaction_panel(tx_df, merge_on="Id")
#    The transactions-only mode returns one row per (customer, purchase-week);
#    sequence_builder fills zero-transaction weeks itself in step 7.

# 3. Detect column types (user-editable)
detected = inspection.infer_column_types(panel)
panel    = inspection.cast_columns_by_detected_type(panel, detected)

# 4. Resolve entity / date / target + covariate plan
structure, plan = inspection.analyze_structure(panel, detected)

# 5. Propose encoding strategy (user-editable)
strategy = encoders.propose_encodings(panel, detected, plan)

# 6. Fit encoders on training rows only
enc_df, spec = encoders.apply_encodings(panel, strategy, ...)
plan         = encoders.expand_plan(plan, spec)

# 7. Build per-customer sequences
seqs = sequence_builder.build_transaction_sequences(enc_df, ...)

Step 8a — train with fixed hyperparameters

Use this when you already know the hyperparameters you want or just need a quick baseline. No Optuna involved.

from autoseqmodels import training, sequence_lstm

ds                = training.SequenceDataset(seqs)
train_ds, val_ds  = training.make_train_val_split(ds, val_fraction=0.1, seed=42)

model = sequence_lstm.SequenceLSTM.from_sequences(
    seqs, hidden_size=128, n_layers=1, dropout=0.1,
)
history = training.train_model(model, train_ds, val_ds, lr=1e-3, epochs=30)
preds   = sequence_lstm.predict_holdout(model, seqs)
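
Once holdout predictions exist, accuracy can be summarised with a few aggregate numbers. The sketch below computes MAE, RMSE and the error in the total transaction count in plain Python. It assumes predictions and actuals are nested lists with one row per customer and one value per holdout week; the real layout of preds is not documented here, and `holdout_metrics` is a hypothetical helper rather than the package's evaluate_predictions.

```python
import math


def holdout_metrics(actual, predicted):
    """MAE and RMSE over all customer-weeks, plus the total-count error."""
    pairs = [
        (a, p)
        for a_row, p_row in zip(actual, predicted)
        for a, p in zip(a_row, p_row)
    ]
    mae = sum(abs(a - p) for a, p in pairs) / len(pairs)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))
    # Positive means the model over-predicts total volume.
    total_error = sum(p for _, p in pairs) - sum(a for a, _ in pairs)
    return {"mae": mae, "rmse": rmse, "total_error": total_error}


actual    = [[1, 0, 2], [0, 0, 1]]                # two customers, three weeks
predicted = [[1.0, 0.5, 1.0], [0.0, 0.5, 1.0]]
metrics   = holdout_metrics(actual, predicted)
```

In this toy case the per-week errors cancel in aggregate, which is why the total-count error is worth reporting alongside MAE and RMSE.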

Step 8b — train with Optuna hyperparameter search

tune_lstm runs the search; train_tuned_lstm is the post-tuning refit and requires the resulting study plus the same train/val subsets. The study persists to SQLite when storage is given, so re-running the same study_name resumes from where it left off.

from autoseqmodels import training, sequence_lstm

ds                = training.SequenceDataset(seqs)
train_ds, val_ds  = training.make_train_val_split(ds, val_fraction=0.1, seed=42)

study = sequence_lstm.tune_lstm(
    seqs,
    train_ds,
    val_ds,
    n_trials   = 30,
    max_epochs = 100,
    storage    = "sqlite:///optuna_lstm.db",
    study_name = "lstm_v1",
)

model, history = sequence_lstm.train_tuned_lstm(seqs, study, train_ds, val_ds)
preds          = sequence_lstm.predict_holdout(model, seqs)

The Transformer variant mirrors both step-8 paths under sequence_transformer: build with SequenceTransformer.from_sequences → training.train_model (8a), or tune_transformer → train_tuned_transformer → predict_holdout_transformer (8b).

Minimal runnable example

End-to-end on the bundled Electronic.csv (Id, Date, Price, 829 customers, transactions from 1999-01-01 to 2004-11-30). Uses the transactions-only mode of build_transaction_panel so no covariate file is needed; the same script works with a covariate calendar by passing it as the second positional argument.

from autoseqmodels import (
    loader, inspection, encoders, sequence_builder,
    training, sequence_lstm,
)

# 1-2. Load and aggregate to a (customer, week) panel
tx_df = loader.load_table("Datasets/Electronic.csv")
panel = loader.build_transaction_panel(tx_df, merge_on="Id")

# 3. Detect column types and cast
detected = inspection.infer_column_types(panel)
panel    = inspection.cast_columns_by_detected_type(panel, detected)

# 4. Resolve entity / date / target + covariate plan
structure, plan = inspection.analyze_structure(
    panel, detected, target_col="transaction_count"
)

# 5-6. Encoding strategy + fit on training rows
strategy        = encoders.propose_encodings(panel, detected, plan)
train_mask      = panel["Date"] <= "2003-12-31"
enc_df, spec    = encoders.apply_encodings(panel, strategy, train_mask=train_mask)
plan            = encoders.expand_plan(plan, spec)

# 7. Build per-customer sequences (3-year calibration, 11-month holdout)
seqs = sequence_builder.build_transaction_sequences(
    enc_df,
    account_col     = "Id",
    date_col        = "Date",
    training_start  = "2001-01-01",
    training_end    = "2003-12-31",
    holdout_start   = "2004-01-01",
    holdout_end     = "2004-11-30",
    transaction_col = "transaction_count",
    plan            = plan,
    seasonality     = ["woy"],
)

# 8. Tune with Optuna, refit on the best trial, then roll out the holdout
ds                = training.SequenceDataset(seqs)
train_ds, val_ds  = training.make_train_val_split(ds, val_fraction=0.1, seed=42)

study = sequence_lstm.tune_lstm(
    seqs, train_ds, val_ds,
    n_trials   = 10,
    max_epochs = 30,
    storage    = "sqlite:///optuna_lstm.db",
    study_name = "electronic_demo",
)
model, history = sequence_lstm.train_tuned_lstm(seqs, study, train_ds, val_ds)
preds          = sequence_lstm.predict_holdout(model, seqs)

To verify the install in a few minutes, drop n_trials and max_epochs to small values (e.g. 3 and 10). Re-running the same script with the same study_name resumes the existing Optuna study from the SQLite file.

Modules

loader — input I/O and panel aggregation

  • load_table(path, ...) — read .csv, .xlsx/.xls, .RData/.rda. For R files, passing r_covariates_object_name (and optionally r_base_object_name) returns a (base_df, covariates_df) tuple in one call.
  • build_transaction_panel(tx_df, cov_df=None, merge_on=...) — aggregate raw transactions into a weekly (customer, week) panel. With cov_df, left-joins counts onto a covariate calendar (missing weeks → 0). Without cov_df, returns one row per (customer, purchase-week); the sequence builder fills zero weeks itself. Auto-detects the date column and reconciles merge-key dtypes across the two tables.
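
The transactions-only aggregation can be pictured as a group-and-count over (customer, week). The sketch below uses only the standard library and assumes each week is keyed by its Monday; the week convention actually used by build_transaction_panel is not stated here, and `week_start` and `weekly_counts` are hypothetical names.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d):
    """Monday of the week containing d (an assumed week convention)."""
    return d - timedelta(days=d.weekday())


def weekly_counts(transactions):
    """Count purchases per (customer, week) from (customer_id, date) pairs."""
    counts = Counter((cid, week_start(d)) for cid, d in transactions)
    return [
        {"Id": cid, "Date": wk, "transaction_count": n}
        for (cid, wk), n in sorted(counts.items())
    ]


transactions = [
    (1, date(1999, 1, 4)),   # Monday
    (1, date(1999, 1, 6)),   # same week as the row above
    (1, date(1999, 1, 20)),  # two weeks later
    (2, date(1999, 1, 5)),
]
panel = weekly_counts(transactions)
```

Customer 1 has no row for the week of 1999-01-11, which matches the "one row per (customer, purchase-week)" behaviour: empty weeks are left for the sequence builder to fill.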

inspection — column typing & structural analysis

  • infer_column_types(df, config=TypeDetectionConfig()) — Auto-Prep-style detection: id / bool / date / time / category / string, plus a statistical role (identifier / discrete / continuous / binary / categorical / temporal / text). Output is a typed summary DataFrame (detected) the user can edit before casting.
  • cast_columns_by_detected_type(df, detected) — apply the inferred dtypes column by column.
  • propose_structure(detected) → DataStructure — pick entity_col, date_col (highest-cardinality date), and surface alternatives as entity_candidates / date_candidates.
  • classify_time_variance(df, entity_col, detected) — label every column as invariant / variant per entity, and pick the main transaction date.
  • plan_covariates(df, detected, structure) → CovariatePlan — classify every column as static_cols, time_varying_cols, or skip_cols, with per-column encoding hints.
  • analyze_structure(df, detected, unknown_future_cols=None, target_col=None) — one-shot wrapper that runs propose_structure + plan_covariates and removes the unknown_future_cols (columns whose future values aren't known at prediction time) from time_varying_cols.
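
The split between static and time-varying columns comes down to whether a column ever changes within one customer. The following sketch shows one plausible reading of that rule in plain Python; `label_time_variance` is a hypothetical stand-in and makes no claim about how classify_time_variance is implemented.

```python
from collections import defaultdict


def label_time_variance(rows, entity_col):
    """Label each non-entity column 'invariant' or 'variant' per entity."""
    seen = defaultdict(lambda: defaultdict(set))  # column -> entity -> values
    for row in rows:
        for col, value in row.items():
            if col != entity_col:
                seen[col][row[entity_col]].add(value)
    return {
        col: "invariant"
        if all(len(values) == 1 for values in per_entity.values())
        else "variant"
        for col, per_entity in seen.items()
    }


rows = [
    {"Id": 1, "Region": "north", "Price": 19.9},
    {"Id": 1, "Region": "north", "Price": 5.0},
    {"Id": 2, "Region": "south", "Price": 7.5},
]
labels = label_time_variance(rows, entity_col="Id")
```

Here Region never changes within a customer and would be a candidate static column, while Price changes between rows of customer 1 and would be treated as time-varying.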

encoders — turning columns into model-ready numerics

  • propose_encodings(df, detected, plan, ...) — pick a strategy per column: EMBED (id or high-cardinality category), ONE-HOT (low-cardinality), SCALE (numeric, Standard or MinMax), or PASSTHROUGH (binary). Embedding dimension follows the Guo & Berkhahn rule. Strategy is user-editable.
  • apply_encodings(df, strategy, train_mask=...) — fit encoders on training rows only, transform the whole panel, return (enc_df, spec) where spec: EncodingSpec carries the embedding vocabularies, scaler parameters, one-hot category orders, and the total input_width.
  • expand_plan(plan, spec) — rewrite the CovariatePlan so one-hot expansion columns replace their source column.
  • prepare_encodings(...) / encoding_report(spec) — convenience wrapper and a human-readable summary table.
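
Fitting on training rows only keeps holdout information out of the scaler parameters. The sketch below illustrates the idea for standard scaling with the standard library; `fit_standard_scaler` and `transform` are hypothetical helpers, and the use of the population standard deviation is an assumption, not a statement about the package.

```python
from statistics import mean, pstdev


def fit_standard_scaler(values, train_mask):
    """Estimate mean and std from the rows where train_mask is True."""
    train = [v for v, keep in zip(values, train_mask) if keep]
    mu = mean(train)
    sigma = pstdev(train) or 1.0  # guard against constant columns
    return mu, sigma


def transform(values, mu, sigma):
    """Apply the fitted parameters to every row, training or not."""
    return [(v - mu) / sigma for v in values]


prices     = [10.0, 20.0, 30.0, 100.0]
train_mask = [True, True, True, False]   # last row belongs to the holdout
mu, sigma  = fit_standard_scaler(prices, train_mask)
scaled     = transform(prices, mu, sigma)
```

The holdout value of 100.0 does not influence mu or sigma; it is simply mapped through parameters learned from the first three rows.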

sequence_builder — per-customer weekly sequences

  • build_transaction_sequences(df, *, account_col, date_col, training_start, training_end, holdout_start, holdout_end, transaction_col="transaction_count", plan=None, covariate_cols=None, time_varying_cols=None, seasonality=None, clip_transactions=None) — constructs its own universal weekly calendar from (training_start, holdout_end), merges each customer's transactions onto it (zero-filling missing weeks), and emits a dict with samples / targets (calibration shifted by one), calibration (full training tensor), holdout (rollout tensor), account_ids, n_features, feature_names, total_transactions. Optional seasonality features: woy / moy / dow / year (sin/cos pairs).
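
The calendar, zero-filling and one-step shift described above can be sketched for a single customer as follows. This is an illustration in plain Python under simplifying assumptions (weeks step by seven days from the start date, counts are keyed by week start); `weekly_calendar` and `zero_filled_series` are hypothetical names, not the package's internals.

```python
from datetime import date, timedelta


def weekly_calendar(start, end):
    """All week-start dates from start to end inclusive, stepping 7 days."""
    weeks, current = [], start
    while current <= end:
        weeks.append(current)
        current += timedelta(days=7)
    return weeks


def zero_filled_series(purchases, calendar):
    """Map {week: count} onto the calendar, using 0 for missing weeks."""
    return [purchases.get(week, 0) for week in calendar]


calendar = weekly_calendar(date(2001, 1, 1), date(2001, 2, 5))
series   = zero_filled_series(
    {date(2001, 1, 1): 2, date(2001, 1, 22): 1}, calendar
)

# One-step-ahead pairs: the input at week t predicts the count at week t + 1.
samples, targets = series[:-1], series[1:]
```

Because every customer is laid onto the same calendar, all sequences share one length, which is what lets them be stacked into a single tensor.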

models.training — dataset, split, train loop, evaluation

  • SequenceDataset — wraps the dict from build_transaction_sequences for use with DataLoader.
  • make_train_val_split(dataset, val_fraction=..., seed=...) — customer-level random split (no leakage between train and val).
  • train_model(model, train_ds, val_ds, ...) — generic training loop with early stopping, returns the per-epoch loss history.
  • evaluate_predictions(...) — MAE / RMSE / aggregate-count metrics on the holdout rollout.
  • plot_training_and_predictions(...) — quick-look loss curves and predicted-vs-actual transaction plots.
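
A customer-level split assigns each customer, with all of their weeks, to exactly one side. The sketch below shows the idea with a seeded shuffle of customer indices; `customer_level_split` is a hypothetical helper, and the rounding of the validation share is an assumption rather than the behaviour of make_train_val_split.

```python
import random


def customer_level_split(n_customers, val_fraction=0.1, seed=42):
    """Shuffle customer indices once and carve off a validation share."""
    indices = list(range(n_customers))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_val = max(1, round(n_customers * val_fraction))
    return sorted(indices[n_val:]), sorted(indices[:n_val])


train_idx, val_idx = customer_level_split(829, val_fraction=0.1, seed=42)
```

Splitting by customer rather than by row means no customer's history appears on both sides, and a fixed seed makes the split reproducible across tuning and refit.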

models.sequence_lstm — LSTM model + Optuna tuning

  • SequenceLSTM(nn.Module) — embedding layers (one per id / categorical), multi-layer LSTM, regression head; consumes the feature_names order produced by build_transaction_sequences.
  • predict_holdout(model, seqs) — autoregressive holdout rollout, returns predicted weekly counts per customer.
  • tune_lstm(seqs, train_ds, val_ds, *, n_trials, max_epochs, storage=None, study_name=None, fixed_params=None) — Optuna search over hidden_size, n_layers, dropout, learning rate, batch size, … Studies persist to SQLite (sqlite:///optuna_lstm.db) when storage is given, so runs are resumable. Any param can be pinned via fixed_params.
  • train_tuned_lstm(seqs, study, train_ds, val_ds) — refit with the best trial's hyperparameters, returning (model, history).
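
An autoregressive rollout predicts one week, appends that prediction to the input history, and repeats. The sketch below shows the loop with a trivial stand-in predictor in place of a trained network; `rollout` and `mean_of_last_four` are hypothetical and say nothing about how predict_holdout handles covariates or hidden state.

```python
def rollout(predict_next, history, n_steps):
    """Roll a one-step predictor forward, feeding predictions back in."""
    window = list(history)
    predictions = []
    for _ in range(n_steps):
        next_value = predict_next(window)
        predictions.append(next_value)
        window.append(next_value)  # the prediction becomes the next input
    return predictions


def mean_of_last_four(window):
    """Stand-in for a trained model: average of the last four weeks."""
    recent = window[-4:]
    return sum(recent) / len(recent)


calibration = [2, 0, 1, 1]
forecast = rollout(mean_of_last_four, calibration, n_steps=3)
```

Since each step consumes earlier predictions, errors can compound over the holdout horizon, which is one reason to inspect aggregate counts as well as per-week error.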

models.sequence_transformer — Transformer mirror

  • SequenceTransformer(nn.Module) — same input contract as SequenceLSTM but with positional encoding and a TransformerEncoder stack; n_heads is auto-picked to divide d_model.
  • predict_holdout_transformer, tune_transformer, train_tuned_transformer — direct counterparts of the LSTM helpers, with their own SQLite study (optuna_transformer.db).
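
Multi-head attention in PyTorch requires the embedding dimension to be divisible by the number of heads, which is why a head count has to be chosen to fit d_model. The sketch below shows one simple way to do that; `pick_n_heads` and its `preferred` parameter are hypothetical, and the rule actually used by SequenceTransformer may differ.

```python
def pick_n_heads(d_model, preferred=8):
    """Largest head count <= preferred that divides d_model evenly."""
    for n_heads in range(min(preferred, d_model), 0, -1):
        if d_model % n_heads == 0:
            return n_heads
    raise ValueError("d_model must be a positive integer")


heads_for_64  = pick_n_heads(64)    # 8 divides 64
heads_for_100 = pick_n_heads(100)   # falls back to a smaller divisor
```

Searching downward from the preferred value keeps the head count as close as possible to the request while guaranteeing a valid configuration for any positive d_model.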

Requirements

Python ≥ 3.10. See pyproject.toml for the full dependency list.

License

MIT — see LICENSE.
