Skip to main content

State of the Art Neural Networks for Tabular Deep Learning

Project description

pyradox-tabular

State of the Art Neural Networks for Tabular Deep Learning

Downloads Downloads Downloads


Table of Contents


Installation

pip install pyradox-tabular

Usage

Data Preparation

pyradox-tabular comes with its own DataLoader Class which can be used to load data from a pandas DataFrame. We provide a utility DataConfig class which stores the configuration of the data, which are then required by the model for feature preprocessing. We also provide seperate ModelConfig classes for the different models, which ae required to store the model hyperparamers.

from pyradox_tabular.data import DataLoader
from pyradox_tabular.data_config import DataConfig

data_config = DataConfig(
    numeric_feature_names=["numerical", "column","names"],
    categorical_features_with_vocabulary={
        "column": ["label", "encoded", "unique", "values", "as", "strings"],
    },
)

data_train = DataLoader.from_df(x_train, y_train, batch_size=1024)
data_valid = DataLoader.from_df(x_valid, y_valid, batch_size=1024)
data_test = DataLoader.from_df(x_test, batch_size=1024)

This library provides the implementations of the following tabular deep learning models:

Deep Tabular Network

In principle a neural network can approximate any continuous function and piece wise continuous function. However, it is not suitable to approximate arbitrary non-continuous functions as it assumes certain level of continuity in its general form.

Unlike unstructured data found in nature, structured data with categorical features may not have continuity at all and even if it has it may not be so obvious.

Deep Tabular Network use the entity embedding method to automatically learn the representation of categorical features in multi-dimensional spaces which reveals the intrinsic continuity of the data and helps neural networks to solve the problem.

from pyradox_tabular.model_config import DeepNetworkConfig
from pyradox_tabular.nn import DeepTabularNetwork

model_config = DeepNetworkConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64])
model = DeepTabularNetwork.from_config(data_config, model_config, name="deep_network")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Wide and Deep Tabular Network

The human brain is a sophisticated learning machine, forming rules by memorizing everyday events and generalizing those learnings to apply tothings we haven't seen before. Perhaps more powerfully, memorization also allows us to further refine our generalized rules with exceptions.

By jointly training a wide linear model (for memorization) alongside a deep neural network (for generalization) Wide and Deep Tabular Networks combine the strengths of both to bring us one step closer to teach computers to learn like humans do.

from pyradox_tabular.model_config import WideAndDeepNetworkConfig
from pyradox_tabular.nn import WideAndDeepTabularNetwork

model_config = WideAndDeepNetworkConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64])
model = WideAndDeepTabularNetwork.from_config(data_config, model_config, name="wide_deep_network")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Deep and Cross Tabular Network

Feature engineering has been the key to the success of many prediction models. However, the process is nontrivial and often requires manual feature engineering or exhaustive searching. DNNs are able to automatically learn feature interactions; however, they generate all the interactions implicitly, and are not necessarily efficient in learning all types of cross features.

Deep and Cross Tabular Network explicitly applies feature crossing at each layer, requires no manual feature engineering, and adds negligible extra complexity to the DNN model.

from pyradox_tabular.model_config import DeepAndCrossNetworkConfig
from pyradox_tabular.nn import DeepAndCrossTabularNetwork

model_config = DeepAndCrossNetworkConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64], n_cross=2)
model = DeepAndCrossTabularNetwork.from_config(data_config, model_config, name="deep_cross_network")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

TabTansformer

TabTransformer is built upon self-attention based on Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy.

The contextual embeddings learned from TabTransformer are highly robust against both missing and noisy data features, and provide better interpretability.

from pyradox_tabular.model_config import TabTransformerConfig
from pyradox_tabular.nn import TabTransformer

model_config = TabTransformerConfig(num_outputs=1, out_activation='sigmoid', num_transformer_blocks=3, num_heads=4, mlp_hidden_units_factors=[2, 1])
model = TabTransformer.from_config(data_config, model_config, name="tab_transformer")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

TabNet

TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and better learning as the learning capacity is used for the most salient features.

It employs a single deep learning architecture for feature selection and reasoning.

from pyradox_tabular.model_config import TabNetConfig
from pyradox_tabular.nn import TabNet

model_config = TabNetConfig(num_outputs=1, out_activation='sigmoid',feature_dim=16, output_dim=12, num_decision_steps=5)
model = TabNet.from_config(data_config, model_config, name="tabnet")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Deep Neural Decision Tree

Deep Neural Decision Trees unifies classification trees with the representation learning functionality known from deep convolutional network. These are essentially a stochastic and differentiable decision tree model.

from pyradox_tabular.model_config import NeuralDecisionTreeConfig
from pyradox_tabular.nn import NeuralDecisionTree

model_config = NeuralDecisionTreeConfig(depth=2, used_features_rate=1, num_classes=2)
model = NeuralDecisionTree.from_config(data_config, model_config, name="deep_neural_decision_tree")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Deep Neural Decision Forest

A Deep Neural Decision Forest is an bagging ensemble of Deep Neural Decision Trees.

from pyradox_tabular.model_config import NeuralDecisionForestConfig
from pyradox_tabular.nn import NeuralDecisionForest

model_config = NeuralDecisionForestConfig(num_trees=10, depth=2, used_features_rate=0.8, num_classes=2)
model = NeuralDecisionForest.from_config(data_config, model_config, name="deep_neural_decision_forest")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Neural Oblivious Decision Tree

from pyradox_tabular.model_config import NeuralObliviousDecisionTreeConfig
from pyradox_tabular.nn import NeuralObliviousDecisionTree

model_config = NeuralObliviousDecisionTreeConfig()
model = NeuralObliviousDecisionTree.from_config(data_config, model_config, name="neural_oblivious_decision_tree")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Neural Oblivious Decision Ensemble

NODE architecture generalizes ensembles of oblivious decision trees, but benefits from both end-to-end gradient-based optimization and the power of multi-layer hierarchical representation learning.

from pyradox_tabular.model_config import NeuralObliviousDecisionEnsembleConfig
from pyradox_tabular.nn import NeuralObliviousDecisionEnsemble

model_config = NeuralObliviousDecisionEnsembleConfig()
model = NeuralObliviousDecisionEnsemble.from_config(data_config, model_config, name="neural_oblivious_decision_ensemble")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Feature Tokenizer Transformer

It is a simple adaptation of the Transformer architecture for the tabular domain. In a nutshell, Feature Tokenizer Transformer transforms all features (categorical and numerical) to embeddings and applies a stack of Transformer layers to the embeddings.

Thus, every Transformer layer operates on the feature level of one object.

from pyradox_tabular.model_config import FeatureTokenizerTransformerConfig
from pyradox_tabular.nn import FeatureTokenizerTransformer

model_config = FeatureTokenizerTransformerConfig(num_outputs=1, out_activation='sigmoid', num_transformer_blocks=2, num_heads=8, embedding_dim=32, dense_dim=16)
model = FeatureTokenizerTransformer.from_config(data_config, model_config, name="feature_tokenizer_transformer")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

Tabular ResNet

Tabular Resnet is a ResNet like architecture containing skip connection but instead of Convolutional Layers, it consists of Linear Layers.

from pyradox_tabular.model_config import TabularResNetConfig
from pyradox_tabular.nn import TabularResNet

model_config = TabularResNetConfig(num_outputs=1, out_activation='sigmoid', hidden_units=[64, 64])
model = TabularResNet.from_config(data_config, model_config, name="deep_network")
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(data_train, validation_data=data_valid)
preds = model.predict(data_test)

References

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyradox-tabular-1.3.0.tar.gz (17.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pyradox_tabular-1.3.0-py3-none-any.whl (22.6 kB view details)

Uploaded Python 3

File details

Details for the file pyradox-tabular-1.3.0.tar.gz.

File metadata

  • Download URL: pyradox-tabular-1.3.0.tar.gz
  • Upload date:
  • Size: 17.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.25.1 requests-toolbelt/0.9.1 urllib3/1.26.4 tqdm/4.59.0 importlib-metadata/4.10.0 keyring/22.3.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.8

File hashes

Hashes for pyradox-tabular-1.3.0.tar.gz
Algorithm Hash digest
SHA256 91ec956278f49bef2398390d2c05524a5febf978c1dc27abb672437bc65277d8
MD5 b56414cf0349685534378a042c76a746
BLAKE2b-256 0b2dd9555ba2db19b4308432fe082398ddec02b4afb07763c361c9e5b42c0ef4

See more details on using hashes here.

File details

Details for the file pyradox_tabular-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: pyradox_tabular-1.3.0-py3-none-any.whl
  • Upload date:
  • Size: 22.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.25.1 requests-toolbelt/0.9.1 urllib3/1.26.4 tqdm/4.59.0 importlib-metadata/4.10.0 keyring/22.3.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.8

File hashes

Hashes for pyradox_tabular-1.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 33eb44a530dfc68e2fec61613f4d3b725792c273e87b6358c15a777c093c73b1
MD5 d1ffad93f857a8503f601a0251fd6c6a
BLAKE2b-256 7ca266cad762d4e35967ad5449aed03b3f62713b7198551428b4fb2ffd4288bd

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page