Deep-learning Toolkit for Tabular datasets

DeepTables

DeepTables: Deep-learning Toolkit for Tabular data

DeepTables (DT) is an easy-to-use toolkit that brings the power of deep learning to tabular data.

Overview

MLPs (also known as fully connected neural networks) have been shown to be inefficient at learning distribution representations. The additive operations of the perceptron layer perform poorly at capturing multiplicative feature interactions. In most cases, manual feature engineering is necessary, and this work requires extensive domain knowledge and is very cumbersome. How to learn feature interactions efficiently in neural networks therefore becomes the most important problem.

Various models have been proposed for CTR prediction, and in recent years they have continued to outperform earlier state-of-the-art approaches. Well-known examples include FM, DeepFM, Wide&Deep, DCN, and PNN. These models can also perform well on tabular data when used appropriately.
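To make "multiplicative feature interactions" concrete, here is a minimal sketch of the second-order term used by factorization machines (FM), one of the models named above. It is written in plain Python for illustration only and is not DeepTables code; the function names and the toy inputs `x` and `v` are assumptions made for this example. The second function uses the well-known sum-of-squares identity to compute the same value without looping over every feature pair.

```python
from itertools import combinations


def fm_pairwise_naive(x, v):
    """Sum of <v_i, v_j> * x_i * x_j over all feature pairs i < j."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(v[i], v[j]))
        total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Same quantity in O(n * k) time via the sum-of-squares identity."""
    k = len(v[0])  # size of each feature's latent vector
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += s * s - s_sq
    return 0.5 * total


# Toy input (hypothetical): 3 features, latent vectors of size 2.
x = [1.0, 0.0, 2.0]
v = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

print(fm_pairwise_naive(x, v))
print(fm_pairwise_fast(x, v))
```

Both calls return the same interaction score up to floating-point rounding. A purely additive layer has no such product term, so it has to approximate it indirectly.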

DT aims to apply the latest research findings to provide users with an end-to-end toolkit for tabular data.

DT has been designed with these key goals in mind:

  • Easy to use, even for non-experts.
  • Good performance out of the box.
  • A flexible architecture that users can easily extend.

Tutorials

Please refer to the official docs at https://deeptables.readthedocs.io/en/latest/.

Installation

pip install deeptables

GPU Setup (Optional)

pip install deeptables[gpu]

Verify the install:

python -c "from deeptables.utils.quicktest import test; test()"

Optional dependencies

The following libraries are not hard dependencies and are not installed automatically when you install DeepTables. To use all of DT's functionality, install them separately.

pip install shap
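Before relying on features that need an optional package, it can help to check whether the package is importable. The sketch below uses only the Python standard library; the helper name `missing_optional` is hypothetical and is not part of DeepTables.

```python
import importlib.util


def missing_optional(packages=("shap",)):
    """Return the top-level package names in `packages` that cannot be imported."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = missing_optional()
if missing:
    print("Missing optional packages:", ", ".join(missing))
else:
    print("All optional packages are available.")
```

The check looks up the package without importing it, so it is cheap to run at startup.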

Example:

A simple binary classification example

from deeptables.models import deeptable, deepnets
from deeptables.datasets import dsutils
from sklearn.model_selection import train_test_split

# Load the bundled bank dataset and hold out 20% of the rows for testing
df = dsutils.load_bank()
df_train, df_test = train_test_split(df, test_size=0.2, random_state=42)

# Separate the target column 'y' from the features
y = df_train.pop('y')
y_test = df_test.pop('y')

# Train a DeepFM network
config = deeptable.ModelConfig(nets=deepnets.DeepFM)
dt = deeptable.DeepTable(config=config)
model, history = dt.fit(df_train, y, epochs=10)

# Evaluate on the held-out test set
result = dt.evaluate(df_test, y_test, batch_size=512, verbose=0)
print(result)

# Score: predict on the test features
preds = dt.predict(df_test)
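As a follow-up to the scoring step, predictions can be compared with the held-out labels. The sketch below is plain Python and does not call DeepTables. It assumes the predictions are a one-dimensional sequence of class labels of the same type as the true labels, which is an assumption about the output shape, so flatten the predictions first if they are two-dimensional. The helper `accuracy` and the demo labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty sequence")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Hypothetical labels standing in for y_test and preds.
y_true_demo = ["yes", "no", "no", "yes"]
y_pred_demo = ["yes", "no", "yes", "yes"]
print(accuracy(y_true_demo, y_pred_demo))  # 0.75
```

With real data the call would look like `accuracy(y_test, preds)`, under the assumptions stated above.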

A solution built with DeepTables won 1st place in the Kaggle Categorical Feature Encoding Challenge II.

DataCanvas

DeepTables is an open source project created by DataCanvas.
