DynaTab: Dynamic Feature Ordering as Neural Rewiring for High-Dimensional Tabular Data
DynaTab is a neuro-inspired deep learning model for high-dimensional tabular data that tackles the Column Permutation Problem by dynamically reordering features instead of treating them as a fixed set. It predicts when feature ordering is beneficial using an intrinsic-dimensionality-based IDF/FOE criterion, then applies dynamic feature ordering (DFO) to rewire feature graphs and produce a task-aware global sequence. This reordered input is processed by an order-aware fusion block combining positional embeddings (OPE), importance gating (PIGL), and dynamic masked attention (DMA) on top of a sequential backbone (Transformer, DAE, LSTM, Mamba, or DAE-MHA-LSTM). It also empirically groups tabular datasets into five categories. Across 36 real-world datasets and over 45 baselines, DynaTab achieves strong, statistically significant gains, particularly in high-dimensional low-sample-size (HDLSS) and other complex regimes, positioning dynamic feature ordering as a powerful paradigm for order-sensitive backbones in tabular deep learning.
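To give a rough feel for the intrinsic-dimensionality idea behind the IDF/FOE criterion, the sketch below estimates a dataset's intrinsic dimension as the number of principal components needed to explain a fixed share of variance. This is an illustrative NumPy-only sketch, not the package's actual criterion; the function name `intrinsic_dim` and the 95% threshold are assumptions made here for demonstration.

```python
# Illustrative sketch (NOT the package's IDF/FOE implementation): estimate
# intrinsic dimensionality as the number of principal components needed to
# capture a variance threshold. A small intrinsic-to-ambient ratio is the
# kind of signal a "when to order features" criterion can exploit.
import numpy as np

def intrinsic_dim(X, var_threshold=0.95):
    """Smallest k such that the top-k PCs explain >= var_threshold of variance."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)        # singular values -> PCA variances
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(ratio, var_threshold) + 1)

rng = np.random.default_rng(0)
# 200 samples in a 50-dim ambient space, but the data lives on a 3-dim subspace
Z = rng.normal(size=(200, 3))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)           # equalize latent scales
W, _ = np.linalg.qr(rng.normal(size=(50, 3)))      # orthonormal embedding
X = Z @ W.T + 0.01 * rng.normal(size=(200, 50))    # tiny ambient noise
k = intrinsic_dim(X)
print(k, X.shape[1])                               # intrinsic vs. ambient dimension
```

For this synthetic data the estimate recovers the 3-dimensional latent structure despite the 50-dimensional ambient space; real tabular datasets sit on a spectrum between these extremes.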
Citation
Al Zadid Sultan Bin Habib, Gianfranco Doretto, and Donald A. Adjeroh. “DynaTab: Dynamic Feature Ordering as Neural Rewiring for High-Dimensional Tabular Data.” In AAAI 2026 First International Workshop on Neuro for AI & AI for Neuro: Towards Multi-Modal Natural Intelligence (NeuroAI) Workshop Proceedings (PMLR), 2026.
Bibtex:
@inproceedings{habib2026dynatab,
title = {{DynaTab: Dynamic Feature Ordering as Neural Rewiring for High-Dimensional Tabular Data}},
author = {Habib, Al Zadid Sultan Bin and Doretto, Gianfranco and Adjeroh, Donald A.},
booktitle = {Proceedings of the AAAI 2026 First International Workshop on Neuro for AI \& AI for Neuro: Towards Multi-Modal Natural Intelligence (NeuroAI)},
year = {2026},
series = {PMLR}
}
Files and Repository Structure
Python package: dynatab/
This folder contains the core DynaTab implementation (15 Python modules):
- `__init__.py` – Package initializer and high-level API exports.
- `model.py` – Main DynaTab model definition and wiring of all sub-modules.
- `dfo.py` – Dynamic Feature Ordering (DFO) module and clustering/graph construction.
- `ope.py` – Order-Aware Positional Embedding (OPE) implementation.
- `pigl.py` – Positional Importance Gating Layer (PIGL).
- `dma.py` – Dynamic Masked Attention (DMA) block.
- `seqprobinary.py` – Training loop / utilities for binary classification.
- `seqpromulti.py` – Training loop / utilities for multiclass classification.
- `seqproregression.py` – Training loop / utilities for regression.
- `preprocess.py` – Data preprocessing and tabular input utilities (splits, scaling, etc.).
- `metrics.py` – Evaluation metrics and helper functions.
- `estimator.py` – High-level estimator wrapper for running experiments (sklearn-style API).
- `idf_analyzer.py` – Intrinsic Dimensionality Factor (IDF) + FOE analyzer: “Feature Ordering – When to Use?”.
- `customloss.py` – Custom loss functions used by DynaTab.
- `trainer.py` – Generic training / validation loop utilities shared across tasks.
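As a conceptual illustration of the simplest kind of ordering the DFO module supports, the snippet below reorders columns by a per-feature statistic, mirroring `DFOConfig(metric="variance", order="descending")` from the examples later in this README. This is a toy sketch, not the actual `dfo.py` implementation (which additionally clusters features and builds a feature graph); `order_features_by_variance` is a name invented here.

```python
# Toy sketch of metric-based feature ordering (not the real DFO module, which
# also performs clustering and graph construction): reorder columns so the
# highest-variance feature comes first.
import numpy as np

def order_features_by_variance(X, descending=True):
    """Return X with columns permuted by variance, plus the permutation."""
    var = X.var(axis=0)
    order = np.argsort(-var) if descending else np.argsort(var)
    return X[:, order], order

X = np.array([[1.0, 10.0, 0.10],
              [2.0, 30.0, 0.20],
              [3.0, 50.0, 0.30]])
X_ordered, perm = order_features_by_variance(X)
print(perm)  # the middle column has the largest variance, so it moves first
```

An order-sensitive backbone then consumes `X_ordered` as a sequence, which is why the choice of metric and direction matters.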
Notebooks
- `DynaTab Dataset Complexity Analysis.ipynb` – Contains the experiments for the “Feature Ordering – When to Use?” section, including IDF / FOE computation across datasets.
- `DynaTab IDF Analyzer.ipynb` – Shows how to install/import the `dynatab` package and use `TabularIDFAnalyzer` to compute dataset complexity metrics with demo runs. The code cells illustrate how to use DynaTab to assess when feature ordering is appropriate for a given dataset.
- `DynaTab_Experiment1.ipynb` – Demonstrates how to use DynaTab for binary classification, multiclass classification, and regression, with or without Optuna-based hyperparameter tuning.
- `DynaTab_Experiment2.ipynb` – Demonstrates DynaTab on the GLI-85 HDLSS dataset for binary classification, without Optuna tuning, using Mamba or LSTM as the sequential processor backbone.

N.B.: The demo runs use only a small number of epochs and Optuna trials. For a complete run, use an appropriate number of Optuna trials so the search can find optimal hyperparameters.
Other top-level files
- `requirements.txt` – Python dependencies required to run the DynaTab package and notebooks.
- `DynaTab_Architecture.jpg` – High-level architecture diagram of the DynaTab framework.
- `LICENSE` – MIT license for this repository.
- `README.md` – Project overview, usage instructions, and citation information.
- `.gitignore` – Standard Git ignore rules for Python and Jupyter projects.
Tested Environment
- Python 3.8+
- torch 2.5.1+cu121 (CUDA 12.1)
- numpy 1.26.4
- pandas 2.2.3
- scikit-learn 1.5.2
- matplotlib 3.10.0
- scipy 1.11.4
- kmeans_gpu 0.0.5
Recommended PyTorch install (GPU, CUDA 12.1)
pip install "torch==2.5.1+cu121" --index-url https://download.pytorch.org/whl/cu121
Installation
You can install DynaTab in several ways depending on your workflow.
Option 1: Clone the Repository (Recommended for Development)
git clone https://github.com/zadid6pretam/DynaTab.git
cd DynaTab
pip install -r requirements.txt
pip install -e .
Option 2: Install Directly from GitHub (No Cloning Needed)
pip install "git+https://github.com/zadid6pretam/DynaTab.git"
Option 3: Use a Virtual Environment
python -m venv dynatab-env
source dynatab-env/bin/activate # On Windows: dynatab-env\Scripts\activate
git clone https://github.com/zadid6pretam/DynaTab.git
cd DynaTab
pip install -r requirements.txt
pip install -e .
Option 4: Local Install Without Editable Mode
git clone https://github.com/zadid6pretam/DynaTab.git
cd DynaTab
pip install -r requirements.txt
pip install .
Option 5: Install from PyPI (Planned)
pip install dynatab
Example Usage
Below are minimal examples for using DynaTab on standard binary, multiclass, and regression tasks.
For full HDLSS experiments and Optuna sweeps, see the accompanying Jupyter notebooks.
1. Binary Classification (Breast Cancer)
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from dynatab import (
DynaTabClassifier,
DFOConfig,
TrainConfig,
LossConfig,
)
# -----------------------------
# Data
# -----------------------------
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target) # 0/1 labels
X_train, X_test, y_train, y_test = train_test_split(
X, y,
test_size=0.2,
stratify=y,
random_state=42,
)
# -----------------------------
# DynaTab configs
# -----------------------------
dfo_cfg = DFOConfig(
metric="manhattan",
num_clusters=2,
order="ascending",
mutation_prob=0.0,
tolerance=1e-3,
seed=42,
)
train_cfg = TrainConfig(
epochs=100,
lr=1e-3,
batch_size=256,
print_every=20,
)
loss_cfg = LossConfig(
loss_mode="DFO", # "standard" | "dispersion" | "DFO"
lambda_disp=0.0,
lambda_global=0.0,
)
# -----------------------------
# Model: DynaTabClassifier
# -----------------------------
clf = DynaTabClassifier(
task="binary",
backbone="Transformer", # or "LSTM", "DAE", "Mamba", ...
embedding_dim=128,
dfo_cfg=dfo_cfg,
train_cfg=train_cfg,
loss_cfg=loss_cfg,
eval_metrics=["acc"],
device=None, # auto-selects CUDA/CPU
standardize=True, # train-only impute + standardize
)
clf.fit(X_train, y_train)
metrics = clf.score(X_test, y_test, metrics=["acc"])
print(f"Test Accuracy (Breast Cancer): {metrics['acc']:.4f}")
2. Multiclass Classification (Iris)
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from dynatab import (
DynaTabClassifier,
DFOConfig,
TrainConfig,
LossConfig,
)
data = load_iris()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target) # 3 classes: 0,1,2
X_train, X_test, y_train, y_test = train_test_split(
X, y,
test_size=0.2,
stratify=y,
random_state=42,
)
dfo_cfg = DFOConfig(
metric="variance",
num_clusters=3,
order="descending",
mutation_prob=0.1,
tolerance=1e-3,
seed=42,
)
train_cfg = TrainConfig(
epochs=80,
lr=1e-3,
batch_size=64,
print_every=20,
)
loss_cfg = LossConfig(
loss_mode="standard",
lambda_disp=0.0,
lambda_global=0.0,
)
clf = DynaTabClassifier(
task="multiclass",
num_classes=3,
backbone="Transformer",
embedding_dim=64,
dfo_cfg=dfo_cfg,
train_cfg=train_cfg,
loss_cfg=loss_cfg,
eval_metrics=["acc"],
device=None,
standardize=True,
)
clf.fit(X_train, y_train)
metrics = clf.score(X_test, y_test, metrics=["acc"])
print(f"Test Accuracy (Iris): {metrics['acc']:.4f}")
3. Regression (Diabetes)
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from dynatab import (
DynaTabRegressor,
DFOConfig,
TrainConfig,
LossConfig,
)
data = load_diabetes()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target)
X_train, X_test, y_train, y_test = train_test_split(
X, y,
test_size=0.2,
random_state=42,
)
dfo_cfg = DFOConfig(
metric="correlation",
num_clusters=3,
order="ascending",
mutation_prob=0.1,
tolerance=1e-3,
seed=42,
)
train_cfg = TrainConfig(
epochs=120,
lr=1e-3,
batch_size=128,
print_every=20,
)
loss_cfg = LossConfig(
loss_mode="standard", # for regression we typically keep it standard
lambda_disp=0.0,
lambda_global=0.0,
)
reg = DynaTabRegressor(
backbone="Transformer",
embedding_dim=64,
dfo_cfg=dfo_cfg,
train_cfg=train_cfg,
loss_cfg=loss_cfg,
eval_metrics=["r2"], # e.g., R^2
device=None,
standardize=True,
)
reg.fit(X_train, y_train)
metrics = reg.score(X_test, y_test, metrics=["r2"])
print(f"Test R² (Diabetes): {metrics['r2']:.4f}")
4. Advanced: 5-Fold CV + Optuna Hyperparameter Tuning
For full HDLSS experiments, repeated CV, and Optuna-based tuning (Transformer, LSTM, DAE, Mamba backbones) on real datasets such as AI-d_case5, ADNI_AD123, GLI-85, and others, see:
- `DynaTab_Experiment1.ipynb` – Binary & multiclass classification and regression (with / without Optuna-based hyperparameter tuning).
- `DynaTab_Experiment2.ipynb` – HDLSS case studies (e.g., GLI-85 with Mamba/LSTM backbones).
- `DynaTab Dataset Complexity Analysis.ipynb` and `DynaTab IDF Analyzer.ipynb` – Intrinsic dimensionality and “when to use feature ordering” analysis.

You can tweak the metrics / epochs / DFO settings if you want them lighter or closer to the paper defaults.
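The tuning loops in those notebooks follow a standard pattern: sample hyperparameters, score each candidate with stratified k-fold CV, and keep the best. The sketch below reproduces that pattern with a shrunken nearest-centroid classifier standing in for `DynaTabClassifier`, so it runs without the package or Optuna installed; the helper names and the `shrink` hyperparameter are illustrative stand-ins, not DynaTab API.

```python
# Sketch of the 5-fold CV + random-search pattern used in the notebooks.
# A shrunken nearest-centroid classifier stands in for DynaTabClassifier;
# in practice you would fit DynaTab (or wrap it in an Optuna objective)
# inside the inner loop instead.
import numpy as np

def stratified_kfold(y, n_splits=5, seed=42):
    """Yield (train_idx, val_idx); each class is spread round-robin across folds."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(n_splits)]
    for cls in np.unique(y):
        for i, j in enumerate(rng.permutation(np.flatnonzero(y == cls))):
            folds[i % n_splits].append(j)
    for k in range(n_splits):
        val = np.array(folds[k])
        train = np.concatenate([folds[m] for m in range(n_splits) if m != k])
        yield train, val

def centroid_acc(X_tr, y_tr, X_va, y_va, shrink):
    """Stand-in model: class centroids pulled toward the global mean by `shrink`."""
    mu = X_tr.mean(axis=0)
    classes = np.unique(y_tr)
    cents = np.stack([(1 - shrink) * X_tr[y_tr == c].mean(axis=0) + shrink * mu
                      for c in classes])
    d = ((X_va[:, None, :] - cents[None, :, :]) ** 2).sum(axis=-1)
    return float((classes[d.argmin(axis=1)] == y_va).mean())

rng = np.random.default_rng(0)
# two well-separated synthetic classes, 10 features each
X = np.vstack([rng.normal(0.0, 1.0, (60, 10)), rng.normal(3.0, 1.0, (60, 10))])
y = np.repeat([0, 1], 60)

best_shrink, best_score = None, -1.0
for shrink in rng.uniform(0.0, 0.9, size=8):          # random search over one knob
    scores = [centroid_acc(X[tr], y[tr], X[va], y[va], shrink)
              for tr, va in stratified_kfold(y)]
    mean_score = float(np.mean(scores))
    if mean_score > best_score:
        best_shrink, best_score = shrink, mean_score
print(f"best shrink={best_shrink:.3f}, CV accuracy={best_score:.3f}")
```

An Optuna version replaces the `for shrink in ...` loop with `study.optimize(objective, n_trials=...)`, where the objective returns the mean CV score for the trial's sampled hyperparameters.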
Previous Work: TabSeq
DynaTab builds on our earlier work on feature ordering for tabular data:
- TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering
GitHub: https://github.com/zadid6pretam/TabSeq
Springer (ICPR 2024 proceedings): https://link.springer.com/chapter/10.1007/978-3-031-78128-5_27
If you are interested in:
- MHA-DAE-guided sequential tabular models,
- Cluster-guided feature ordering, and
- Baseline comparison to classical ML and other deep models,
please also refer to the TabSeq repository and its accompanying paper as the foundational precursor to DynaTab.
File details
Details for the file dynatab-0.1.0.tar.gz.
File metadata
- Download URL: dynatab-0.1.0.tar.gz
- Upload date:
- Size: 41.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `d38b27d65ac01c7f64272be65bdcaba3fbd23052cf21c196595ce173f45e81b2` |
| MD5 | `a1fee03cd40942247631a60e6aca1e28` |
| BLAKE2b-256 | `45aa66c61278f0f9528e5d7a794821543dcdd6417bb59e57160f46418eed46c7` |
File details
Details for the file dynatab-0.1.0-py3-none-any.whl.
File metadata
- Download URL: dynatab-0.1.0-py3-none-any.whl
- Upload date:
- Size: 49.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5c3456ce4f603c3a45a85539f40a583288cd0f4f30ab5dfca786d64708200f0d` |
| MD5 | `04f33378b9fc304cc7b57511e50e9a71` |
| BLAKE2b-256 | `09738a2c99cb5d129bda3259ce0d0f310df31c0a7d9e56088892cec281e93c80` |