Feature Ordering Module from TabSeq (ICPR 2024)

TabSeq Feature Ordering

This module extracts and packages the feature ordering algorithm used in TabSeq (ICPR 2024) as a standalone utility, enabling integration into any tabular deep learning pipeline.

Key Features

  • Variance-based intra-cluster ordering
  • KMeans clustering for feature grouping
  • Weighted global ordering from local cluster orders
  • Minimal dependencies, flexible integration
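
The features above suggest a pipeline of grouping, local ordering, and global merging. The following is only a rough sketch of the last two stages, not the package's implementation. It assumes cluster labels are already available (the package obtains them with KMeans), and it visits clusters by their mean variance as a stand-in for the weighted global ordering, whose exact weighting is not described here. The names `order_within_clusters`, `columns`, and `labels` are hypothetical.

```python
from statistics import mean, pvariance


def order_within_clusters(columns, labels, sort_order="descending"):
    """Illustrative only: sort features by variance inside each cluster.

    columns: dict mapping feature name -> list of numeric values
    labels:  dict mapping feature name -> cluster id (assumed precomputed)
    """
    if sort_order not in ("ascending", "descending"):
        raise ValueError("sort_order must be 'ascending' or 'descending'")
    reverse = sort_order == "descending"

    # Population variance of every feature.
    variances = {name: pvariance(values) for name, values in columns.items()}

    # Group feature names by cluster id.
    clusters = {}
    for name, cluster_id in labels.items():
        clusters.setdefault(cluster_id, []).append(name)

    # Local order: sort each cluster's features by variance.
    local_orders = {
        cid: sorted(names, key=variances.get, reverse=reverse)
        for cid, names in clusters.items()
    }

    # Assumed global rule: visit clusters by mean variance, same direction.
    cluster_rank = sorted(
        local_orders,
        key=lambda cid: mean(variances[n] for n in local_orders[cid]),
        reverse=reverse,
    )
    return [name for cid in cluster_rank for name in local_orders[cid]]


columns = {
    "a": [1.0, 1.0, 1.0],   # variance 0
    "b": [0.0, 3.0, 6.0],   # variance 6
    "c": [0.0, 1.0, 2.0],   # variance 2/3
    "d": [0.0, 9.0, 18.0],  # variance 54
}
labels = {"a": 0, "b": 0, "c": 1, "d": 1}
ordering = order_within_clusters(columns, labels, "descending")
print(ordering)  # ['d', 'c', 'b', 'a']
```

With `sort_order="ascending"` the same toy data comes back as `['a', 'b', 'c', 'd']`. The cluster-ranking rule is the part most likely to differ from the real module, so treat it as a placeholder.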

Installation

pip install tabseq-feature-ordering

Usage

from tabseq_feature_ordering import reorder_features

# Inputs
X_train = ...  # pandas DataFrame of shape (n_samples, n_features)
cluster_size = 5
sort_order = 'descending'  # or 'ascending'

# Output
global_ordering, X_train_reordered = reorder_features(X_train, cluster_size, sort_order)

Parameters

  • X_train: Tabular training data as pd.DataFrame
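  • cluster_size: Number of clusters to group the features into (e.g., 5). Despite the name, this is a cluster count, not the number of features per cluster.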
  • sort_order: Intra-cluster sorting order by variance ('ascending' or 'descending')

Output

  • global_ordering: List of column names in their new order
  • X_train_reordered: DataFrame with reordered columns
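
Because `global_ordering` is a plain list of column names, the order learned on the training data can presumably be reused on held-out data with ordinary pandas column selection. A minimal sketch, in which `global_ordering` is a hand-written stand-in for the value returned by `reorder_features` and `X_test` is hypothetical:

```python
import pandas as pd

# Stand-in for the list returned by reorder_features on the training data.
global_ordering = ["F2", "F0", "F1"]

# Hypothetical held-out data with the same columns in their original order.
X_test = pd.DataFrame({"F0": [1, 2], "F1": [3, 4], "F2": [5, 6]})

# Fail early if the ordering and the test columns do not match.
if set(global_ordering) != set(X_test.columns):
    raise ValueError("test data columns do not match the training ordering")

# Selecting with a list of labels returns the columns in that order.
X_test_reordered = X_test[global_ordering]
print(list(X_test_reordered.columns))  # ['F2', 'F0', 'F1']
```

Deriving the order from the training split only and reapplying it elsewhere keeps the column layout identical across splits.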

Example

import pandas as pd
import numpy as np
from tabseq_feature_ordering import reorder_features

# Example input: 40 samples, 80 random features named F0..F79
X = pd.DataFrame(np.random.rand(40, 80), columns=[f"F{i}" for i in range(80)])

# Run feature ordering
order, X_reordered = reorder_features(X, cluster_size=5, sort_order='descending')

print(order[:10])  # First 10 features in the new order

License

MIT License © 2024 Zadid Habib

Citation

If you use this module, please cite our paper:

Habib, Al Zadid Sultan Bin, Kesheng Wang, Mary-Anne Hartley, Gianfranco Doretto, and Donald A. Adjeroh. "TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering." In International Conference on Pattern Recognition, pp. 418-434. Cham: Springer Nature Switzerland, 2024.


BibTeX

@inproceedings{habib2024tabseq,
  title={TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering},
  author={Habib, Al Zadid Sultan Bin and Wang, Kesheng and Hartley, Mary-Anne and Doretto, Gianfranco and Adjeroh, Donald A.},
  booktitle={International Conference on Pattern Recognition},
  pages={418--434},
  year={2024},
  organization={Springer}
}
