TabularBench: Adversarial robustness benchmark for tabular data
Leaderboard: https://serval-uni-lu.github.io/tabularbench/
Research papers:
- Benchmark: TabularBench: Benchmarking Adversarial Robustness for Tabular Deep Learning in Real-world Use-cases
- CAPGD and CAA attacks: Constrained Adaptive Attack: Effective Adversarial Attack Against Deep Neural Networks for Tabular Data
- MOEVA attack: A Unified Framework for Adversarial Attack and Defense in Constrained Feature Space
Installation
Using Docker (recommended)
- Clone the repository
- Build the Docker image
  ./tasks/docker_build.sh
- Run the Docker container
  ./tasks/run_benchmark.sh
Note: The ./tasks/run_benchmark.sh script mounts the current directory to the /workspace directory in the Docker container. This allows you to edit the code on your host machine and run it in the Docker container without rebuilding.
With Pyenv and Poetry
- Clone the repository
- Create a virtual environment using Pyenv with Python 3.8.10.
- Install the dependencies using Poetry.
  poetry install
Using conda
- Clone the repository
- Create a virtual environment using Conda with Python 3.8.10.
  conda create -n tabularbench python=3.8.10
- Activate the conda environment.
  conda activate tabularbench
- Install the dependencies using Pip.
  pip install -r requirements.txt
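Both the Pyenv and Conda setups pin Python 3.8.10, so it can help to confirm the active interpreter before installing dependencies. The following is a minimal sketch; the helper name `check_python` is hypothetical and not part of TabularBench, and requiring an exact match on the patch version is an assumption.

```python
import sys

# Version named in the installation steps.
REQUIRED = (3, 8, 10)


def check_python(version=None, required=REQUIRED):
    """Return True when the first three version fields equal `required`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:3]) == tuple(required)
```

Calling `check_python()` inside the activated environment returns `True` only when the interpreter is exactly 3.8.10.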
How to use
Run the benchmark
You can run the benchmark with the following command:
python -m tasks.run_benchmark
or with Docker:
docker_run_benchmark
Using the API
You can also use the API to run the benchmark. See tasks/run_benchmark.py for an example.
clean_acc, robust_acc = benchmark(
    dataset="URL",
    model="STG_Default",
    distance="L2",
    constraints=True,
)
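To compare several configurations, the call above can be wrapped in a loop. The sketch below assumes only the keyword arguments shown in that call (dataset, model, distance, constraints) and that the function returns a (clean_acc, robust_acc) pair. The helper `run_grid` is hypothetical, and the `benchmark` callable is passed in as an argument rather than imported, because its import path is not stated here.

```python
from itertools import product


def run_grid(benchmark_fn, datasets, models, distances, constraints=True):
    """Call `benchmark_fn` for every (dataset, model, distance) combination.

    `benchmark_fn` is assumed to accept the keyword arguments used below
    and to return a (clean_acc, robust_acc) pair.
    """
    rows = []
    for dataset, model, distance in product(datasets, models, distances):
        clean_acc, robust_acc = benchmark_fn(
            dataset=dataset,
            model=model,
            distance=distance,
            constraints=constraints,
        )
        rows.append(
            {
                "dataset": dataset,
                "model": model,
                "distance": distance,
                "clean_acc": clean_acc,
                "robust_acc": robust_acc,
            }
        )
    return rows
```

With the real function in scope, `run_grid(benchmark, ["URL"], ["STG_Default"], ["L2"])` would return a one-row list for the configuration shown above.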
Retrain the models
We provide the models and parameters used in the paper. You can retrain the models with the following command:
python -m tasks.train_model
Edit the tasks/train_model.py file to change the model, dataset, and training method.
Data availability
Datasets, pretrained models, and synthetic data are publicly available here. Mirror the folder structure of the shared folder locally so that the code runs correctly.
- Datasets: Datasets are downloaded automatically to data/datasets when used.
- Models: Pretrained models are available in data/models.
- Model parameters: Optimal parameters (from the hyperparameter search) are required to train models and are in data/model_parameters.
- Synthetic data: The synthetic data generated by GANs is available in data/synthetic.
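Because the code expects the local layout to match the shared folder, a quick check of the four sub-folders listed above can catch a misplaced download early. This is a minimal sketch, not part of TabularBench: the helper `missing_data_dirs` is hypothetical, and it assumes the paths are relative to the repository root.

```python
from pathlib import Path

# Sub-folders named in the list above, assumed relative to the repository root.
EXPECTED_DIRS = (
    "data/datasets",
    "data/models",
    "data/model_parameters",
    "data/synthetic",
)


def missing_data_dirs(root="."):
    """Return the expected sub-folders that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

An empty list means all four folders are present. Note that data/datasets may legitimately be missing before the first run, since datasets are downloaded when used.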
Naming
For technical reasons, the names of datasets, models, and training methods differ from those used in the paper. The mapping can be found in docs/naming.md.