
pytorch-bsf


A PyTorch implementation of Bezier simplex fitting.

The Bezier simplex is a high-dimensional generalization of the Bezier curve. It enables us to model a complex-shaped point cloud as a parametric hyper-surface in high-dimensional spaces. This package provides an algorithm to fit a Bezier simplex to given data points. To process terabyte-scale data, this package supports distributed training, real-time progress reporting, and checkpointing on top of PyTorch Lightning and MLflow.
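Concretely, a Bezier simplex of degree D with barycentric parameter t = (t_1, ..., t_n) is b(t) = Σ_{|d|=D} (D choose d) Π_i t_i^{d_i} p_d, where the p_d are control points indexed by multi-indices d with |d| = D. The following is a minimal pure-Python sketch of this definition; it illustrates the math only and is not the API of pytorch-bsf:

```python
# A minimal, pure-Python sketch of the Bezier simplex definition
# b(t) = sum_{|d| = D} multinomial(D; d) * prod_i t_i^{d_i} * p_d.
# This illustrates the math only; it is not the API of pytorch-bsf.
from itertools import product
from math import comb, prod

def multi_indices(degree, n):
    """All multi-indices d = (d_1, ..., d_n) with d_1 + ... + d_n = degree."""
    return [d for d in product(range(degree + 1), repeat=n) if sum(d) == degree]

def multinomial(degree, d):
    """Multinomial coefficient degree! / (d_1! * ... * d_n!)."""
    c, rest = 1, degree
    for k in d:
        c *= comb(rest, k)
        rest -= k
    return c

def bezier_simplex(t, control_points, degree):
    """Evaluate b(t) for control_points: dict mapping multi-index -> point."""
    dim = len(next(iter(control_points.values())))
    out = [0.0] * dim
    for d in multi_indices(degree, len(t)):
        w = multinomial(degree, d) * prod(ti ** di for ti, di in zip(t, d))
        for j in range(dim):
            out[j] += w * control_points[d][j]
    return out

# Degree 1 with two corners reduces to linear interpolation:
# bezier_simplex((0.25, 0.75), {(1, 0): [0.0], (0, 1): [1.0]}, 1) == [0.75]
```

Note that a degree-3 simplex in 3 barycentric coordinates has C(3+2, 2) = 10 control points, which matches the 10 parameter rows used in the fitting script later on this page.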

See the following papers for technical details.

  • Kobayashi, K., Hamada, N., Sannai, A., Tanaka, A., Bannai, K., & Sugiyama, M. (2019). Bézier Simplex Fitting: Describing Pareto Fronts of Simplicial Problems with Small Samples in Multi-Objective Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 2304-2313. https://doi.org/10.1609/aaai.v33i01.33012304
  • Tanaka, A., Sannai, A., Kobayashi, K., & Hamada, N. (2020). Asymptotic Risk of Bézier Simplex Fitting. Proceedings of the AAAI Conference on Artificial Intelligence, 34(03), 2416-2424. https://doi.org/10.1609/aaai.v34i03.5622

Requirements

Python >=3.8, <3.12.

Quickstart

Download and install the latest Miniconda, then install MLflow in your conda environment:

conda install -c conda-forge mlflow

Prepare data:

cat <<EOS > params.tsv
1 0
0.75 0.25
0.5 0.5
0.25 0.75
0 1
EOS
cat <<EOS > values.tsv
0 1
3 2
4 5
7 6
8 9
EOS
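Each row of params.tsv is a barycentric coordinate on the simplex (nonnegative entries summing to 1), and the i-th row of values.tsv holds the corresponding observed values. A quick well-formedness check of the example data, as a standalone sketch (not part of pytorch-bsf):

```python
# A quick well-formedness check (not part of pytorch-bsf) for the example
# data above: every params row must be a barycentric coordinate
# (nonnegative entries summing to 1), and params/values must have the
# same number of records.
params_tsv = """\
1 0
0.75 0.25
0.5 0.5
0.25 0.75
0 1"""
values_tsv = """\
0 1
3 2
4 5
7 6
8 9"""

params = [[float(x) for x in row.split()] for row in params_tsv.splitlines()]
values = [[float(x) for x in row.split()] for row in values_tsv.splitlines()]

assert len(params) == len(values) == 5
assert all(abs(sum(row) - 1.0) < 1e-9 and min(row) >= 0.0 for row in params)
```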

Run the following command:

mlflow run https://github.com/rafcc/pytorch-bsf \
  -P params=params.tsv \
  -P values=values.tsv \
  -P degree=3

which automatically sets up the environment and runs an experiment:

  1. Download the latest pytorch-bsf into a temporary directory.
  2. Create a new conda environment and install dependencies in it.
  3. Run an experiment in the temporary directory and environment.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `params` | path | required | The parameter data file. The file should contain a numerical matrix in TSV format: each row represents a record whose features are separated by tabs or spaces. |
| `values` | path | required | The value data file. The file should contain a numerical matrix in TSV format: each row represents a record whose outcomes are separated by tabs or spaces. |
| `init` | path | None | Load initial control points from a file. The file must be pickled PyTorch (`.pt`), CSV (`.csv`), TSV (`.tsv`), JSON (`.json`), or YAML (`.yml` or `.yaml`). Either this option or `degree` must be specified. |
| `degree` | int ($x \ge 1$) | None | Generate initial control points at random with the specified degree. Either this option or `init` must be specified. |
| `skeleton` | list[list[int]] | None | Train only the specified control points. By default, all control points are trained. |
| `header` | int ($x \ge 0$) | 0 | The number of header lines in the params/values files. |
| `delimiter` | str | `" "` | The delimiter of values in the params/values files. |
| `normalize` | `"max"`, `"std"`, `"quantile"` | None | Data normalization: `"max"` scales each feature so that the minimum is 0 and the maximum is 1, suitable for uniformly distributed data; `"std"` scales so that the mean is 0 and the standard deviation is 1, suitable for nonuniformly distributed data; `"quantile"` scales so that the 5th percentile is 0 and the 95th percentile is 1, suitable for data containing outliers; None performs no scaling, suitable for pre-normalized data. |
| `split_ratio` | float ($0 < x < 1$) | 0.5 | The ratio of training data to validation data. |
| `batch_size` | int ($x \ge 0$) | 0 | The minibatch size. The default uses all records in a single batch. |
| `max_epochs` | int ($x \ge 1$) | 1000 | The number of epochs after which training stops. |
| `accelerator` | `"auto"`, `"cpu"`, `"gpu"`, etc. | `"auto"` | The accelerator to use. See the PyTorch Lightning documentation. |
| `strategy` | `"auto"`, `"dp"`, `"ddp"`, etc. | `"auto"` | The distributed strategy. See the PyTorch Lightning documentation. |
| `devices` | int ($x \ge -1$) | `"auto"` | The number of accelerators to use. By default, all available devices are used. See the PyTorch Lightning documentation. |
| `num_nodes` | int ($x \ge 1$) | 1 | The number of compute nodes to use. See the PyTorch Lightning documentation. |
| `precision` | `"64"`, `"32"`, `"16"`, `"bf16"` | `"32"` | The floating-point precision. |
| `loglevel` | int ($0 \le x \le 2$) | 2 | What to log. 0: nothing; 1: metrics; 2: metrics and models. |
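The three `normalize` modes can be illustrated for a single feature column as follows. This is a rough sketch of the descriptions above; the actual pytorch-bsf implementation may differ in detail:

```python
# Rough single-column illustrations of the normalize modes described in the
# parameter table; the actual pytorch-bsf implementation may differ.
import statistics

def normalize_max(xs):
    """'max': scale so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def normalize_std(xs):
    """'std': scale so the mean is 0 and the standard deviation is 1."""
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]

def normalize_quantile(xs, lo_q=0.05, hi_q=0.95):
    """'quantile': scale so the 5th percentile maps to 0 and the 95th to 1."""
    s = sorted(xs)
    lo = s[int(lo_q * (len(s) - 1))]
    hi = s[int(hi_q * (len(s) - 1))]
    return [(x - lo) / (hi - lo) for x in xs]
```

Unlike `"max"` and `"std"`, the `"quantile"` mode ignores the most extreme values, so a few outliers cannot compress the rest of the feature into a narrow range.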

Installation

pip install pytorch-bsf

Fitting via CLI

This package provides a command line interface to train a Bezier simplex with a dataset file.

Execute the torch_bsf module:

python -m torch_bsf \
  --params=params.tsv \
  --values=values.tsv \
  --degree=3

Fitting via Script

Train a model with fit(), then call the trained model to predict.

import torch
import torch_bsf

# Prepare training data
ts = torch.tensor(  # parameters on a simplex
    [
        [3/3, 0/3, 0/3],
        [2/3, 1/3, 0/3],
        [2/3, 0/3, 1/3],
        [1/3, 2/3, 0/3],
        [1/3, 1/3, 1/3],
        [1/3, 0/3, 2/3],
        [0/3, 3/3, 0/3],
        [0/3, 2/3, 1/3],
        [0/3, 1/3, 2/3],
        [0/3, 0/3, 3/3],
    ]
)
xs = 1 - ts * ts  # values corresponding to the parameters

# Train a model
bs = torch_bsf.fit(params=ts, values=xs, degree=3)

# Predict by the trained model
t = torch.tensor([[0.2, 0.3, 0.5]])
x = bs(t)
print(f"{t} -> {x}")

Documents

See the documentation for more details: https://rafcc.github.io/pytorch-bsf/

Author

RIKEN AIP-FUJITSU Collaboration Center (RAFCC)

License

MIT
