Data Curation in Polaris

Auroris

Tools for data curation in the Polaris ecosystem.

Getting started

from auroris.curation import Curator
from auroris.curation.actions import MoleculeCuration, OutlierDetection, Discretization

# Define the curation workflow
curator = Curator(
    steps=[
        MoleculeCuration(input_column="smiles"),
        OutlierDetection(method="zscore", columns=["SOL"]),
        Discretization(input_column="SOL", thresholds=[-3]),
    ],
    parallelized_kwargs={"n_jobs": -1},
)

# Run the curation; `dataset` is your input data, which must contain the "smiles" and "SOL" columns
dataset, report = curator(dataset)
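
To make the last two steps concrete, here is a minimal standard-library sketch of what z-score outlier flagging and threshold-based discretization mean in general. It is not auroris's implementation: the helper names `zscore_outliers` and `discretize`, the cutoff value, and the convention that a value equal to a threshold falls into the higher class are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def zscore_outliers(values, cutoff=3.0):
    """Return one flag per value: True if its |z-score| exceeds `cutoff`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * len(values)
    return [abs((v - mu) / sigma) > cutoff for v in values]


def discretize(values, thresholds):
    """Map each value to the index of the bin delimited by the sorted thresholds."""
    cuts = sorted(thresholds)
    # bisect_right places a value equal to a threshold in the higher bin (assumption).
    return [bisect_right(cuts, v) for v in values]


# Hypothetical solubility-like values; the last one is deliberately extreme.
sol = [-3.0, -3.1, -2.9, -3.2, -2.8, -3.0, -3.1, -2.9, 4.0]

flags = zscore_outliers(sol, cutoff=2.5)
classes = discretize(sol, thresholds=[-3])
```

With a single threshold the result is a binary label, which is why one cut point such as `-3` is enough to turn a continuous column into a classification target.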

Run curation from the command line

A Curator object is serializable, so you can save it to and load it from a JSON file to reproduce the curation.

auroris [config_file] [destination] --dataset-path [data_path]
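
Why a serialized workflow makes curation reproducible can be illustrated with a minimal standard-library sketch. The JSON layout and the names `workflow`, `save_workflow`, and `load_workflow` are hypothetical: they are not the auroris file format or API. The sketch only shows that a workflow written to disk reloads with identical step settings.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical description of a workflow; NOT the auroris JSON schema.
workflow = {
    "steps": [
        {"name": "MoleculeCuration", "input_column": "smiles"},
        {"name": "OutlierDetection", "method": "zscore", "columns": ["SOL"]},
        {"name": "Discretization", "input_column": "SOL", "thresholds": [-3]},
    ]
}


def save_workflow(config, path):
    """Write the workflow description to a JSON file."""
    Path(path).write_text(json.dumps(config, indent=2))


def load_workflow(path):
    """Read a workflow description back from a JSON file."""
    return json.loads(Path(path).read_text())


with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "curator.json"
    save_workflow(workflow, config_file)
    restored = load_workflow(config_file)
```

Because the reloaded settings are equal to the saved ones, anyone holding the file can rerun the same steps with the same parameters.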

Documentation

Please refer to the documentation, which contains tutorials for getting started with auroris and detailed descriptions of the functions provided.

Installation

You can install auroris using conda/mamba/micromamba:

conda install -c conda-forge auroris

You can also use pip:

pip install auroris

Development lifecycle

Setup dev environment

conda env create -n auroris -f env.yml
conda activate auroris

pip install --no-deps -e .

Other installation options

Alternatively, using [uv](https://github.com/astral-sh/uv):
```shell
# Create a Python 3.12 virtual environment in ./auroris and activate it
uv venv -p 3.12 auroris
source auroris/bin/activate
# Resolve all dependencies (including extras) and install them
uv pip compile pyproject.toml -o requirements.txt --all-extras
uv pip install -r requirements.txt
```

Tests

You can run tests locally with:

pytest

License

Released under the Apache-2.0 license. See LICENSE.
