Data Curation in Polaris

Auroris

Tools for data curation in the Polaris ecosystem.

Getting started

from auroris.curation import Curator
from auroris.curation.actions import MoleculeCuration, OutlierDetection, Discretization

# Define the curation workflow
curator = Curator(
    steps=[
        MoleculeCuration(input_column="smiles"),
        OutlierDetection(method="zscore", columns=["SOL"]),
        Discretization(input_column="SOL", thresholds=[-3]),
    ],
    parallelized_kwargs={"n_jobs": -1},
)

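# Run the curation.
# `dataset` is not defined in this snippet: it is assumed to be a table you have
# already loaded (e.g. a pandas DataFrame) with "smiles" and "SOL" columns.
dataset, report = curator(dataset)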
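
To make the two numeric steps above more concrete, here is a minimal standard-library sketch of what z-score outlier flagging and threshold-based discretization generally mean. It is not auroris's implementation: the function names `zscore_outliers` and `discretize`, the cutoff of 3.0, the sample values, and the choice that a value equal to a threshold falls in the upper bin are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds `threshold` (assumed cutoff)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values identical: nothing can be an outlier.
        return [False] * len(values)
    return [abs((v - mu) / sigma) > threshold for v in values]


def discretize(values, thresholds):
    """Return, for each value, how many thresholds it meets or exceeds (its bin index)."""
    cuts = sorted(thresholds)
    return [sum(v >= t for t in cuts) for v in values]


# Hypothetical solubility-like values; the last one is far from the rest.
sol = [-2.1, -2.4, -1.9, -2.2, -2.0, -2.3, -2.5, -1.8, -2.1, -2.2, -2.0, -9.0]

flags = zscore_outliers(sol, threshold=3.0)
labels = discretize(sol, thresholds=[-3])
```

With a single threshold of -3, `discretize` produces a binary label: 1 for values at or above -3 and 0 for values below it. How auroris orders its labels or treats flagged rows may differ, so check the documentation before relying on this behaviour.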

Run curation from the command line

A Curator object is serializable, so you can save it to and load it from a JSON file to reproduce the curation.

auroris [config_file] [destination] --dataset-path [data_path]
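
The idea behind saving a workflow to JSON can be sketched with the standard library alone. The dictionary layout below is hypothetical and is not the schema auroris writes; `save_workflow`, `load_workflow`, and the file name `curation.json` are likewise illustrative names. The point is only that a workflow described as plain data survives a round trip through a file unchanged, which is what makes a curation run reproducible.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical description of a workflow; NOT the schema auroris writes.
workflow = {
    "steps": [
        {"name": "MoleculeCuration", "input_column": "smiles"},
        {"name": "OutlierDetection", "method": "zscore", "columns": ["SOL"]},
        {"name": "Discretization", "input_column": "SOL", "thresholds": [-3]},
    ]
}


def save_workflow(config, path):
    """Write the workflow description to `path` as JSON."""
    Path(path).write_text(json.dumps(config, indent=2, sort_keys=True))


def load_workflow(path):
    """Read a workflow description back from a JSON file."""
    return json.loads(Path(path).read_text())


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "curation.json"
    save_workflow(workflow, config_path)
    restored = load_workflow(config_path)

# The restored description is identical to the one that was saved.
round_trip_ok = restored == workflow
```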

Documentation

Please refer to the documentation, which contains tutorials for getting started with auroris and detailed descriptions of the functions provided.

Installation

You can install auroris using conda/mamba/micromamba:

conda install -c conda-forge auroris

You can also use pip:

pip install auroris

Development lifecycle

Setup dev environment

conda env create -n auroris -f env.yml
conda activate auroris

pip install --no-deps -e .

Other installation options

Alternatively, using [uv](https://github.com/astral-sh/uv):
```shell
# `uv venv <path>` creates the environment at <path>, here ./auroris
uv venv -p 3.12 auroris
source auroris/bin/activate
uv pip compile pyproject.toml -o requirements.txt --all-extras
uv pip install -r requirements.txt
```

Tests

You can run tests locally with:

pytest

License

Released under the Apache-2.0 license. See LICENSE.
