Neba

Manage parameters and datasets


  • Obtain your parameters from configuration files or command line arguments. Validate them against a structured specification that is easy to write, extensible, and lets you document every parameter.
  • Declare datasets in a flexible way to manage multiple source files and to read and write data easily using different libraries.

Configuration

The configuration framework is:

  • strict: parameters are defined beforehand. Any unknown or invalid parameter will raise errors
  • structured: parameters can be organized in (nested) sections
  • documented: docstrings of parameters are re-used in configuration files, command line help, and static documentation via a plugin for Sphinx

Parameter values can be retrieved from configuration files (TOML, YAML, Python files, JSON) and from the command line.

The framework is based on the existing traitlets library. It allows type-checking, arbitrary value validation and "on-change" callbacks. This package extends it to allow nesting. The objects containing parameters are significantly extended to ease manipulation.
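As a rough illustration of the three traitlets features mentioned above, here is a minimal sketch that uses plain traitlets only, without Neba. The `Model` class, the accepted year range, and the `history` attribute are hypothetical and chosen for the example.

```python
from traitlets import HasTraits, Int, TraitError, observe, validate


class Model(HasTraits):
    # type-checking: only integers are accepted
    year = Int(2000, help="Year to process")

    def __init__(self, **kwargs):
        # hypothetical log of changes, filled by the callback below
        self.history = []
        super().__init__(**kwargs)

    @validate("year")
    def _check_year(self, proposal):
        # arbitrary value validation (the range is an assumption)
        value = proposal["value"]
        if not 1900 <= value <= 2100:
            raise TraitError(f"year out of range: {value}")
        return value

    @observe("year")
    def _year_changed(self, change):
        # "on-change" callback: record the old and new values
        self.history.append((change["old"], change["new"]))


m = Model()
m.year = 2010  # valid: the callback is triggered

errors = []
for bad_value in ("2010", 3000):  # wrong type, then out of range
    try:
        m.year = bad_value
    except TraitError as err:
        errors.append(err)
```

Both invalid assignments raise `TraitError` and leave `m.year` at its last valid value.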

Here is a simple example project:

from neba.config import ApplicationBase, Section
from traitlets import Float, List, Int, Unicode

class App(ApplicationBase):
    # our parameters
    result_dir = Unicode("/data/results", help="Directory containing results") 

    # a nested section called "model"
    class model(Section):
        year = Int(2000)
        coefficients = List(Float(), [0.5, 1.5, 10.0])

app = App()
print(app.model.year)

Parameters from the example above could be retrieved from the command line with --result_dir "./some_dir" --model.coefficients 0 2.5 10. The application can generate a configuration file, for instance in TOML:

# result_dir = "/data/results"
# ----------
# result_dir (Unicode) default: "/data/results"
# Directory containing results

[model]

# coefficients = [0.5, 1.5, 10.0]
# ------------
# model.coefficients (List[Float]) default: [0.5, 1.5, 10.0]

# year = 2000
# ----
# model.year (Int) default: 2000
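The dotted names used on the command line and in the comments above (`model.year`, `model.coefficients`) mirror the nesting of sections. The following sketch shows one way flat dotted keys could be folded into a nested mapping. The `nest` helper is hypothetical and is not Neba's implementation; it only illustrates the naming convention.

```python
def nest(flat):
    """Fold a flat {"section.name": value} mapping into nested dicts."""
    nested = {}
    for dotted, value in flat.items():
        # every part but the last names a (sub)section
        *sections, name = dotted.split(".")
        node = nested
        for section in sections:
            node = node.setdefault(section, {})
        node[name] = value
    return nested


# values as they might be given on the command line
flat = {"result_dir": "./some_dir", "model.coefficients": [0.0, 2.5, 10.0]}
config = nest(flat)
print(config)
```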

Dataset management

The second part aims to ease the creation and management of datasets with different file formats, structures, etc. that can all depend on various parameters.

Each new dataset is specified by creating a new subclass. These classes are kept as generic as possible through a system of modules: each module covers a specific feature, and its implementation can be swapped from one dataset to another. For instance, one dataset can deal with multiple source files selected via glob patterns and loaded into Pandas, while another could have a remote data store as input, loaded into Xarray.

Here is an example of a dataset whose source files are matched by a glob pattern and loaded into Xarray:

from neba.data import Dataset, GlobSource, ParamsManagerDict
from neba.data.xarray import XarrayLoader

class SST(Dataset):
    # parameters will be held in a simple dict
    Params = ParamsManagerDict
    # loader module uses xarray.open_mfdataset
    Loader = XarrayLoader
    
    # source files are retrieved from disk using glob
    class Source(GlobSource):
        def get_root_directory(self):
            # we use the parameters of the Dataset instance
            root = self.params["data_dir"]
            # this will automatically be joined into a path
            return [root, "SST"]
            
        def get_filename_pattern(self):
            return f"{self.params['year']}/SST_*.nc*"
            
ds = SST(year=2000, data_dir="/data")
sst = ds.get_data()

We used the parameters and loader module as is, but we configured the source module for our needs. Most modules will use methods like this to take advantage of the parameters contained in the dataset.
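To make the role of the two source methods more concrete, here is a sketch using only the standard library of how a root directory and a filename pattern might be combined to list files. The `find_files` function is hypothetical and does not reproduce the internals of `GlobSource`; the temporary files only stand in for real data.

```python
import glob
import os
import tempfile
from pathlib import Path


def find_files(root_parts, pattern):
    """Join the root directory parts, then glob the pattern below it."""
    root = os.path.join(*root_parts)
    return sorted(glob.glob(os.path.join(root, pattern)))


# build a throwaway tree resembling <data_dir>/SST/<year>/SST_*.nc
with tempfile.TemporaryDirectory() as data_dir:
    folder = Path(data_dir, "SST", "2000")
    folder.mkdir(parents=True)
    for name in ("SST_01.nc", "SST_02.nc", "notes.txt"):
        (folder / name).touch()

    params = {"data_dir": data_dir, "year": 2000}
    files = find_files(
        [params["data_dir"], "SST"], f"{params['year']}/SST_*.nc*"
    )
    names = [os.path.basename(f) for f in files]

print(names)
```

Only the files matching the pattern are kept; `notes.txt` is ignored.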

Documentation

https://neba.readthedocs.io/en/latest/

Requirements

Installation

🚧 Soon on PyPI 🚧

From source:

git clone https://github.com/Descanonge/neba
cd neba
pip install -e .

or

pip install -e git+https://github.com/Descanonge/neba.git#egg=neba
