Neba

Manage parameters and datasets

  • Obtain your parameters from configuration files or command line arguments. Validate them against a structured specification that is easy to write, extensible, and lets you document every parameter.
  • Declare datasets in a flexible way to manage multiple source files, and read and write data easily using different libraries.

Configuration

The configuration framework is:

  • strict: parameters are defined beforehand. Any unknown or invalid parameter will raise errors
  • structured: parameters can be organized in (nested) sections
  • documented: docstrings of parameters are re-used in configuration files, command line help, and static documentation via a plugin for Sphinx

Parameter values can be retrieved from configuration files (TOML, YAML, Python files, JSON) and from the command line.

The framework is based on the existing traitlets library. It allows type-checking, arbitrary value validation and "on-change" callbacks. This package extends it to allow nesting. The objects containing parameters are significantly extended to ease manipulation.
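The three traitlets features mentioned above can be sketched with plain traitlets, without Neba. This is only an illustration: the `Model` class, its `threshold` parameter, and the `changes` list are hypothetical names, not part of Neba.

```python
from traitlets import Float, HasTraits, TraitError, observe, validate


class Model(HasTraits):
    # Hypothetical parameter, used only for this illustration.
    threshold = Float(0.5, help="Must stay between 0 and 1.")

    def __init__(self, **kwargs):
        # Plain attribute recording the values seen by the callback.
        self.changes = []
        super().__init__(**kwargs)

    @validate("threshold")
    def _check_threshold(self, proposal):
        # Arbitrary value validation: runs after the type check.
        value = proposal["value"]
        if not 0.0 <= value <= 1.0:
            raise TraitError(f"threshold must be in [0, 1], got {value}")
        return value

    @observe("threshold")
    def _on_threshold_change(self, change):
        # "On-change" callback: called once a new value has been accepted.
        self.changes.append(change["new"])


model = Model()
model.threshold = 0.8  # accepted, the callback records the new value

for bad in ("high", 2.0):  # wrong type, then out-of-range value
    try:
        model.threshold = bad
    except TraitError as err:
        print(f"rejected {bad!r}: {err}")

print(model.threshold, model.changes)
```

Rejected assignments leave the previous value in place, which is what makes a strict configuration safe to rely on.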

Here is a simple example project:

from neba.config import Application, Section
from traitlets import Enum, Float, List, Unicode

class App(Application):
    """The application will retrieve and store parameters."""

    result_dir = Unicode("/data/results", help="Directory containing results") 

    class model(Section):
        """A nested section."""
        coefficients = List(Float(), [0.5, 1.5, 10.0], help="Some coefficients for computation.")
        style = Enum(["serial", "parallel"], "serial", help="Only some values are accepted.")

app = App()
print(app.model.style)

Parameters from the example above could be retrieved from the command line with --result_dir "./some_dir" --model.coefficients 0 2.5 10. The application can generate a configuration file, for instance in TOML:

# result_dir = "/data/results"
# ----------
# result_dir (Unicode) default: "/data/results"
# Directory containing results

[model]
# A nested section.

# coefficients = [0.5, 1.5, 10.0]
# ------------
# model.coefficients (List[Float]) default: [0.5, 1.5, 10.0]
# Some coefficients for computation

# style = "serial"
# -----
# model.style (Enum) default: "serial"
# Accepted values: ['serial', 'parallel']
# Only some values are accepted

Data management

Neba aims to ease the creation and management of multiple datasets with different file formats, structures, etc. One dataset can have multiple source files selected via glob patterns and loaded into pandas, while another could have xarray load a remote data store.

Each new dataset is specified by creating a subclass of DataInterface which can then be re-used in various scripts to read or write data easily. The interface contains interchangeable modules that are tasked with retrieving data locations, loading and writing data. Their behavior can depend on parameters held by the interface.

Here is an example of an interface where multiple files are found with a glob pattern, and fed into Xarray:

from neba.data import DataInterface, GlobSource, ParametersDict
from neba.data.xarray import XarrayLoader

class SST(DataInterface):
    # store parameters with a simple dict
    Parameters = ParametersDict

    # load data using xarray
    Loader = XarrayLoader
    Loader.open_mfdataset_kwargs = dict(parallel=True)
    
    # find files on disk using glob
    class Source(GlobSource):
        def get_root_directory(self):
            # we use the parameters of the interface instance
            root = self.parameters["data_dir"]
            # this will automatically be joined into a path
            return [root, "SST"]
            
        def get_filename_pattern(self):
            return f"{self.parameters['year']}/SST_*.nc*"
            
di = SST(year=2000, data_dir="/data")
sst = di.get_data()

We used the parameters and loader modules as is, but we configured the source module for our needs.
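Conceptually, the source module in this example joins the root directory parts into a path and expands the filename pattern under it. The sketch below reproduces that idea with the standard library only, on a temporary directory. It is an assumption about the general behavior, not Neba's implementation, and `find_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def find_files(root_parts, pattern):
    """Join the root directory parts and expand a glob pattern below it."""
    root = Path(*root_parts)
    return sorted(root.glob(pattern))


with tempfile.TemporaryDirectory() as data_dir:
    # Build a small fake dataset: <data_dir>/SST/2000/...
    year_dir = Path(data_dir, "SST", "2000")
    year_dir.mkdir(parents=True)
    for name in ("SST_01.nc", "SST_02.nc4", "notes.txt"):
        (year_dir / name).touch()

    # Same parameters as the interface instance above.
    parameters = {"data_dir": data_dir, "year": 2000}
    files = find_files(
        [parameters["data_dir"], "SST"],
        f"{parameters['year']}/SST_*.nc*",
    )
    found = [path.name for path in files]

print(found)
```

Only the files matching `SST_*.nc*` in the requested year are selected; `notes.txt` is ignored.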

Documentation

https://neba.readthedocs.io/en/latest/

Requirements

Installation

From PyPI:

pip install neba

From source:

git clone https://github.com/Descanonge/neba
cd neba
pip install -e .

or

pip install -e git+https://github.com/Descanonge/neba.git#egg=neba
