more-kedro 🛠️

A collection of utilities and extensions for Kedro

Installation

$ pip install more-kedro

hooks.TypedParameters

Enables on-the-fly typing and validation of your parameter dictionaries.

Usage

Activate by adding the TypedParameters hook to your KedroContext:

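from kedro.framework.context import KedroContext
from more_kedro.hooks import TypedParameters


class ProjectContext(KedroContext):
    # Register the hook so parameters are typed once the DataCatalog is created.
    hooks = (
        TypedParameters(),
    )

    ...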

Now you can specify types in your parameters.yml:

training__type: my_project.nodes.model.TrainingParams
training:
  num_iter: 100
  learning_rate: 0.001

Or, if you pass TypedParameters(inline=True), put the type inside the parameter group itself:

training:
  type: my_project.nodes.model.TrainingParams
  num_iter: 100
  learning_rate: 0.001

The benefit of the first approach is that you can override your parameter values in conf/local/ without having to respecify the types.
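The difference between the two layouts can be illustrated with plain dictionaries. This is only a sketch: it assumes that a later configuration environment replaces top-level keys wholesale (modelled here with dictionary unpacking), and the names base, inline_base, and local are hypothetical stand-ins for the loaded YAML files.

```python
# Base configuration using the separate "<name>__type" key.
base = {
    "training__type": "my_project.nodes.model.TrainingParams",
    "training": {"num_iter": 100, "learning_rate": 0.001},
}

# Base configuration using the inline "type" key instead.
inline_base = {
    "training": {
        "type": "my_project.nodes.model.TrainingParams",
        "num_iter": 100,
        "learning_rate": 0.001,
    },
}

# Hypothetical local override that only changes the values.
local = {"training": {"num_iter": 10, "learning_rate": 0.001}}

# Assumption: later environments replace top-level keys as a whole.
merged = {**base, **local}
inline_merged = {**inline_base, **local}

# The separate key survives the override; the inline key is lost.
type_kept = "training__type" in merged
inline_type_kept = "type" in inline_merged["training"]
```

Under that assumption, the inline layout would need the type repeated in every override, while the separate key does not.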

Any node that has an input params:training will now be injected with the equivalent of TrainingParams(num_iter=100, learning_rate=0.001) instead of a raw dictionary. You can use any custom class, dataclass, pydantic model, or other callable to get validation and typing of your parameters. The type value must be the full dotted path to your type object (module location plus name), so that it can be imported from the root of your project.
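As a rough illustration of what this means in practice, the sketch below defines a hypothetical TrainingParams dataclass and shows one way a dotted path could be resolved and called with the raw dictionary. The helpers resolve_type and build_typed are assumptions made for this example and are not taken from more-kedro's source.

```python
from dataclasses import dataclass
from importlib import import_module
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class TrainingParams:
    """Hypothetical parameter class matching the YAML example."""

    num_iter: int
    learning_rate: float

    def __post_init__(self) -> None:
        # Validation runs as soon as the object is constructed.
        if self.num_iter <= 0:
            raise ValueError("num_iter must be positive")
        if not 0 < self.learning_rate < 1:
            raise ValueError("learning_rate must be in (0, 1)")


def resolve_type(dotted_path: str) -> Callable[..., Any]:
    """Import 'package.module.Name' and return the object called Name."""
    module_path, _, name = dotted_path.rpartition(".")
    return getattr(import_module(module_path), name)


def build_typed(dotted_path: str, raw: Dict[str, Any]) -> Any:
    """Call the resolved type with the raw dictionary as keyword arguments."""
    return resolve_type(dotted_path)(**raw)


def train(params: TrainingParams) -> str:
    # A node function receives the typed object rather than a dictionary.
    return f"{params.num_iter} iterations at lr={params.learning_rate}"
```

With a class like this, a misconfigured value such as num_iter: 0 raises a ValueError when the object is built, rather than somewhere inside the node.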

The parameters are typed right after your DataCatalog is created, so any failures will surface before your kedro run starts.

datasets.TryLoadDataSet

A dataset that delegates loading and saving to an underlying dataset definition, but returns a default value if the load method raises an exception. It is useful when some data is optional to the pipeline.

Usage

TryLoadDataSet takes two arguments: dataset, which is a normal dataset definition, and an optional default_value, which is the value returned if the load fails (defaults to None). Example of an entry in catalog.yml:

companies:
  type: more_kedro.datasets.TryLoadDataSet
  dataset:
    type: pandas.CSVDataSet
    filepath: "path/to/companies.csv"
  default_value: null
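The fallback behaviour described above can be sketched in a few lines of plain Python. This is an illustrative stand-in rather than the library's implementation: TryLoadSketch and InMemoryFile are hypothetical names, and the underlying dataset is only assumed to expose load() and save().

```python
from typing import Any


class TryLoadSketch:
    """Illustrative wrapper: delegate to a dataset, fall back on load errors."""

    def __init__(self, dataset: Any, default_value: Any = None) -> None:
        self._dataset = dataset
        self._default_value = default_value

    def load(self) -> Any:
        try:
            return self._dataset.load()
        except Exception:
            # Any failure in the underlying load yields the default value.
            return self._default_value

    def save(self, data: Any) -> None:
        # Saving is passed straight through to the underlying dataset.
        self._dataset.save(data)


class InMemoryFile:
    """Hypothetical dataset that fails to load until something is saved."""

    def __init__(self) -> None:
        self._data: Any = None
        self._exists = False

    def load(self) -> Any:
        if not self._exists:
            raise FileNotFoundError("no data saved yet")
        return self._data

    def save(self, data: Any) -> None:
        self._data = data
        self._exists = True


companies = TryLoadSketch(InMemoryFile(), default_value=[])
first = companies.load()   # underlying load raises, so the default is used
companies.save(["acme"])
second = companies.load()  # underlying load now succeeds
```

Downstream nodes then need to handle the default value (None unless configured otherwise) as a valid input.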

Contributions

If you have any useful Kedro utilities, such as runners, hooks, or datasets, PRs are very welcome!
