Abstract data dispatch

databackend

The databackend package allows you to register a subclass, without needing to import the subclass itself. This is useful for implementing actions over optional dependencies.

Example

For this example, we’ll implement a function, fill_na(), that fills in missing values in a DataFrame. It works with DataFrame objects from two popular libraries: pandas and polars. Importantly, neither library needs to be installed.

Setup

The code below defines “abstract” parent classes for each of the DataFrame classes in the two libraries.

from databackend import AbstractBackend

class AbstractPandasFrame(AbstractBackend):
    _backends = [("pandas", "DataFrame")]


class AbstractPolarsFrame(AbstractBackend):
    _backends = [("polars", "DataFrame")]

Note that the abstract classes can be used as stand-ins for the real classes in issubclass() and isinstance() checks.

from pandas import DataFrame

issubclass(DataFrame, AbstractPandasFrame)
isinstance(DataFrame(), AbstractPandasFrame)
True

📝 Note that you can use AbstractPandasFrame.register_backend("pandas", "DataFrame") as an alternative way to register backends.

Simple fill_na: isinstance to switch behavior

The fill_na() function below uses custom handling for pandas and polars.

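def fill_na(data, x):
    # Dispatch on the abstract stand-ins; neither library is imported here.
    if isinstance(data, AbstractPolarsFrame):
        # polars stores None as null (distinct from NaN), so use fill_null()
        return data.fill_null(x)
    elif isinstance(data, AbstractPandasFrame):
        return data.fillna(x)
    else:
        raise NotImplementedError(f"No support for class: {type(data)}")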

Notice that neither pandas nor polars needs to be imported when defining fill_na().

Here is an example of calling fill_na() on both kinds of DataFrames.

# test polars ----

import polars as pl

df = pl.DataFrame({"x": [1, 2, None]})
fill_na(df, 3)


# test pandas ----

import pandas as pd

df = pd.DataFrame({"x": [1, 2, None]})
fill_na(df, 3)
     x
0  1.0
1  2.0
2  3.0

The key here is that a user could have only pandas, or only polars, installed. Importantly, performing the isinstance checks does not import any libraries!

Advanced fill_na: generic function dispatch

databackend shines when combined with generic function dispatch. This is a programming approach where you declare a function (e.g. fill_na()), and then register each backend-specific implementation on the function.

Python's standard library implements this approach with functools.singledispatch.

Here is an example of the previous fill_na() function written using it.

from functools import singledispatch

@singledispatch
def fill_na2(data, x):
    raise NotImplementedError(f"No support for class: {type(data)}")


# handle polars ----

@fill_na2.register
def _(data: AbstractPolarsFrame, x):
    # polars stores None as null (distinct from NaN), so use fill_null()
    return data.fill_null(x)


# handle pandas ----

@fill_na2.register
def _(data: AbstractPandasFrame, x):
    return data.fillna(x)

Note two important decorators:

  • @singledispatch defines a default function. This gets called if no specific implementations are found.
  • @fill_na2.register defines specific versions of the function.
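As a self-contained illustration that needs neither pandas nor polars, the sketch below applies the same two decorators to a plain abc.ABC with a virtual subclass. The names AbstractText, Note, and describe() are hypothetical and exist only for this example.

```python
from abc import ABC
from functools import singledispatch


class AbstractText(ABC):
    """Hypothetical abstract parent, used only for this illustration."""


class Note:
    """A plain class that does not inherit from AbstractText."""

    def __init__(self, body):
        self.body = body


# Make Note a virtual subclass, much as a backend class gets registered.
AbstractText.register(Note)


@singledispatch
def describe(data):
    # Default implementation: runs when no registered type matches.
    raise NotImplementedError(f"No support for class: {type(data)}")


@describe.register
def _(data: AbstractText):
    # Chosen for Note instances because Note is a virtual subclass.
    return f"text: {data.body}"


@describe.register
def _(data: int):
    return f"integer: {data}"
```

Calling describe(Note("hi")) selects the AbstractText implementation even though Note never inherits from it, which is the same property that lets singledispatch work with abstract backend classes.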

Here’s an example of it in action.

# example ----

import pandas as pd
import polars as pl

df = pl.DataFrame({"x": [1, 2, None]})
fill_na2(df, 3)

df = pd.DataFrame({"x": [1, 2, None]})
fill_na2(df, 3)
     x
0  1.0
1  2.0
2  3.0

How it works

Under the hood, AbstractBackend behaves similarly to Python’s built-in abc.ABC class.

from abc import ABC

class MyABC(ABC):
    pass

from io import StringIO

MyABC.register(StringIO)


# StringIO is a "virtual subclass" of MyABC
isinstance(StringIO("abc"), MyABC)
True

The key difference is that you can specify the virtual subclass using the tuple ("<mod_name>", "<class_name>").

When issubclass(SomeClass, AbstractBackend) runs, the following happens:

  • The standard ABC caching mechanism is checked, and potentially returns the answer immediately.
  • Otherwise, a subclass hook cycles through the registered backends.
  • The hook runs the subclass check only for backends whose modules are already imported (i.e. are present in sys.modules).
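The lookup steps above can be approximated with the standard library alone. The following is a minimal sketch, not databackend's actual implementation: LazyBackend, AbstractFraction, AbstractMissing, and the module name hypothetical_missing_module are all assumptions made for illustration, and fractions.Fraction stands in for an optional dependency.

```python
import sys
from abc import ABCMeta


class LazyBackend(metaclass=ABCMeta):
    """Hypothetical stand-in for AbstractBackend (illustration only)."""

    _backends = []  # list of (module name, class name) tuples

    @classmethod
    def __subclasshook__(cls, subclass):
        for mod_name, cls_name in cls._backends:
            # Look the module up in sys.modules; this never triggers an import.
            module = sys.modules.get(mod_name)
            if module is None:
                continue
            target = getattr(module, cls_name, None)
            if isinstance(target, type) and issubclass(subclass, target):
                return True
        # Fall back to the regular ABC machinery (registry, real subclasses).
        return NotImplemented


class AbstractFraction(LazyBackend):
    _backends = [("fractions", "Fraction")]


class AbstractMissing(LazyBackend):
    # This module is assumed not to exist, like an uninstalled dependency.
    _backends = [("hypothetical_missing_module", "Thing")]


from fractions import Fraction

half = Fraction(1, 2)
matches_fraction = isinstance(half, AbstractFraction)  # True
matches_missing = isinstance(half, AbstractMissing)    # False
```

The check against AbstractMissing returns False without attempting to import anything, which mirrors how an uninstalled backend is simply skipped.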

Technically, AbstractBackend inherits the usual metaclass machinery from abc.ABCMeta, so that machinery can be used as well.
