Parameter-holding classes with robust subclassing protection

ParamClass

# Install from PyPI
pip install paramclasses

Table of Contents
  1. 👩‍🏫 Rationale
  2. 🧐 Overview
  3. 👩‍💻 Subclassing API
  4. 🤓 Advanced
  5. 🚀 Contributing
  6. ⚖️ License

1. Rationale 👩‍🏫

For a parameter-holding class (like a dataclass), it is convenient to build in some functionality (e.g. a params property returning a dict of the parameters' (key, value) pairs, a missing_params property listing unassigned parameter keys, ...). Inheriting these via subclassing allows specialized functionality to be factored out into context-dependent methods (e.g. fit, reset, plot, etc.). However, such subclassing comes with a risk of attribute conflicts, especially for exposed APIs, when users do not necessarily know every "read-only" (or "protected") attribute of the parent classes.

To solve this problem, we propose a base ParamClass with a @protected decorator, which robustly protects any target attribute from being accidentally overridden when subclassing, at runtime. Note that @dataclass(frozen=True) only applies protection to instances' parameters and can silently override subclass assignments. Alternatives such as typing.final and typing.Final are designed for type checkers, on which we do not want to rely -- from Python 3.11 onwards, final does add a __final__ flag when possible, but it will not affect immutable objects.
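
As a standard-library illustration of the gap described above (it does not use paramclasses, and the class names Base and Child are made up for the example), the following sketch shows that a frozen dataclass blocks assignment on instances, while neither frozen=True nor typing.final prevents a subclass from replacing an attribute at runtime.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import final


@dataclass(frozen=True)
class Base:
    x: int = 0

    @final  # Only a hint for type checkers, not enforced at runtime
    def fit(self) -> str:
        return "base fit"


class Child(Base):
    fit = "overridden"  # Accepted at runtime despite `@final` and `frozen=True`


# Assigning a field on an instance is blocked by `frozen=True`...
try:
    Base().x = 1
    instance_assignment_blocked = False
except FrozenInstanceError:
    instance_assignment_blocked = True

# ... but the class-level override in `Child` went through silently
child_fit = Child().fit
```

Here instance_assignment_blocked ends up True while child_fit is the string "overridden", which is the kind of accidental override @protected is meant to reject.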

Back to Table of Contents 👆

2. Overview 🧐

Defining a paramclass

A paramclass is defined by subclassing ParamClass directly or another paramclass. Similarly to dataclasses, parameters are identified as any annotated attribute and instantiation logic is automatically built-in -- though it can be extended.

from paramclasses import ParamClass

class A(ParamClass):
    parameter_with_a__default_value: ... = "default value"
    parameter_with_no_default_value: ...
    not_a_parameter = "not a parameter"
    def an_actual_method(self): ...
    def a_method_turned_into_a_parameter(self): ...
    a_method_turned_into_a_parameter: ...

Instances have a repr -- which can be overridden in subclasses -- displaying non-default or missing parameter values.

>>> A(parameter_with_a__default_value="non-default value")
A(parameter_with_a__default_value='non-default value', parameter_with_no_default_value=?)

The current parameters dict and the missing parameters of an instance are accessible through the properties params and missing_params respectively.

>>> from pprint import pprint
>>> pprint(A().params)
{'a_method_turned_into_a_parameter': <function A.a_method_turned_into_a_parameter at 0x11067b9a0>,
 'parameter_with_a__default_value': 'default value',
 'parameter_with_no_default_value': ?}
>>> A().missing_params
('parameter_with_no_default_value',)

Note that A().a_method_turned_into_a_parameter is not a bound method -- see Descriptor parameters.
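
To make the "annotated attribute" rule concrete, here is a rough plain-Python approximation of what params and missing_params report. It is only a sketch under assumptions: collect_params and the local MISSING sentinel are hypothetical names for this example, and the real ParamClass machinery is more involved.

```python
MISSING = object()  # Hypothetical sentinel standing in for a "missing value"


def collect_params(obj):
    """Map every annotated name found along the MRO to its current value."""
    names = {}
    for klass in reversed(type(obj).__mro__):
        for name in getattr(klass, "__annotations__", {}):
            names[name] = None  # Dict used as an ordered set
    return {name: getattr(obj, name, MISSING) for name in names}


class A:
    x: ...      # Annotated, no default: reported as missing
    y: ... = 1  # Annotated with default value `1`
    z = 2       # Not annotated: not treated as a parameter


params = collect_params(A())
missing = tuple(name for name, val in params.items() if val is MISSING)
```

With this sketch, params contains only x and y, and missing is ('x',), mirroring the distinction between parameters and ordinary attributes.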

Protecting attributes with @protected

Say we define the following BaseEstimator class.

from paramclasses import ParamClass, protected

class BaseEstimator(ParamClass):
    @protected
    def fit(self, data): ...  # Some fitting logic

Then, we are guaranteed that no subclass can redefine fit.

>>> class Estimator(BaseEstimator):
...     fit = True  # This should FAIL
... 
<traceback>
paramclasses.paramclasses.ProtectedError: Attribute 'fit' is protected

This runtime protection can be applied to all attributes -- with protected(value) --, methods, properties, etc. during class definition but not after. It is "robust" in the sense that breaking the designed behaviour, though possible, requires -- to our knowledge -- obscure patterns.
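
For readers curious how such class-definition-time protection can work in principle, here is a deliberately simplified sketch based on a metaclass. It is not the paramclasses implementation: GuardMeta, guard, _Guarded and GuardError are hypothetical names, and the sketch ignores class-level assignment after creation, instances and multiple-inheritance subtleties.

```python
class GuardError(AttributeError):
    """Raised when a subclass redefines a guarded attribute."""


class _Guarded:
    """Marker wrapper, unwrapped by the metaclass at class creation."""

    def __init__(self, value):
        self.value = value


def guard(value):
    """Mark `value` as guarded; usable as a decorator or as `guard(value)`."""
    return _Guarded(value)


class GuardMeta(type):
    def __new__(mcls, name, bases, namespace, **kwargs):
        # Names already guarded by any base class
        inherited = set()
        for base in bases:
            inherited |= getattr(base, "_guarded_names", frozenset())

        # Refuse to create a class whose body redefines one of them
        clashes = inherited.intersection(namespace)
        if clashes:
            raise GuardError(f"Attribute {sorted(clashes)[0]!r} is guarded")

        # Unwrap newly guarded values and remember their names
        own = set()
        for key, val in list(namespace.items()):
            if isinstance(val, _Guarded):
                namespace[key] = val.value
                own.add(key)
        namespace["_guarded_names"] = frozenset(inherited | own)
        return super().__new__(mcls, name, bases, namespace, **kwargs)


class Base(metaclass=GuardMeta):
    @guard
    def fit(self, data):
        return "fitted"


try:
    class Broken(Base):
        fit = True  # Rejected while the class is being created
except GuardError as err:
    outcome = str(err)
else:
    outcome = "no error"
```

The key design point is that the check runs while the subclass is being built, so the offending class never exists.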

Seamless attributes interactions

Parameters can be assigned values like any other attribute -- unless specifically protected -- with instance.attr = value. It is also possible to set multiple parameters at once with keyword arguments during instantiation, or after with set_params.

class A(ParamClass):
    x: ...      # Parameter without default value
    y: ... = 1  # Parameter with default value `1`
    z = 2       # Non-parameter attribute
>>> A(y=0)                      # Instantiation assignments
A(x=?, y=0)                     # Only shows non-default values
>>> A().set_params(x=0, y=0)    # `set_params` assignments
>>> A().y = 1                   # Usual assignment
>>> del A(x=0).x                # Usual deletion
>>> A.x = 1                     # Class-level assignment/deletion works...
>>> A()
A(x=1)                          # ... and `A` remembers default values -- otherwise would show `A(x=?)`
>>> A().set_params(z=0)         # Should FAIL: Non-parameters cannot be assigned with `set_params`
<traceback>
AttributeError: Invalid parameters: {'z'}. Operation cancelled

Additional functionalities

Callback on parameters updates

Whenever an instance is assigned a value -- instantiation, set_params, dotted assignment -- the callback

def _on_param_will_be_set(self, attr: str, future_val: object) -> None

is triggered. For example, it can be used to unfit an estimator on specific modifications. As suggested by the name and signature, the callback operates just before the future_val assignment. There is currently no counterpart for parameter deletion. This could be added upon motivated interest.
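
The "unfit on modification" idea can be sketched with a plain-Python stand-in. HookedBase below is a hypothetical class that only mimics the calling order described above (callback first, assignment second); with paramclasses, one would subclass ParamClass and override _on_param_will_be_set instead.

```python
class HookedBase:
    """Plain-Python stand-in (hypothetical), not the real ParamClass."""

    def _on_param_will_be_set(self, attr: str, future_val: object) -> None:
        """Default hook: do nothing."""

    def __setattr__(self, attr, val):
        self._on_param_will_be_set(attr, val)  # Runs before the value is stored
        super().__setattr__(attr, val)


class Estimator(HookedBase):
    alpha = 1.0
    fitted_ = True  # Pretend the estimator has already been fitted

    def _on_param_will_be_set(self, attr: str, future_val: object) -> None:
        # `self.alpha` still holds the old value at this point
        if attr == "alpha" and future_val != self.alpha:
            # Bypass the hook to avoid re-entering it
            object.__setattr__(self, "fitted_", False)


est = Estimator()
est.alpha = 1.0                   # Same value: stays fitted
still_fitted = est.fitted_
est.alpha = 2.0                   # New value: hook unfits before assignment
fitted_after_change = est.fitted_
```

Because the hook runs before the assignment, it can compare the incoming value with the current one, which would not be possible with an "after set" callback.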

Instantiation logic with __post_init__

Similarly to dataclasses, a __post_init__ method can be defined to complete instantiation after the initial setting of parameter values. It must have signature

def __post_init__(self, *args: object, **kwargs: object) -> None

and is called as follows by __init__.

# Close equivalent to actual implementation
@protected
def __init__(self, args: list = [], kwargs: dict = {}, /, **param_values: object) -> None:
    self.set_params(**param_values)
    self.__post_init__(*args, **kwargs)

Since parameter values are set before __post_init__ is called, they are accessible when it executes.
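
The calling convention may be easier to see in action. The sketch below reproduces it with a plain-Python stand-in: Base and Scaled are hypothetical classes, setattr replaces set_params, and immutable defaults are used for the two positional-only arguments. It shows how positional and keyword arguments for __post_init__ are passed as a list and a dict, ahead of the parameter values.

```python
class Base:
    """Plain-Python stand-in (hypothetical) mimicking the `__init__` above."""

    def __init__(self, args=(), kwargs=None, /, **param_values):
        for key, val in param_values.items():
            setattr(self, key, val)  # Stand-in for `set_params`
        self.__post_init__(*args, **(kwargs or {}))

    def __post_init__(self, *args, **kwargs):
        """Default: nothing to do."""


class Scaled(Base):
    x = 0

    def __post_init__(self, offset=0, *, scale=1):
        # `self.x` is already set when this runs
        self.result = (self.x + offset) * scale


# First positional: args for `__post_init__`; second: its kwargs; then parameters
obj = Scaled([3], {"scale": 2}, x=1)
plain = Scaled()
```

Here obj.result is (1 + 3) * 2 = 8 and plain.result is 0, since __post_init__ then runs with its defaults.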

Abstract methods

The base ParamClass already inherits ABC functionalities, so @abstractmethod can be used.

from abc import abstractmethod

class A(ParamClass):
    @abstractmethod
    def next(self): ...
>>> A()
<traceback>
TypeError: Can't instantiate abstract class A with abstract method next

3. Subclassing API 👩‍💻

As seen in Additional functionalities, three methods may be overridden by subclasses.

# ===================== Subclasses may override these ======================
def _on_param_will_be_set(self, attr: str, future_val: object) -> None:
    """Call before parameter assignment."""

def __post_init__(self, *args: object, **kwargs: object) -> None:
    """Init logic, after parameters assignment."""

def __repr__(self) -> str:
    """Show all non-default or missing, e.g. `A(x=1, z=?)`."""

Furthermore, as a last resort, developers may occasionally wish to use the following module attributes.

  • DEFAULT: Current value is "__paramclass_default_". Use getattr(self, DEFAULT) to access the dict (mappingproxy) of parameters' (key, default value) pairs.
  • PROTECTED: Current value is "__paramclass_protected_". Use getattr(self, PROTECTED) to access the set (frozenset) of protected attributes.
  • MISSING: The object representing the "missing value" in the default values of parameters. Using instance.missing_params should almost always be enough, but if necessary, use val is MISSING to check for missing values.

Strings DEFAULT and PROTECTED act as special protected keys for paramclasses' namespaces, to leave default and protected available to users. We purposefully chose would-be-mangled names to further decrease odds of natural conflict.

# Recommended way of using `DEFAULT` and `PROTECTED`
from paramclasses import ParamClass, DEFAULT, PROTECTED

getattr(ParamClass, DEFAULT)    # mappingproxy({})
getattr(ParamClass, PROTECTED)  # frozenset({'params', '__getattribute__', '__paramclass_default_', '__paramclass_protected_', 'missing_params', '__setattr__', '__init__', '__delattr__', 'set_params', '__dict__'})
# Works on subclasses and instances too

Finally, when subclassing an external Parent class, one can check whether it is a paramclass with isparamclass.

from paramclasses import isparamclass

isparamclass(Parent)  # Returns a boolean

4. Advanced 🤓

Post-creation protection

Protecting an attribute after class creation is not allowed: the protection is ignored with a warning, while the assignment itself still goes through.

class A(ParamClass):
    x: int = 1
>>> A.x = protected(2)  # Assignment should WORK, protection should FAIL
<stdin>:1: UserWarning: Cannot protect attribute 'x' after class creation. Ignored
>>> a = A(); a
A(x=2)                  # Assignment did work
>>> a.x = protected(3)  # Assignment should WORK, protection should FAIL
<stdin>:1: UserWarning: Cannot protect attribute 'x' on instance assignment. Ignored
>>> a.x
3                       # First protection did fail, new assignment did work
>>> del a.x; a
A(x=2)                  # Second protection did fail

Descriptor parameters

TLDR: using descriptors for parameter values is fine if you know what to expect.

import numpy as np

class Aggregator(ParamClass):
    aggregator: ... = np.cumsum

Aggregator().aggregator([0, 1, 2])  # array([0, 1, 3])

This behaviour is similar to dataclasses' but is not trivial:

class NonParamAggregator:
    aggregator: ... = np.cumsum
>>> NonParamAggregator().aggregator([0, 1, 2])  # Should FAIL
<traceback>
TypeError: 'list' object cannot be interpreted as an integer
>>> NonParamAggregator().aggregator
<bound method cumsum of <__main__.NonParamAggregator object at 0x13a10e7a0>>

Note how NonParamAggregator().aggregator is a bound method. What happened here is that since np.cumsum is a descriptor -- like all function, property or member_descriptor objects for example --, the function np.cumsum(a, axis=None, dtype=None, out=None) interpreted NonParamAggregator() to be the array a, and [0, 1, 2] to be the axis.
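
The same binding mechanism can be reproduced without NumPy. In the sketch below, total and Holder are made-up names; it shows that a plain function stored on a class gets bound on instance access, whereas the raw class-level value, or a value stored directly on the instance, is called without an implicit first argument.

```python
import inspect


def total(values, start=0):
    return sum(values, start)


class Holder:
    aggregator = total  # Plain function stored as a class attribute


holder = Holder()
is_bound = inspect.ismethod(holder.aggregator)  # Binding happened on access

try:
    holder.aggregator([0, 1, 2])  # Actually calls total(holder, [0, 1, 2])
    call_failed = False
except TypeError:
    call_failed = True

# Reading the class namespace directly skips the descriptor protocol
raw_result = vars(Holder)["aggregator"]([0, 1, 2])

# Values stored on the instance itself are never bound
holder.aggregator = total
instance_result = holder.aggregator([0, 1, 2])
```

Both raw_result and instance_result equal 3, while the bound call fails because the Holder instance ends up in the values slot.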

To avoid this kind of surprise we chose, for instances' parameters only, to bypass the get/set/delete descriptor-specific behaviours, and treat them as usual attributes. Contrary to dataclasses, by also bypassing descriptors for set/delete operations, we allow property-valued parameters, for example.

from dataclasses import dataclass

class A(ParamClass):
    x: property = property(lambda _: ...)  # Should WORK

@dataclass
class B:
    x: property = property(lambda _: ...)  # Should FAIL
>>> A()  # paramclass
A()
>>> B()  # dataclass
<traceback>
AttributeError: can't set attribute 'x'

This should not be a very common use case anyway.

Multiple inheritance

Multiple inheritance is not a problem. Default values will be retrieved as expected following the MRO, but there is one caveat: protected attributes should be consistent between the bases. For example, if A.x is not protected while B.x is, one cannot take (A, B) for bases.

class A(ParamClass):
    x: int = 0

class B(ParamClass):
    x: int = protected(1)

>>> class C(B, A): ...  # Should WORK
... 
>>> class D(A, B): ...  # Should FAIL
... 
<traceback>
paramclasses.paramclasses.ProtectedError: Incoherent protection inheritance for attribute 'x'
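
A plain-Python look at the MRO suggests why the order of bases matters here. The classes below are ordinary classes with no protection at all, so this is only an illustration of attribute lookup: with bases (A, B), the value from A would take precedence over the one B is presumed to protect.

```python
class A:
    x = 0


class B:
    x = 1  # Imagine this value is the protected one


class C(B, A): ...  # B comes first in the MRO: B.x wins


class D(A, B): ...  # A comes first in the MRO: A.x would shadow B.x


c_order = [klass.__name__ for klass in C.__mro__]
d_order = [klass.__name__ for klass in D.__mro__]
c_x, d_x = C.x, D.x
```

Here c_x is 1 and d_x is 0: in the second ordering, the value meant to be protected is no longer the one users see, which is consistent with paramclasses rejecting that case.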

Using __slots__

Before using __slots__ with ParamClass, please note the following.

  1. Since the parameters get/set/delete interactions bypass descriptors, using __slots__ on them will not yield the usual behaviour.
  2. You cannot slot a previously protected attribute -- since it would require replacing its value with a member object.
  3. Since ParamClass does not use __slots__, any of its subclasses will still have a __dict__.
  4. The overhead from ParamClass functionality, although not high, probably nullifies any __slots__ optimization in most use cases.
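
Points 2 and 3 rely on standard Python behaviour, which the following sketch illustrates with ordinary classes (NoSlots and Slotted are made-up names, with NoSlots standing in for a base that does not define __slots__).

```python
class NoSlots:
    """Stands in for a base class that does not define `__slots__`."""


class Slotted(NoSlots):
    __slots__ = ("x",)


obj = Slotted()
obj.x = 1  # Stored in the slot
obj.y = 2  # Still allowed: instances inherit a `__dict__` from `NoSlots`

has_dict = hasattr(obj, "__dict__")
instance_dict = dict(vars(obj))

# Slotting replaces the class-level value with a member descriptor
slot_type = type(vars(Slotted)["x"]).__name__
```

The slotted name x lives in a member_descriptor on the class, while y lands in the instance __dict__, so the memory savings usually expected from __slots__ are not obtained.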

Breaking ParamClass protection scheme

There is no such thing as "perfect attribute protection" in Python. As such, ParamClass only provides protection against natural behaviour (and even unnatural to a large extent). Below are some known easy ways to break it, representing discouraged behaviour. If you find other elementary ways, please report them in an issue.

  1. Modifying @protected -- huh?
  2. Modifying or subclassing type(ParamClass) -- requires evil dedication.

Type checkers

The @protected decorator does not act as a decorator in the usual sense: it is a simple wrapper meant to be detected and unwrapped by the metaclass constructing paramclasses. Consequently, type checkers such as mypy may be confused. If necessary, we recommend locally disabling type checking with the following comment -- and the appropriate error-code.

@protected  # type: ignore[error-code]  # mypy is fooled
def my_protected_method(self): ...

It is not ideal and may be fixed in future updates.

5. Contributing 🚀

Questions, issues, discussions and pull requests are welcome! Please do not hesitate to contact me.

Developing with uv

The project is developed with uv which simplifies soooo many things!

# Installing `uv` on Linux and macOS
curl -LsSf https://astral.sh/uv/install.sh | sh
# Using `uv` command may require restarting the bash session

After having installed uv, you can independently use all of the following without ever worrying about installing python or dependencies, or creating virtual environments.

uvx ruff check                        # Check linting
uvx ruff format --diff                # Check formatting
uv run mypy                           # Run mypy
uv pip install -e . && uv run pytest  # Run pytest
uv run python                         # Interactive session in virtual environment

6. License ⚖️

This package is distributed under the MIT License.
