Python hacks for type-checking numbers
Copyright and other protections apply.
Please see the accompanying LICENSE
file for rights and restrictions governing use of this software.
All rights not expressly waived or licensed are reserved.
If that file is missing or appears to be modified from its original, then please contact the author before viewing or using this software in any capacity.
Are you defining a numeric interface that should work with more than just ints and floats?
Are you annotating that interface for documentation and type-checking?
Were you excited by PEP 3141’s glitz and gloss promising a clean, straightforward number type definition mechanism, only to learn the hard way (after many hours of searching, tweaking, hacking, and testing ever more convoluted code, again and again) that you couldn’t actually make it work with Python’s type-checking system?
Do you now wonder whether numbers were something new to computing in general because nothing else would explain such a gaping hole in a programming language so popular with the STEM crowd that has been around since the early 1990s?
Does the number 3186 haunt you in your dreams?
Do you find yourself shouting to no one in particular, “There has to be a better way?”
Well I’m here to tell you there isn’t. But until there is, there’s …
numerary
—Now with Protocol Power™
That’s right!
For a hopefully limited time, you too can benefit from someone else’s deranged work-arounds for the enormous chasms in Python that lie between the esoteric fields of computation that are “typing” and “numbers” instead of having to roll your own out of sheer desperation from first principles!
If you still have no idea what I’m talking about, this may help illustrate.
numerary is a pure-Python codified rant for signaling that your interface is usable with non-native numeric primitives[^1] without breaking type-checking.
More simply, numerary aspires to a world where numbers and types can work together.
If you’re thinking that you shouldn’t need a 🤬ing library for that, you’re right.
[^1]:
You know, *super* weird, off-the-wall shit, like members of the [numeric tower](https://docs.python.org/3/library/numbers.html), or [standard library primitives that remain *non*-members for some 🤬ed up reason](https://docs.python.org/3/library/decimal.html), or [legitimate non-members because they predate PEP 3141 and conforming would amount to breaking changes](https://trac.sagemath.org/ticket/28234), or—I don’t know—oodles of libraries and applications that have been around for literally decades that bring huge value to vast scientific and mathematic audiences, but whose number primitives break type-checking if one abides by the ubiquitous bum steer, “I don’t have any experience trying to do what you’re doing, but just use ``float``, bro.”
Because, hey, *🤬* numbers!
Am I right?
This madness should enjoy no audience. It should not exist. Yet here we are. Its author gauges its success by how quickly it can be forgotten, relegated to the annals of superfluous folly.
numerary is licensed under the MIT License.
See the accompanying LICENSE file for details.
It should be considered experimental for now, but should settle down quickly.
See the release notes for a summary of version-to-version changes.
Source code is available on GitHub.
If you find it lacking in any way, please don’t hesitate to bring it to my attention.
You had me at, “numbers and types can work together”
numerary strives to define composable, efficient protocols that one can use to construct numeric requirements.
If all you deal with are integrals and reals, and what you want is broad arithmetic operator compatibility, this will probably get you where you want to go:
>>> from numerary import IntegralLike, RealLike
>>> def deeper_thot(arg: RealLike) -> IntegralLike:
...     assert arg != 0 and arg ** 0 == 1
...     return arg // arg + 42
Beyond default compositions for common use cases, numerary expands on the Supports pattern used in the standard library.
For example, numerary.types.SupportsIntegralOps is a @typing.runtime_checkable protocol that approximates the unary and binary operators introduced by numbers.Integral.
>>> from numerary.types import SupportsIntegralOps
>>> def shift_right_one(arg: SupportsIntegralOps) -> SupportsIntegralOps:
...     assert isinstance(arg, SupportsIntegralOps)
...     return arg >> 1
>>> shift_right_one(2)
1
>>> from sympy import sympify
>>> two = sympify("2") ; type(two)
<class 'sympy.core.numbers.Integer'>
>>> res = shift_right_one(two) ; res
1
>>> type(res)
<class 'sympy.core.numbers.One'>
>>> from fractions import Fraction
>>> shift_right_one(Fraction(1, 2)) # type: ignore [arg-type] # properly caught by Mypy
Traceback (most recent call last):
...
AssertionError
!!! note
    Until 1.9, ``sympy.Integer`` [lacked the requisite bitwise operators](https://github.com/sympy/sympy/issues/19311).
    ``numerary`` catches that!
    The above properly results in both a type-checking error and a runtime failure for [SymPy](https://www.sympy.org/) versions prior to 1.9.
numerary’s Supports protocols can be composed to refine requirements.
For example, let’s say one wanted to ensure type compatibility with primitives that support both __abs__ and __divmod__.
>>> from typing import TypeVar
>>> T_co = TypeVar("T_co", covariant=True)
>>> from numerary.types import (
...     CachingProtocolMeta, Protocol, runtime_checkable,
...     SupportsAbs, SupportsDivmod,
... )
>>> @runtime_checkable
... class MyType(
...     SupportsAbs[T_co], SupportsDivmod[T_co],
...     Protocol, metaclass=CachingProtocolMeta,
... ):
...     pass
>>> my_type: MyType
>>> my_type = 3.5
>>> isinstance(my_type, MyType)
True
>>> abs(my_type)
3.5
>>> divmod(my_type, 2)
(1.0, 1.5)
>>> from fractions import Fraction
>>> my_type = Fraction(22, 7)
>>> isinstance(my_type, MyType)
True
>>> abs(my_type)
Fraction(22, 7)
>>> divmod(my_type, 2)
(1, Fraction(8, 7))
>>> from decimal import Decimal
>>> my_type = Decimal("5.2")
>>> isinstance(my_type, MyType)
True
>>> abs(my_type)
Decimal('5.2')
>>> divmod(my_type, 2)
(Decimal('2'), Decimal('1.2'))
>>> my_type = "nope" # type: ignore [assignment] # properly caught by Mypy
>>> isinstance(my_type, MyType)
False
Remember that scandal where complex defined exception-throwing comparators it wasn’t supposed to have, which confused runtime protocol checking, and then its type definitions lied about it to cover it up?
Yeah, that shit ends here.
>>> from numerary.types import SupportsRealOps
>>> isinstance(1.0, SupportsRealOps) # all good
True
>>> has_real_ops: SupportsRealOps = complex(1) # type: ignore [assignment] # properly caught by Mypy
>>> isinstance(complex(1), SupportsRealOps) # you're not fooling anyone, buddy
False
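For the curious, the underlying confusion can be reproduced with nothing but the standard library. What follows is a minimal sketch, not numerary’s implementation: `SketchSupportsLessThan` and `sketch_can_order` are hypothetical names invented for illustration. It shows that a name-based runtime check is satisfied by complex even though actually ordering complex values raises a TypeError.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SketchSupportsLessThan(Protocol):
    """Hypothetical protocol (not part of numerary) requiring only ``__lt__``."""

    def __lt__(self, other: Any) -> bool: ...


def sketch_can_order(value: Any) -> bool:
    """Return True only if ``value < value`` evaluates without a TypeError."""
    try:
        _ = value < value
    except TypeError:
        return False
    return True


# The runtime check only asks whether the name ``__lt__`` exists ...
name_check_passes = isinstance(complex(1), SketchSupportsLessThan)
# ... which says nothing about whether calling it will actually work.
ordering_works = sketch_can_order(complex(1))
```

Here `name_check_passes` ends up True while `ordering_works` ends up False, which is the kind of mismatch an override has to paper over.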
numerary not only caches runtime protocol evaluations, but allows overriding those evaluations when the default machinery gets it wrong.
>>> from abc import abstractmethod
>>> from numerary.types import CachingProtocolMeta, Protocol, runtime_checkable
>>> @runtime_checkable
... class MySupportsOne(Protocol, metaclass=CachingProtocolMeta):
...     @abstractmethod
...     def one(self) -> int:
...         pass
>>> class Imposter:
...     def one(self) -> str:
...         return "one"
>>> imp: MySupportsOne = Imposter() # type: ignore [assignment] # properly caught by Mypy
>>> isinstance(imp, MySupportsOne) # fool me once, shame on you ...
True
>>> MySupportsOne.excludes(Imposter)
>>> isinstance(imp, MySupportsOne) # ... can't get fooled again
False
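To make the override idea concrete, here is a rough standard-library sketch of how a metaclass might consult explicit overrides before falling back to the default check. It is an assumption-laden illustration, not CachingProtocolMeta itself: `SketchOverridableMeta`, `_SKETCH_OVERRIDES`, `SketchSupportsOne`, and `SketchImposter` are all made up here, the registry is deliberately kept outside the classes so it cannot be mistaken for a protocol member, and no caching is attempted.

```python
from typing import Any, Dict, Protocol, runtime_checkable

# Assumption: overrides live in a module-level registry keyed by protocol class.
_SKETCH_OVERRIDES: Dict[type, Dict[type, bool]] = {}


class SketchOverridableMeta(type(Protocol)):  # type: ignore[misc]
    """Hypothetical metaclass that consults explicit overrides first."""

    def includes(cls, *types: type) -> None:
        for inst_type in types:
            _SKETCH_OVERRIDES.setdefault(cls, {})[inst_type] = True

    def excludes(cls, *types: type) -> None:
        for inst_type in types:
            _SKETCH_OVERRIDES.setdefault(cls, {})[inst_type] = False

    def reset_for(cls, *types: type) -> None:
        for inst_type in types:
            _SKETCH_OVERRIDES.get(cls, {}).pop(inst_type, None)

    def __instancecheck__(cls, instance: Any) -> bool:
        overrides = _SKETCH_OVERRIDES.get(cls, {})
        if type(instance) in overrides:
            return overrides[type(instance)]
        # No override registered: fall back to the default name-based check.
        return super().__instancecheck__(instance)


@runtime_checkable
class SketchSupportsOne(Protocol, metaclass=SketchOverridableMeta):
    def one(self) -> int: ...


class SketchImposter:
    def one(self) -> str:  # wrong return type, but the name matches
        return "one"
```

With this in place, `isinstance(SketchImposter(), SketchSupportsOne)` starts out True, flips to False after `SketchSupportsOne.excludes(SketchImposter)`, and returns to the default answer after `SketchSupportsOne.reset_for(SketchImposter)`.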
numerary has default overrides to correct for known oddities with native types (like our old friend, complex) and with popular libraries like numpy[^2] and sympy.
Others will be added as they are identified.
If I’ve missed any, or if you would like numerary to support additional number implementations out of the box, please let me know.
[^2]:
!!! bug
`numpy` no longer validates on assignment as it once did.
See [posita/numerary#16](https://github.com/posita/numerary/issues/16) for details.
Performance Enhanced Protocols—A different kind of “PEP” for your step
By default, protocols frustrate runtime type-checking performance.
numerary applies two distinct, layered optimization strategies:
- Cached __instancecheck__ results for numerary-defined protocols; and
- Optional(-ish) short-circuit type enumerations.
Cached __instancecheck__ results
To understand why numerary protocols are faster for runtime checks, it helps to understand why non-numerary protocols are so slow.
At runtime (i.e., via isinstance), the default Protocol implementation delegates to type(Protocol).__instancecheck__ to perform a crude comparison of an instance’s callable attributes against the protocol’s.
More attributes means more comparisons.
Further, it performs these comparisons … Every. Single. 🤬ing. Time.
Protocols provided by numerary instead use CachingProtocolMeta as their metaclass.
CachingProtocolMeta derives from type(beartype.typing.Protocol), which caches results based on instance type.
numerary’s version allows for runtime check overrides of those results.
Conceptually:
>>> isinstance(1, SupportsIntegralOps) # first check for an int is delegated to type(Protocol).__instancecheck__
True
>>> isinstance(2, SupportsIntegralOps) # cached result
True
>>> isinstance(1.0, SupportsIntegralOps) # the first check for a float is delegated to type(Protocol).__instancecheck__
False
>>> isinstance(2.0, SupportsIntegralOps) # cached result
False
These offer significant performance improvements, especially where protocols define many methods.
--8<-- "docs/perf_supports_complex.txt"
Source: perf_supports_complex.ipy
--8<-- "docs/perf_supports_complex.ipy"
Unions for inclusion
Sometimes we might want types that don’t comply with protocol definitions to validate anyway (e.g., because we know they will work at runtime).
For example, floats in Python versions prior to 3.9 officially lacked __floor__ and __ceil__ methods, but were registered with the numeric tower and worked just fine with math.floor and math.ceil.
How does numerary’s SupportsFloorCeil deal with this situation?
Not very well, unfortunately, at least not on its own.
>>> import math, sys
>>> from numerary.types import SupportsFloorCeil
>>> def my_dumb_floor_func(arg: SupportsFloorCeil) -> int:
...     assert isinstance(arg, SupportsFloorCeil)  # will work, even for floats, thanks to default overrides
...     return math.floor(arg)  # type: ignore [arg-type]  # doesn't understand SupportsFloorCeil
>>> float_val: float = 1.6180339887
>>> # For illustration only until <https://github.com/python/mypy/issues/5940> is fixed
>>> if sys.version_info < (3, 9):
...     my_dumb_floor_func(float_val)  # type: ignore [arg-type]  # still results in a Mypy error for Python version <3.9
... else:
...     my_dumb_floor_func(float_val)  # validates
1
Unions allow a work-around.
>>> from typing import Union
>>> from numerary.types import SupportsFloorCeil, __floor__
>>> SupportsFloorCeilU = Union[float, SupportsFloorCeil]
>>> def my_floor_func(arg: SupportsFloorCeilU) -> int:
...     assert isinstance(arg, SupportsFloorCeil)
...     return __floor__(arg)
>>> my_floor_func(float(1.2))  # works in 3.8+
1
This is largely a contrived example, since math.floor and math.ceil happily accept SupportsFloat, but it is useful for illustration.
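The same short-circuit pattern can be tried without numerary installed. This is a hedged sketch using only the standard library: `SketchSupportsFloor`, `SketchSupportsFloorU`, `SketchSupportsFloorT`, and `sketch_floor` are hypothetical stand-ins, and the pairing of a Union (for annotations) with a tuple (for isinstance) is an assumption modeled on the `*U` and `*T` naming that appears elsewhere in this document.

```python
import math
from decimal import Decimal
from fractions import Fraction
from typing import Protocol, Union, runtime_checkable


@runtime_checkable
class SketchSupportsFloor(Protocol):
    """Hypothetical stand-in for a protocol requiring ``__floor__``."""

    def __floor__(self) -> int: ...


# The Union is for annotations; the tuple is for isinstance. Listing float
# explicitly lets floats validate even where they lack __floor__.
SketchSupportsFloorU = Union[float, SketchSupportsFloor]
SketchSupportsFloorT = (float, SketchSupportsFloor)


def sketch_floor(arg: SketchSupportsFloorU) -> int:
    if not isinstance(arg, SketchSupportsFloorT):
        raise TypeError(f"{type(arg).__name__} does not support floor")
    return math.floor(arg)


floors = [sketch_floor(v) for v in (1.6180339887, Fraction(7, 2), Decimal("2.5"))]
```

Because isinstance tries tuple members in order, putting cheap concrete types first also means the protocol check is skipped entirely for the common case.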
Limitations
There are some downsides, though. (Aren’t there always?)
Sometimes protocols are too trusting
Protocols trust numeric tower registrations. TODO(@posita): Is this really true? But sometimes, out there in the real world, implementations lie.
Consider:
>>> from numbers import Integral
>>> hasattr(Integral, "real") and hasattr(Integral, "imag")
True
>>> import sympy
>>> pants_on_fire = sympy.Integer(1)
>>> isinstance(pants_on_fire, Integral)
True
>>> hasattr(pants_on_fire, "real") or hasattr(pants_on_fire, "imag") # somebody's tellin' stories
False
>>> from numerary.types import SupportsRealImag
>>> real_imag: SupportsRealImag = pants_on_fire # fails to detect the lie
>>> real_imag.real
Traceback (most recent call last):
...
AttributeError: 'One' object has no attribute 'real'
In this particular case, numerary provides us with a defensive mechanism.
>>> from numerary.types import SupportsRealImagMixedU, real, imag
>>> real_imag_defense: SupportsRealImagMixedU = pants_on_fire
>>> real(real_imag_defense)
1
>>> imag(real_imag_defense)
0
Protocols lose fidelity during runtime checking
At runtime, protocols match names, not signatures.
For example, SupportsNumeratorDenominator’s numerator and denominator properties will match sage.rings.integer.Integer’s similarly named functions.
In other words, isinstance(sage_integer, SupportsNumeratorDenominator) will return True.
Further, if the short-circuiting approach is used, because sage.rings.integer.Integer registers itself with the numeric tower, this may[^3] not be caught by Mypy.
[^3]:
I say *may* because I don’t really understand how Sage’s number registrations work.
>>> class SageLikeRational:
...     def __init__(self, numerator: int, denominator: int = 1):
...         self._numerator = numerator
...         self._denominator = denominator
...     def numerator(self) -> int:
...         return self._numerator
...     def denominator(self) -> int:
...         return self._denominator
>>> from numerary.types import SupportsNumeratorDenominator
>>> frac: SupportsNumeratorDenominator = Fraction(29, 3) # no typing error
>>> sage_rational1: SupportsNumeratorDenominator = SageLikeRational(29, 3) # type: ignore [assignment] # Mypy catches this
>>> isinstance(sage_rational1, SupportsNumeratorDenominator) # isinstance does not
True
>>> sage_rational1.numerator
<...method...numerator...>
>>> frac.numerator
29
Known warts could be cured by cache overriding as discussed above.
However, to combat this particular situation, numerary provides an alternative: the SupportsNumeratorDenominatorMethods protocol and the numerator and denominator helper functions.
These accommodate rational implementations like Sage’s that are mostly compliant with the exception of their respective numerator and denominator implementations.
>>> from numerary.types import numerator
>>> numerator(sage_rational1)
29
>>> numerator(frac)
29
>>> from numerary.types import SupportsNumeratorDenominatorMethods, numerator
>>> sage_rational2: SupportsNumeratorDenominatorMethods = SageLikeRational(3, 29) # no type error
>>> numerator(sage_rational2)
3
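One plausible way such a helper could bridge the property/method divide is sketched below. This is a guess at the general technique, not a description of numerary’s actual numerator function: `sketch_numerator` and `SketchMethodRational` are hypothetical names, and the "call it if it is callable" rule is an assumption.

```python
from fractions import Fraction
from typing import Any


class SketchMethodRational:
    """Hypothetical rational exposing numerator/denominator as methods."""

    def __init__(self, numerator: int, denominator: int = 1):
        self._numerator = numerator
        self._denominator = denominator

    def numerator(self) -> int:
        return self._numerator

    def denominator(self) -> int:
        return self._denominator


def sketch_numerator(arg: Any) -> Any:
    """Return arg's numerator whether it is exposed as a property or a method."""
    member = arg.numerator
    # Assumption: a callable member is a method that must be invoked.
    return member() if callable(member) else member


from_property = sketch_numerator(Fraction(29, 3))
from_method = sketch_numerator(SketchMethodRational(3, 29))
```

The point of funneling access through a function is that callers never need to know which flavor of rational they were handed.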
numerary also defines:
SupportsNumeratorDenominatorMixedU = Union[
    SupportsNumeratorDenominator,
    SupportsNumeratorDenominatorMethods,
]
SupportsNumeratorDenominatorMixedT = (
    SupportsNumeratorDenominator,
    SupportsNumeratorDenominatorMethods,
)
>>> from numerary.types import SupportsNumeratorDenominatorMixedU, numerator
>>> chimera_rational: SupportsNumeratorDenominatorMixedU
>>> chimera_rational = Fraction(29, 3) # no type error
>>> numerator(chimera_rational)
29
>>> chimera_rational = SageLikeRational(3, 29) # still no type error
>>> numerator(chimera_rational)
3
The SupportsNumeratorDenominator* primitives provide the basis for analogous numerary.types.RationalLike* primitives, which should provide sufficient (if idiosyncratic) coverage for dealing with (seemingly mis-appropriately named) rationals.
Pass-through caching with composition is pretty sketchy
This is really getting into where the sausage is made, but full transparency is important, because CachingProtocolMeta does change how protocols are validated at runtime.
Let’s say we register an errant implementation as non-compliant using the CachingProtocolMeta.excludes method.
>>> from numerary.types import SupportsFloat
>>> class FloatImposter:
...     def __float__(self) -> float:
...         raise NotImplementedError("Haha! JK! @#$% you!")
...     def __int__(self) -> int:
...         return 42
>>> float_imp = FloatImposter()
>>> isinstance(float_imp, SupportsFloat)
True
>>> SupportsFloat.excludes(FloatImposter)
>>> isinstance(float_imp, SupportsFloat)
False
For registration to be ergonomic, it should be indelible, survive composition, and afford preference to subsequent overrides by inheritors.
>>> from numerary.types import (
...     CachingProtocolMeta, Protocol, runtime_checkable,
...     SupportsInt,
... )
>>> @runtime_checkable
... class MySupportsFloatInt(
...     SupportsFloat, SupportsInt,
...     Protocol,
... ):
...     pass
>>> isinstance(float_imp, MySupportsFloatInt) # composition picks up override from base
False
>>> SupportsFloat.reset_for(FloatImposter) # base resets override
>>> isinstance(float_imp, SupportsFloat)
True
>>> isinstance(float_imp, MySupportsFloatInt) # picks up base’s changes
True
>>> MySupportsFloatInt.excludes(FloatImposter) # composition overrides
>>> isinstance(float_imp, MySupportsFloatInt)
False
>>> SupportsFloat.includes(FloatImposter) # base changes
>>> isinstance(float_imp, SupportsFloat)
True
>>> isinstance(float_imp, MySupportsFloatInt) # composition remains unchanged
False
>>> MySupportsFloatInt.reset_for(FloatImposter) # removes override in composition
>>> isinstance(float_imp, MySupportsFloatInt) # base is visible again
True
>>> SupportsFloat.excludes(FloatImposter)
>>> isinstance(float_imp, MySupportsFloatInt) # base’s changes are visible to composition again
False
For this to work under the current implementation, we cannot rely exclusively on the standard library’s implementation of __instancecheck__, since it flattens and inspects all properties (with some proprietary exceptions) of all classes in the inheritance tree (in order of the MRO).
In practical terms, this means one can’t easily delegate to an ancestor’s __instancecheck__ method and a protocol’s cache is effectively hidden from its progeny.
In other words, leaning on the default behavior would require one to register exceptions with every inheritor.
That would suck, so let’s not do that.
However, overriding the behavior is problematic, because the standard library uses non-public interfaces to perform its attribute enumeration. We certainly don’t want to re-implement protocol runtime checking from scratch. (At least not yet.)
beartype.typing.Protocol’s metaclass tries to work around this by sneakily limiting its evaluation to directly defined attributes, and then delegating isinstance evaluation to its __base__ classes.
In doing so, it picks up its bases’ then-cached values, but at the cost of re-implementing the attribute check as well as taking a dependency on various implementation details of the standard library, which creates a fragility.
Further, for post-inheritance updates, CachingProtocolMeta extends beartype’s version to implement a simplistic publish/subscribe mechanism that dirties non-overridden caches in inheritors when member protocols’ caches are updated.
That’s completely off the beaten path and there are probably some gremlins hiding out there.
One subtlety is that the implementation deviates from performing checks in MRO order (and may perform redundant checks).
This is probably fine as long as runtime comparisons remain limited to crude checks of whether attributes merely exist.
It would likely fail if runtime checking becomes more sophisticated, at which time this implementation will need to be revisited.
Hopefully by then, we can just delete numerary as the aspirationally unnecessary hack it is and move on with our lives.
(See beartype.typing and numerary’s extension for details.)
License
numerary is licensed under the MIT License.
See the included LICENSE file for details.
Source code is available on GitHub.
Installation
Installation can be performed via PyPI.
% pip install numerary
...
Alternately, you can download the source and install manually.
% git clone https://github.com/posita/numerary.git
...
% cd numerary
% python -m pip install . # -or- python -c 'from setuptools import setup ; setup()' install .
...
Requirements
numerary requires a relatively modern version of Python:
It has the following runtime dependencies:
- beartype for caching protocols
numerary will not use beartype internally unless the NUMERARY_BEARTYPE environment variable is set to a truthy[^4] value before numerary is loaded.
[^4]:
I.e., one of: ``1``, ``on``, ``t``, ``true``, and ``yes``.
See the hacking quick-start for additional development and testing dependencies.
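For readers wondering what a "truthy" environment variable check along the lines of NUMERARY_BEARTYPE might look like, here is a small sketch. It is not taken from numerary’s source: `sketch_env_flag` and `_SKETCH_TRUTHY` are hypothetical, and treating the accepted values case-insensitively (and ignoring surrounding whitespace) is an assumption on my part.

```python
import os
from typing import Mapping, Optional

# The accepted values listed in the footnote above.
_SKETCH_TRUTHY = frozenset(("1", "on", "t", "true", "yes"))


def sketch_env_flag(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the named variable holds one of the truthy values."""
    env = os.environ if environ is None else environ
    # Assumption: comparison is case-insensitive and ignores padding.
    return env.get(name, "").strip().lower() in _SKETCH_TRUTHY


enabled = sketch_env_flag("NUMERARY_BEARTYPE", {"NUMERARY_BEARTYPE": "yes"})
disabled = sketch_env_flag("NUMERARY_BEARTYPE", {})
```

Since the variable is read when numerary is loaded, a check like this would only be meaningful if the value is set before the import happens.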
Customers
- dyce - a pure-Python library for modeling arbitrarily complex dice mechanics and ~~mother~~ birthing code base of numerary!
- 👻 phantom-types - predicates and other type constraints without runtime overhead
- The next one could be you! 👋
Do you have a project that suffers problems made slightly less annoying by numerary?
Let me know, and I’ll promote it here!
And don’t forget to do your part in perpetuating gratuitous badge-ification!
<!-- Markdown -->
As of version 0.4.1, ``dyce`` is
[![numerary-encumbered](https://raw.githubusercontent.com/posita/numerary/latest/docs/numerary-encumbered.svg)][numerary-encumbered]!
[numerary-encumbered]: https://posita.github.io/numerary/ "numerary-encumbered"
..
    reStructuredText - see https://docutils.sourceforge.io/docs/ref/rst/directives.html#image

As of version 0.4.1, ``dyce`` is |numerary-encumbered|!

.. |numerary-encumbered| image:: https://raw.githubusercontent.com/posita/numerary/latest/docs/numerary-encumbered.svg
    :align: top
    :target: https://posita.github.io/numerary/
    :alt: numerary-encumbered
<!-- HTML -->
As of version 0.4.1, <code>dyce</code> is <a href="https://posita.github.io/numerary/"><img
src="https://raw.githubusercontent.com/posita/numerary/latest/docs/numerary-encumbered.svg"
alt="numerary-encumbered"
style="vertical-align: middle;"></a>!