Algebraic data types for Python

# adt adt is a library providing algebraic data types in Python, with a clean, intuitive syntax, and support for typing through a mypy plugin.

NOTE: This project is experimental, and not actively maintained by the author. Contributions and forking are more than welcome.

# What are algebraic data types?

An algebraic data type (also known as an ADT) is a way to represent multiple variants of a single type, each of which can have some data associated with it. The idea is very similar to tagged unions and sum types, which in Python are represented as Enums.

ADTs are useful for a variety of data structures, including binary trees:

@adt
class Tree:
EMPTY: Case
LEAF: Case[int]
NODE: Case["Tree", "Tree"]


Abstract syntax trees (like you might implement as part of a parser, compiler, or interpreter):

@adt
class Expression:
LITERAL: Case[float]
UNARY_MINUS: Case["Expression"]
MINUS: Case["Expression", "Expression"]
MULTIPLY: Case["Expression", "Expression"]
DIVIDE: Case["Expression", "Expression"]


Or more generic versions of a variant type, like an Either type that represents a type A or a type B, but not both:

L = TypeVar('L')
R = TypeVar('R')

class Either(Generic[L, R]):
LEFT: Case[L]
RIGHT: Case[R]


## Pattern matching

Now, defining a type isn't that interesting by itself. A lot of the expressivity of ADTs arises when you pattern match over them (sometimes known as "destructuring").

For example, we could use the Either ADT from above to implement a sort of error handling:

# Defined in some other module, perhaps
def some_operation() -> Either[Exception, int]:
return Either.RIGHT(22)  # Example of building a constructor

# Run some_operation, and handle the success or failure
default_value = 5
unpacked_result = some_operation().match(
# In this case, we're going to ignore any exception we receive
left=lambda ex: default_value,
right=lambda result: result)


Aside: this is very similar to how error handling is implemented in languages like Haskell, because it avoids the unpredictable control flow of raising and catching exceptions, and ensures that callers need to make an explicit decision about what to do in an error case.

One can do the same thing with the Expression type above (just more cases to match):

def handle_expression(e: Expression):
return e.match(
literal=lambda n: ...,
unary_minus=lambda expr: ...,
minus=lambda lhs, rhs: ...,
multiply=lambda lhs, rhs: ...,
divide=lambda lhs, rhs: ...)


## Compared to Enums

ADTs are somewhat similar to Enums from the Python standard library (in fact, the uppercase naming convention is purposely similar).

For example, an Enum version of Expression might look like:

from enum import Enum, auto
class EnumExpression(Enum):
LITERAL = auto()
UNARY_MINUS = auto()
MINUS = auto()
MULTIPLY = auto()
DIVIDE = auto()


However, this doesn't allow data to be associated with each of these enum values. A particular value of Expression will tell you about a kind of expression that exists, but the operands to the expressions still have to be stored elsewhere.

From this perspective, ADTs are like Enums that can optionally have data associated with each case.

## Compared to inheritance

Algebraic data types are a relatively recent introduction to object-oriented programming languages, for the simple reason that inheritance can replicate the same behavior.

Continuing our examples with the Expression ADT, here's how one might represent it with inheritance in Python:

from abc import ABC
class ABCExpression(ABC):
pass

class LiteralExpression(ABCExpression):
def __init__(self, value: float):
pass

class UnaryMinusExpression(ABCExpression):
def __init__(self, inner: ABCExpression):
pass

def __init__(self, lhs: ABCExpression, rhs: ABCExpression):
pass

class MinusExpression(ABCExpression):
def __init__(self, lhs: ABCExpression, rhs: ABCExpression):
pass

class MultiplyExpression(ABCExpression):
def __init__(self, lhs: ABCExpression, rhs: ABCExpression):
pass

class DivideExpression(ABCExpression):
def __init__(self, lhs: ABCExpression, rhs: ABCExpression):
pass


This is noticeably more verbose, and the code to consume these types gets much more complex as well:

e: ABCExpression = UnaryMinusExpression(LiteralExpression(3))  # Example of creating an expression

if isinstance(e, LiteralExpression):
result = ... # do something with e.value
elif isinstance(e, UnaryMinusExpression):
result = ... # do something with e.inner
result = ... # do something with e.lhs and e.rhs
elif isinstance(e, MinusExpression):
result = ... # do something with e.lhs and e.rhs
elif isinstance(e, MultiplyExpression):
result = ... # do something with e.lhs and e.rhs
elif isinstance(e, DivideExpression):
result = ... # do something with e.lhs and e.rhs
else:
raise ValueError(f'Unexpected type of expression: {e}')


ADTs offer a simple way to define a type which is one of a set of possible cases, and allowing data to be associated with each case and packed/unpacked along with it.

## Examples in other programming languages

Algebraic data types are very common in functional programming languages, like Haskell or Scala, but they're gaining increasing acceptance in "mainstream" programming languages as well.

Here are a few examples.

### Rust

Rust enums are actually full-fledged ADTs. Here's how an Either ADT could be defined:

enum Either<L, R> {
Left(L),
Right(R),
}


### Swift

Swift enumerations are very similar to Rust's, and behave like algebraic data types through their support of "associated values."

enum Either<L, R> {
case Left(L)
case Right(R)
}


### TypeScript

TypeScript offers ADTs through a language feature known as "discriminated unions".

See this example from their documentation:

interface Square {
kind: "square";
size: number;
}
interface Rectangle {
kind: "rectangle";
width: number;
height: number;
}
interface Circle {
kind: "circle";
}

type Shape = Square | Rectangle | Circle;


# Installation

To add adt as a library in your Python project, simply run pip (or pip3, as it may be named on your system):

pip install algebraic-data-types


## mypy plugin

The library also comes with a plugin for mypy that enables typechecking of @adt definitions. If you are already using mypy, the plugin is required to avoid nonsensical type errors.

To enable the adt typechecking plugin, add the following to a mypy.ini file in your project's working directory:

[mypy]


To begin defining your own data type, import the @adt decorator and Case[…] annotation:

from adt import adt, Case


Then, define a new Python class, upon which you apply the @adt decorator:

@adt
pass


For each case (variant) that your ADT will have, declare a field with the Case annotation. It's conventional to declare your fields with ALL_UPPERCASE names, but the only true restriction is that they cannot be lowercase.

@adt
FIRST_CASE: Case
SECOND_CASE: Case


If you want to associate some data with a particular case, list the type of that data in brackets after Case (similar to the Generic[…] and Tuple[…] annotations from typing). For example, to add a case with an associated string:

@adt
FIRST_CASE: Case
SECOND_CASE: Case
STRING_CASE: Case[str]


You can build cases with arbitrarily many associated pieces of data, as long as all the types are listed:

@adt
FIRST_CASE: Case
SECOND_CASE: Case
STRING_CASE: Case[str]
LOTS_OF_DATA_CASE: Case[int, str, str, Dict[int, int]]


ADTs can also be recursive—i.e., an ADT can itself be stored alongside a specific case—though the class name has to be provided in double quotes (a restriction which also applies to typing).

A typical example of a recursive ADT is a linked list. Here, the list is also made generic over a type T:

T = TypeVar('T')

NIL: Case


See the library's tests for more examples of complete ADT definitions.

## Generated functionality

Given an ADT defined as follows:

@adt
EMPTY: Case
INTEGER: Case[int]
STRING_PAIR: Case[str, str]


The @adt decorator will automatically generate accessor methods of the following form:

    def empty(self) -> None:
return None

def integer(self) -> int:
... # unpacks int value and returns it

def string_pair(self) -> Tuple[str, str]:
... # unpacks strings and returns them in a tuple


These accessors can be used to obtain the data associated with the ADT case, but accessors will throw an exception if the ADT was not constructed with the matching case. This is a shorthand when you already know the case of an ADT object.

@adt will also automatically generate a pattern-matching method, which can be used when you don't know which case you have ahead of time:

    Result = TypeVar('Result')

def match(self,
empty: Callable[[], Result],
integer: Callable[[int], Result],
string_pair: Callable[[str, str], Result]) -> Result:
if ... self was constructed as EMPTY ...:
return empty()
elif ... self was constructed as INTEGER ...:
return integer(self.integer())
elif ... self was constructed as STRING_PAIR ...:
return string_pair(*self.string_pair())

# if pattern match is incomplete, an exception is raised


See the library's tests for examples of using these generated methods.

@adt will also generate __repr__, __str__, and __eq__ methods (only if they are not defined already), to make ADTs convenient to use by default.

## Custom methods

Arbitrary methods can be defined on ADTs by simply including them in the class definition as normal.

For example, to build "safe" versions of the default accessors on ExampleADT, which return None instead of throwing an exception when the case is incorrect:

@adt
EMPTY: Case
INTEGER: Case[int]
STRING_PAIR: Case[str, str]

@property
def safe_integer(self) -> Optional[int]:
return self.match(empty=lambda: None,
integer=lambda n: n,
string_pair=lambda _a, _b: None)

@property
def safe_string_pair(self) -> Optional[Tuple[str, str]]:
return self.match(empty=lambda: None,
integer=lambda n: None,
string_pair=lambda a, b: (a, b))


However, additional fields must not be added to the class, as the decorator will attempt to interpret them as ADT Cases (which will fail).

## Project details

This version 0.2.1 0.2 0.1.3 0.1.2 0.1.1 0.1.0

Uploaded source
Uploaded py3