Python data structure and operations for intervals
Project description
portion  data structure and operations for intervals
The portion
library provides data structure and operations for intervals in Python 3.6+.
 Support intervals of any (comparable) objects.
 Closed or open, finite or (semi)infinite intervals.
 Interval sets (union of atomic intervals) are supported.
 Automatic simplification of intervals.
 Support comparison, transformation, intersection, union, complement, difference and containment.
 Provide test for emptiness, atomicity, overlap and adjacency.
 Discrete iterations on the values of an interval.
 Dictlike structure to map intervals to data.
 Import and export intervals to strings and to Python builtin data types.
 Heavily tested with high code coverage.
Table of contents
Installation
You can use pip
to install it, as usual: pip install portion
. This will install the latest available version from PyPI.
Prereleases are available from the master branch on GitHub
and can be installed with pip install git+https://github.com/AlexandreDecan/portion
.
Note that portion
is also available on condaforge.
You can install portion
and its development environment using pip install e .[test]
at the root of this repository. This automatically installs pytest (for the test suites) and black (for code formatting).
Documentation & usage
Interval creation
Assuming this library is imported using import portion as P
, intervals can be easily
created using one of the following helpers:
>>> P.open(1, 2)
(1,2)
>>> P.closed(1, 2)
[1,2]
>>> P.openclosed(1, 2)
(1,2]
>>> P.closedopen(1, 2)
[1,2)
>>> P.singleton(1)
[1]
>>> P.empty()
()
The bounds of an interval can be any arbitrary values, as long as they are comparable:
>>> P.closed(1.2, 2.4)
[1.2,2.4]
>>> P.closed('a', 'z')
['a','z']
>>> import datetime
>>> P.closed(datetime.date(2011, 3, 15), datetime.date(2013, 10, 10))
[datetime.date(2011, 3, 15),datetime.date(2013, 10, 10)]
Infinite and semiinfinite intervals are supported using P.inf
and P.inf
as upper or lower bounds.
These two objects support comparison with any other object.
When infinities are used as a lower or upper bound, the corresponding boundary is automatically converted to an open one.
>>> P.inf > 'a', P.inf > 0, P.inf > True
(True, True, True)
>>> P.openclosed(P.inf, 0)
(inf,0]
>>> P.closed(P.inf, P.inf) # Automatically converted to an open interval
(inf,+inf)
Intervals created with this library are Interval
instances.
An Interval
instance is a disjunction of atomic intervals each representing a single interval (e.g. [1,2]
).
Intervals can be iterated to access the underlying atomic intervals, sorted by their lower and upper bounds.
>>> list(P.open(10, 11)  P.closed(0, 1)  P.closed(20, 21))
[[0,1], (10,11), [20,21]]
>>> list(P.empty())
[]
Nested (sorted) intervals can also be retrieved with a position or a slice:
>>> (P.open(10, 11)  P.closed(0, 1)  P.closed(20, 21))[0]
[0,1]
>>> (P.open(10, 11)  P.closed(0, 1)  P.closed(20, 21))[2]
(10,11)
>>> (P.open(10, 11)  P.closed(0, 1)  P.closed(20, 21))[:2]
[0,1]  (10,11)
For convenience, intervals are automatically simplified:
>>> P.closed(0, 2)  P.closed(2, 4)
[0,4]
>>> P.closed(1, 2)  P.closed(3, 4)  P.closed(2, 3)
[1,4]
>>> P.empty()  P.closed(0, 1)
[0,1]
>>> P.closed(1, 2)  P.closed(2, 3)  P.closed(4, 5)
[1,3]  [4,5]
Note that simplification of discrete intervals is not supported by portion
(but it can be simulated though, see #24).
For example, combining [0,1]
with [2,3]
will not result in [0,3]
even if there is
no integer between 1
and 2
.
Interval bounds & attributes
An Interval
defines the following properties:

i.empty
isTrue
if and only if the interval is empty.>>> P.closed(0, 1).empty False >>> P.closed(0, 0).empty False >>> P.openclosed(0, 0).empty True >>> P.empty().empty True

i.atomic
isTrue
if and only if the interval is empty or is a disjunction of a single interval.>>> P.empty().atomic True >>> P.closed(0, 2).atomic True >>> (P.closed(0, 1)  P.closed(1, 2)).atomic True >>> (P.closed(0, 1)  P.closed(2, 3)).atomic False

i.enclosure
refers to the smallest atomic interval that includes the current one.>>> (P.closed(0, 1)  P.open(2, 3)).enclosure [0,3)
The left and right boundaries, and the lower and upper bounds of an interval can be respectively accessed
with its left
, right
, lower
and upper
attributes.
The left
and right
bounds are either P.CLOSED
or P.OPEN
.
By definition, P.CLOSED == ~P.OPEN
and viceversa.
>> P.CLOSED, P.OPEN
CLOSED, OPEN
>>> x = P.closedopen(0, 1)
>>> x.left, x.lower, x.upper, x.right
(CLOSED, 0, 1, OPEN)
By convention, empty intervals resolve to (P.inf, P.inf)
:
>>> i = P.empty()
>>> i.left, i.lower, i.upper, i.right
(OPEN, +inf, inf, OPEN)
If the interval is not atomic, then left
and lower
refer to the lower bound of its enclosure,
while right
and upper
refer to the upper bound of its enclosure:
>>> x = P.open(0, 1)  P.closed(3, 4)
>>> x.left, x.lower, x.upper, x.right
(OPEN, 0, 4, CLOSED)
One can easily check for some interval properties based on the bounds of an interval:
>>> x = P.openclosed(P.inf, 0)
>>> # Check that interval is left/right closed
>>> x.left == P.CLOSED, x.right == P.CLOSED
(False, True)
>>> # Check that interval is left/right bounded
>>> x.lower == P.inf, x.upper == P.inf
(True, False)
>>> # Check for singleton
>>> x.lower == x.upper
False
Interval operations
Interval
instances support the following operations:

i.intersection(other)
andi & other
return the intersection of two intervals.>>> P.closed(0, 2) & P.closed(1, 3) [1,2] >>> P.closed(0, 4) & P.open(2, 3) (2,3) >>> P.closed(0, 2) & P.closed(2, 3) [2] >>> P.closed(0, 2) & P.closed(3, 4) ()

i.union(other)
andi  other
return the union of two intervals.>>> P.closed(0, 1)  P.closed(1, 2) [0,2] >>> P.closed(0, 1)  P.closed(2, 3) [0,1]  [2,3]

i.complement(other)
and~i
return the complement of the interval.>>> ~P.closed(0, 1) (inf,0)  (1,+inf) >>> ~(P.open(P.inf, 0)  P.open(1, P.inf)) [0,1] >>> ~P.open(P.inf, P.inf) ()

i.difference(other)
andi  other
return the difference betweeni
andother
.>>> P.closed(0,2)  P.closed(1,2) [0,1) >>> P.closed(0, 4)  P.closed(1, 2) [0,1)  (2,4]

i.contains(other)
andother in i
hold if given item is contained in the interval. It supports intervals and arbitrary comparable values.>>> 2 in P.closed(0, 2) True >>> 2 in P.open(0, 2) False >>> P.open(0, 1) in P.closed(0, 2) True

i.adjacent(other)
tests if the two intervals are adjacent, i.e., if they do not overlap and their union form a single atomic interval. While this definition corresponds to the usual notion of adjacency for atomic intervals, it has stronger requirements for nonatomic ones since it requires all underlying atomic intervals to be adjacent (i.e. that one interval fills the gaps between the atomic intervals of the other one).>>> P.closed(0, 1).adjacent(P.openclosed(1, 2)) True >>> P.closed(0, 1).adjacent(P.closed(1, 2)) False >>> (P.closed(0, 1)  P.closed(2, 3)).adjacent(P.open(1, 2)  P.open(3, 4)) True >>> (P.closed(0, 1)  P.closed(2, 3)).adjacent(P.open(3, 4)) False >>> P.closed(0, 1).adjacent(P.open(1, 2)  P.open(3, 4)) False

i.overlaps(other)
tests if there is an overlap between two intervals.>>> P.closed(1, 2).overlaps(P.closed(2, 3)) True >>> P.closed(1, 2).overlaps(P.open(2, 3)) False
Finally, intervals are hashable as long as their bounds are hashable (and we have defined a hash value for P.inf
and P.inf
).
Comparison operators
Equality between intervals can be checked with the classical ==
operator:
>>> P.closed(0, 2) == P.closed(0, 1)  P.closed(1, 2)
True
>>> P.closed(0, 2) == P.open(0, 2)
False
Moreover, intervals are comparable using >
, >=
, <
or <=
.
These comparison operators have a different behaviour than the usual ones.
For instance, a < b
holds if all values in a
are lower than the minimal value of b
(i.e., a
is
entirely on the left of the lower bound of b
).
>>> P.closed(0, 1) < P.closed(2, 3)
True
>>> P.closed(0, 1) < P.closed(1, 2)
False
Similarly, a <= b
if all values in a
are lower than the maximal value of b
(i.e., a
is
entirely on the left of the upper bound of b
).
>>> P.closed(0, 1) <= P.closed(2, 3)
True
>>> P.closed(0, 2) <= P.closed(1, 3)
True
>>> P.closed(0, 3) <= P.closed(1, 2)
False
If an interval needs to be compared against a single value, convert the value to a singleton interval first:
>>> P.singleton(0) < P.closed(0, 10)
False
>>> P.singleton(0) <= P.closed(0, 10)
True
>>> P.singleton(5) <= P.closed(0, 10)
True
>>> P.closed(0, 1) < P.singleton(2)
True
Note that all these semantics differ from classical comparison operators.
As a consequence, the empty interval is never <
, <=
, >
nor >=
than any other interval, and
no interval is <
, >
, <=
or >=
when compared to the empty interval.
>>> e = P.empty()
>>> e < e or e > e or e <= e or e >= e
False
>>> i = P.closed(0, 1)
>>> e < i or e <= i or e > i or e >= i
False
Moreover, some nonempty intervals are also not comparable in the classical sense, as illustrated hereafter:
>>> a, b = P.closed(0, 4), P.closed(1, 2)
>>> a < b or a > b
False
>>> a <= b or a >= b
False
>>> b <= a and b >= a
True
As a general rule, if a < b
holds, then a <= b
, b > a
, b >= a
,
not (a > b)
, not (b < a)
, not (a >= b)
, and not (b <= a)
hold.
Interval transformation
Intervals are immutable but provide a replace
method to create a new interval based on the
current one. This method accepts four optional parameters left
, lower
, upper
, and right
:
>>> i = P.closed(0, 2)
>>> i.replace(P.OPEN, 1, 3, P.CLOSED)
(1,3]
>>> i.replace(lower=1, right=P.OPEN)
[1,2)
Functions can be passed instead of values. If a function is passed, it is called with the current corresponding value.
>>> P.closed(0, 2).replace(upper=lambda x: 2 * x)
[0,4]
The provided function won't be called on infinities, unless ignore_inf
is set to False
.
>>> i = P.closedopen(0, P.inf)
>>> i.replace(upper=lambda x: 10) # No change, infinity is ignored
[0,+inf)
>>> i.replace(upper=lambda x: 10, ignore_inf=False) # Infinity is not ignored
[0,10)
When replace
is applied on an interval that is not atomic, it is extended and/or restricted such that
its enclosure satisfies the new bounds.
>>> i = P.openclosed(0, 1)  P.closed(5, 10)
>>> i.replace(P.CLOSED, 1, 8, P.OPEN)
[1,1]  [5,8)
>>> i.replace(lower=4)
(4,10]
To apply arbitrary transformations on the underlying atomic intervals, intervals expose an apply
method that acts like map
.
This method accepts a function that will be applied on each of the underlying atomic intervals to perform the desired transformation.
The provided function is expected to return either an Interval
, or a 4uple (left, lower, upper, right)
.
>>> i = P.closed(2, 3)  P.open(4, 5)
>>> # Increment bound values
>>> i.apply(lambda x: (x.left, x.lower + 1, x.upper + 1, x.right))
[3,4]  (5,6)
>>> # Invert bounds
>>> i.apply(lambda x: (~x.left, x.lower, x.upper, ~x.right))
(2,3)  [4,5]
The apply
method is very powerful when used in combination with replace
.
Because the latter allows functions to be passed as parameters and ignores infinities by default, it can be
conveniently used to transform (disjunction of) intervals in presence of infinities.
>>> i = P.openclosed(P.inf, 0)  P.closed(3, 4)  P.closedopen(8, P.inf)
>>> # Increment bound values
>>> i.apply(lambda x: x.replace(upper=lambda v: v + 1))
(inf,1]  [3,5]  [8,+inf)
>>> # Intervals are still automatically simplified
>>> i.apply(lambda x: x.replace(lower=lambda v: v * 2))
(inf,0]  [16,+inf)
>>> # Invert bounds
>>> i.apply(lambda x: x.replace(left=lambda v: ~v, right=lambda v: ~v))
(inf,0)  (3,4)  (8,+inf)
>>> # Replace infinities with 10 and 10
>>> conv = lambda v: 10 if v == P.inf else (10 if v == P.inf else v)
>>> i.apply(lambda x: x.replace(lower=conv, upper=conv, ignore_inf=False))
(10,0]  [3,4]  [8,10)
Discrete iteration
The iterate
function takes an interval, and returns a generator to iterate over
the values of an interval. Obviously, as intervals are continuous, it is required to specify the
step
between consecutive values. The iteration then starts from the lower bound and ends on the upper one. Only values contained by the interval are returned this way.
>>> list(P.iterate(P.closed(0, 3), step=1))
[0, 1, 2, 3]
>>> list(P.iterate(P.closed(0, 3), step=2))
[0, 2]
>>> list(P.iterate(P.open(0, 3), step=2))
[2]
When an interval is not atomic, iterate
consecutively iterates on all underlying atomic
intervals, starting from each lower bound and ending on each upper one:
>>> list(P.iterate(P.singleton(0)  P.singleton(3)  P.singleton(5), step=2)) # Won't be [0]
[0, 3, 5]
>>> list(P.iterate(P.closed(0, 2)  P.closed(4, 6), step=3)) # Won't be [0, 6]
[0, 4]
By default, the iteration always starts on the lower bound of each underlying atomic interval.
The base
parameter can be used to change this behaviour, by specifying how the initial value to start
the iteration from must be computed. This parameter accepts a callable that is called with the lower
bound of each underlying atomic interval, and that returns the initial value to start the iteration from.
It can be helpful to deal with (semi)infinite intervals, or to align the generated values of the iterator:
>>> # Align on integers
>>> list(P.iterate(P.closed(0.3, 4.9), step=1, base=int))
[1, 2, 3, 4]
>>> # Restrict values of a (semi)infinite interval
>>> list(P.iterate(P.openclosed(P.inf, 2), step=1, base=lambda x: max(0, x)))
[0, 1, 2]
The base
parameter can be used to change how iterate
applies on unions of atomic interval, by
specifying a function that returns a single value, as illustrated next:
>>> base = lambda x: 0
>>> list(P.iterate(P.singleton(0)  P.singleton(3)  P.singleton(5), step=2, base=base))
[0]
>>> list(P.iterate(P.closed(0, 2)  P.closed(4, 6), step=3, base=base))
[0, 6]
Notice that defining base
such that it returns a single value can be extremely inefficient in
terms of performance when the intervals are "far apart" each other (i.e., when the gaps between
atomic intervals are large).
Finally, iteration can be performed in reverse order by specifying reverse=True
.
>>> list(P.iterate(P.closed(0, 3), step=1, reverse=True)) # Mind step=1
[3, 2, 1, 0]
>>> list(P.iterate(P.closed(0, 3), step=2, reverse=True)) # Mind step=2
[3, 1]
Again, this library does not make any assumption about the objects being used in an interval, as long as they
are comparable. However, it is not always possible to provide a meaningful value for step
(e.g., what would
be the step between two consecutive characters?). In these cases, a callable can be passed instead of a value.
This callable will be called with the current value, and is expected to return the next possible value.
>>> list(P.iterate(P.closed('a', 'd'), step=lambda d: chr(ord(d) + 1)))
['a', 'b', 'c', 'd']
>>> # Since we reversed the order, we changed plus to minus in step.
>>> list(P.iterate(P.closed('a', 'd'), step=lambda d: chr(ord(d)  1), reverse=True))
['d', 'c', 'b', 'a']
Map intervals to data
The library provides an IntervalDict
class, a dict
like data structure to store and query data
along with intervals. Any value can be stored in such data structure as long as it supports
equality.
>>> d = P.IntervalDict()
>>> d[P.closed(0, 3)] = 'banana'
>>> d[4] = 'apple'
>>> d
{[0,3]: 'banana', [4]: 'apple'}
When a value is defined for an interval that overlaps an existing one, it is automatically updated to take the new value into account:
>>> d[P.closed(2, 4)] = 'orange'
>>> d
{[0,2): 'banana', [2,4]: 'orange'}
An IntervalDict
can be queried using single values or intervals. If a single value is used as a
key, its behaviour corresponds to the one of a classical dict
:
>>> d[2]
'orange'
>>> d[5] # Key does not exist
Traceback (most recent call last):
...
KeyError: 5
>>> d.get(5, default=0)
0
When the key is an interval, a new IntervalDict
containing the values
for the specified key is returned:
>>> d[~P.empty()] # Get all values, similar to d.copy()
{[0,2): 'banana', [2,4]: 'orange'}
>>> d[P.closed(1, 3)]
{[1,2): 'banana', [2,3]: 'orange'}
>>> d[P.closed(2, 1)]
{[0,1]: 'banana'}
>>> d[P.closed(2, 1)]
{}
By using .get
, a default value (defaulting to None
) can be specified.
This value is used to "fill the gaps" if the queried interval is not completely
covered by the IntervalDict
:
>>> d.get(P.closed(2, 1), default='peach')
{[2,0): 'peach', [0,1]: 'banana'}
>>> d.get(P.closed(2, 1), default='peach')
{[2,1]: 'peach'}
>>> d.get(P.singleton(1), default='peach') # Key is covered, default is not used
{[1]: 'banana'}
For convenience, an IntervalDict
provides a way to look for specific data values.
The .find
method always returns a (possibly empty) Interval
instance for which given
value is defined:
>>> d.find('banana')
[0,2)
>>> d.find('orange')
[2,4]
>>> d.find('carrot')
()
The active domain of an IntervalDict
can be retrieved with its .domain
method.
This method always returns a single Interval
instance, where .keys
returns a sorted view of disjoint intervals.
>>> d.domain()
[0,4]
>>> list(d.keys())
[[0,2), [2,4]]
>>> list(d.values())
['banana', 'orange']
>>> list(d.items())
[([0,2), 'banana'), ([2,4], 'orange')]
The .keys
, .values
and .items
methods return exactly one element for each stored value (i.e., if two intervals share a value, they are merged into a disjunction), as illustrated next.
See #44 to know how to obtained a sorted list of atomic intervals instead.
>>> d = P.IntervalDict()
>>> d[P.closed(0, 1)] = d[P.closed(2, 3)] = 'peach'
>>> list(d.items())
[([0,1]  [2,3], 'peach')]
Two IntervalDict
instances can be combined together using the .combine
method.
This method returns a new IntervalDict
whose keys and values are taken from the two
source IntervalDict
. Values corresponding to nonintersecting keys are simply copied,
while values corresponding to intersecting keys are combined together using the provided
function, as illustrated hereafter:
>>> d1 = P.IntervalDict({P.closed(0, 2): 'banana'})
>>> d2 = P.IntervalDict({P.closed(1, 3): 'orange'})
>>> concat = lambda x, y: x + '/' + y
>>> d1.combine(d2, how=concat)
{[0,1): 'banana', [1,2]: 'banana/orange', (2,3]: 'orange'}
Resulting keys always correspond to an outer join. Other joins can be easily simulated
by querying the resulting IntervalDict
as follows:
>>> d = d1.combine(d2, how=concat)
>>> d[d1.domain()] # Left join
{[0,1): 'banana', [1,2]: 'banana/orange'}
>>> d[d2.domain()] # Right join
{[1,2]: 'banana/orange', (2,3]: 'orange'}
>>> d[d1.domain() & d2.domain()] # Inner join
{[1,2]: 'banana/orange'}
Finally, similarly to a dict
, an IntervalDict
also supports len
, in
and del
, and defines
.clear
, .copy
, .update
, .pop
, .popitem
, and .setdefault
.
For convenience, one can export the content of an IntervalDict
to a classical Python dict
using
the as_dict
method. This method accepts an optional atomic
parameter (whose default is False
). When set to True
, the keys of the resulting dict
instance are atomic intervals.
Import & export intervals to strings
Intervals can be exported to string, either using repr
(as illustrated above) or with the to_string
function.
>>> P.to_string(P.closedopen(0, 1))
'[0,1)'
The way string representations are built can be easily parametrized using the various parameters supported by
to_string
:
>>> params = {
... 'disj': ' or ',
... 'sep': '  ',
... 'left_closed': '<',
... 'right_closed': '>',
... 'left_open': '..',
... 'right_open': '..',
... 'pinf': '+oo',
... 'ninf': 'oo',
... 'conv': lambda v: '"{}"'.format(v),
... }
>>> x = P.openclosed(0, 1)  P.closed(2, P.inf)
>>> P.to_string(x, **params)
'.."0"  "1"> or <"2"  +oo..'
Similarly, intervals can be created from a string using the from_string
function.
A conversion function (conv
parameter) has to be provided to convert a bound (as string) to a value.
>>> P.from_string('[0, 1]', conv=int) == P.closed(0, 1)
True
>>> P.from_string('[1.2]', conv=float) == P.singleton(1.2)
True
>>> converter = lambda s: datetime.datetime.strptime(s, '%Y/%m/%d')
>>> P.from_string('[2011/03/15, 2013/10/10]', conv=converter)
[datetime.datetime(2011, 3, 15, 0, 0),datetime.datetime(2013, 10, 10, 0, 0)]
Similarly to to_string
, function from_string
can be parametrized to deal with more elaborated inputs.
Notice that as from_string
expects regular expression patterns, we need to escape some characters.
>>> s = '.."0"  "1"> or <"2"  +oo..'
>>> params = {
... 'disj': ' or ',
... 'sep': '  ',
... 'left_closed': '<',
... 'right_closed': '>',
... 'left_open': r'\.\.', # from_string expects regular expression patterns
... 'right_open': r'\.\.', # from_string expects regular expression patterns
... 'pinf': r'\+oo', # from_string expects regular expression patterns
... 'ninf': 'oo',
... 'conv': lambda v: int(v[1:1]),
... }
>>> P.from_string(s, **params)
(0,1]  [2,+inf)
When a bound contains a comma or has a representation that cannot be automatically parsed with from_string
,
the bound
parameter can be used to specify the regular expression that should be used to match its representation.
>>> s = '[(0, 1), (2, 3)]' # Bounds are expected to be tuples
>>> P.from_string(s, conv=eval, bound=r'\(.+?\)')
[(0, 1),(2, 3)]
Import & export intervals to Python builtin data types
Intervals can also be exported to a list of 4uples with to_data
, e.g., to support JSON serialization.
P.CLOSED
and P.OPEN
are represented by Boolean values True
(inclusive) and False
(exclusive).
>>> P.to_data(P.openclosed(0, 2))
[(False, 0, 2, True)]
The values used to represent positive and negative infinities can be specified with
pinf
and ninf
. They default to float('inf')
and float('inf')
respectively.
>>> x = P.openclosed(0, 1)  P.closedopen(2, P.inf)
>>> P.to_data(x)
[(False, 0, 1, True), (True, 2, inf, False)]
The function to convert bounds can be specified with the conv
parameter.
>>> x = P.closedopen(datetime.date(2011, 3, 15), datetime.date(2013, 10, 10))
>>> P.to_data(x, conv=lambda v: (v.year, v.month, v.day))
[(True, (2011, 3, 15), (2013, 10, 10), False)]
Intervals can be imported from such a list of 4tuples with from_data
.
The same set of parameters can be used to specify how bounds and infinities are converted.
>>> x = [(True, (2011, 3, 15), (2013, 10, 10), False)]
>>> P.from_data(x, conv=lambda v: datetime.date(*v))
[datetime.date(2011, 3, 15),datetime.date(2013, 10, 10))
Changelog
This library adheres to a semantic versioning scheme. See CHANGELOG.md for the list of changes.
Contributions
Contributions are very welcome! Feel free to report bugs or suggest new features using GitHub issues and/or pull requests.
License
Distributed under LGPLv3  GNU Lesser General Public License, version 3.
You can refer to this library using:
@software{portion,
author = {Decan, Alexandre},
title = {portion: Python data structure and operations for intervals},
url = {https://github.com/AlexandreDecan/portion},
}
Project details
Release history Release notifications  RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.