Editable interval tree data structure for Python 2 and 3
Project description
intervaltree
A mutable, selfbalancing interval tree for Python 2 and 3. Queries may be by point, by range overlap, or by range envelopment.
This library was designed to allow tagging text and time intervals, where the intervals include the lower bound but not the upper bound.
Version 3 changes!
 The
search(begin, end, strict)
method no longer exists. Instead, use one of these:at(point)
overlap(begin, end)
envelop(begin, end)
 The
extend(items)
method no longer exists. Instead, useupdate(items)
.  Methods like
merge_overlaps()
which took astrict
argument consistently default tostrict=True
. Before, some methods defaulted toTrue
and others toFalse
.
Installing
pip install intervaltree
Features

Supports Python 2.7 and Python 3.5+ (Tested under 2.7, and 3.5 thru 3.8)

Initializing
 blank
tree = IntervalTree()
 from an iterable of
Interval
objects (tree = IntervalTree(intervals)
)  from an iterable of tuples (
tree = IntervalTree.from_tuples(interval_tuples)
)
 blank

Insertions
tree[begin:end] = data
tree.add(interval)
tree.addi(begin, end, data)

Deletions
tree.remove(interval)
(raisesValueError
if not present)tree.discard(interval)
(quiet if not present)tree.removei(begin, end, data)
(short fortree.remove(Interval(begin, end, data))
)tree.discardi(begin, end, data)
(short fortree.discard(Interval(begin, end, data))
)tree.remove_overlap(point)
tree.remove_overlap(begin, end)
(removes all overlapping the range)tree.remove_envelop(begin, end)
(removes all enveloped in the range)

Point queries
tree[point]
tree.at(point)
(same as previous)

Overlap queries
tree[begin:end]
tree.overlap(begin, end)
(same as previous)

Envelop queries
tree.envelop(begin, end)

Membership queries
interval_obj in tree
(this is fastest, O(1))tree.containsi(begin, end, data)
tree.overlaps(point)
tree.overlaps(begin, end)

Iterable
for interval_obj in tree:
tree.items()

Sizing
len(tree)
tree.is_empty()
not tree
tree.begin()
(thebegin
coordinate of the leftmost interval)tree.end()
(theend
coordinate of the rightmost interval)

Setlike operations

union
result_tree = tree.union(iterable)
result_tree = tree1  tree2
tree.update(iterable)
tree = other_tree

difference
result_tree = tree.difference(iterable)
result_tree = tree1  tree2
tree.difference_update(iterable)
tree = other_tree

intersection
result_tree = tree.intersection(iterable)
result_tree = tree1 & tree2
tree.intersection_update(iterable)
tree &= other_tree

symmetric difference
result_tree = tree.symmetric_difference(iterable)
result_tree = tree1 ^ tree2
tree.symmetric_difference_update(iterable)
tree ^= other_tree

comparison
tree1.issubset(tree2)
ortree1 <= tree2
tree1 <= tree2
tree1.issuperset(tree2)
ortree1 > tree2
tree1 >= tree2
tree1 == tree2


Restructuring
chop(begin, end)
(slice intervals and remove everything betweenbegin
andend
, optionally modifying the data fields of the choppedup intervals)slice(point)
(slice intervals atpoint
)split_overlaps()
(slice at all interval boundaries, optionally modifying the data field)merge_overlaps()
(joins overlapping intervals into a single interval, optionally merging the data fields)merge_equals()
(joins intervals with matching ranges into a single interval, optionally merging the data fields)

Copying and typecasting
IntervalTree(tree)
(Interval
objects are same as those in tree)tree.copy()
(Interval
objects are shallow copies of those in tree)set(tree)
(can later be fed intoIntervalTree()
)list(tree)
(ditto)

Picklefriendly

Automatic AVL balancing
Examples

Getting started
>>> from intervaltree import Interval, IntervalTree >>> t = IntervalTree() >>> t IntervalTree()

Adding intervals  any object works!
>>> t[1:2] = "12" >>> t[4:7] = (4, 7) >>> t[5:9] = {5: 9}

Query by point
The result of a query is a
set
object, so if ordering is important, you must sort it first.>>> sorted(t[6]) [Interval(4, 7, (4, 7)), Interval(5, 9, {5: 9})] >>> sorted(t[6])[0] Interval(4, 7, (4, 7))

Query by range
Note that ranges are inclusive of the lower limit, but noninclusive of the upper limit. So:
>>> sorted(t[2:4]) []
Since our search was over
2 ≤ x < 4
, neitherInterval(1, 2)
norInterval(4, 7)
was included. The first interval,1 ≤ x < 2
does not includex = 2
. The second interval,4 ≤ x < 7
, does includex = 4
, but our search interval excludes it. So, there were no overlapping intervals. However:>>> sorted(t[1:5]) [Interval(1, 2, '12'), Interval(4, 7, (4, 7))]
To only return intervals that are completely enveloped by the search range:
>>> sorted(t.envelop(1, 5)) [Interval(1, 2, '12')]

Accessing an
Interval
object>>> iv = Interval(4, 7, (4, 7)) >>> iv.begin 4 >>> iv.end 7 >>> iv.data (4, 7) >>> begin, end, data = iv >>> begin 4 >>> end 7 >>> data (4, 7)

Constructing from lists of intervals
We could have made a similar tree this way:
>>> ivs = [(1, 2), (4, 7), (5, 9)] >>> t = IntervalTree( ... Interval(begin, end, "%d%d" % (begin, end)) for begin, end in ivs ... )
Or, if we don't need the data fields:
>>> t2 = IntervalTree(Interval(*iv) for iv in ivs)
Or even:
>>> t2 = IntervalTree.from_tuples(ivs)

Removing intervals
>>> t.remove(Interval(1, 2, "12")) >>> sorted(t) [Interval(4, 7, '47'), Interval(5, 9, '59')] >>> t.remove(Interval(500, 1000, "Doesn't exist")) # raises ValueError Traceback (most recent call last): ValueError >>> t.discard(Interval(500, 1000, "Doesn't exist")) # quietly does nothing >>> del t[5] # same as t.remove_overlap(5) >>> t IntervalTree()
We could also empty a tree entirely:
>>> t2.clear() >>> t2 IntervalTree()
Or remove intervals that overlap a range:
>>> t = IntervalTree([ ... Interval(0, 10), ... Interval(10, 20), ... Interval(20, 30), ... Interval(30, 40)]) >>> t.remove_overlap(25, 35) >>> sorted(t) [Interval(0, 10), Interval(10, 20)]
We can also remove only those intervals completely enveloped in a range:
>>> t.remove_envelop(5, 20) >>> sorted(t) [Interval(0, 10)]

Chopping
We could also chop out parts of the tree:
>>> t = IntervalTree([Interval(0, 10)]) >>> t.chop(3, 7) >>> sorted(t) [Interval(0, 3), Interval(7, 10)]
To modify the new intervals' data fields based on which side of the interval is being chopped:
>>> def datafunc(iv, islower): ... oldlimit = iv[islower] ... return "oldlimit: {0}, islower: {1}".format(oldlimit, islower) >>> t = IntervalTree([Interval(0, 10)]) >>> t.chop(3, 7, datafunc) >>> sorted(t)[0] Interval(0, 3, 'oldlimit: 10, islower: True') >>> sorted(t)[1] Interval(7, 10, 'oldlimit: 0, islower: False')

Slicing
You can also slice intervals in the tree without removing them:
>>> t = IntervalTree([Interval(0, 10), Interval(5, 15)]) >>> t.slice(3) >>> sorted(t) [Interval(0, 3), Interval(3, 10), Interval(5, 15)]
You can also set the data fields, for example, reusing
datafunc()
from above:>>> t = IntervalTree([Interval(5, 15)]) >>> t.slice(10, datafunc) >>> sorted(t)[0] Interval(5, 10, 'oldlimit: 15, islower: True') >>> sorted(t)[1] Interval(10, 15, 'oldlimit: 5, islower: False')
Future improvements
See the issue tracker on GitHub.
Based on
 Eternally Confuzzled's AVL tree
 Wikipedia's Interval Tree
 Heavily modified from Tyler Kahn's Interval Tree implementation in Python (GitHub project)
 Incorporates contributions from:
 konstantint/Konstantin Tretyakov of the University of Tartu (Estonia)
 siniG/Avi Gabay
 lmcarril/Luis M. Carril of the Karlsruhe Institute for Technology (Germany)
 depristo/MarkDePristo
Copyright
 Chaim Leib Halbert, 20132020
 Modifications, Konstantin Tretyakov, 2014
Licensed under the Apache License, version 2.0.
The source code for this project is at https://github.com/chaimleib/intervaltree
Project details
Release history Release notifications  RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.