Relation tools for Python.

reltools relates two data sets, each sorted by its join key, in the manner of an SQL join.

Inspired by itertools.groupby, reltools evaluates almost every step lazily as long as the input data are sorted, which reduces memory usage. This makes it suitable for joining large data sets without an SQL engine.
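The lazy behaviour that reltools borrows from itertools.groupby can be seen with the standard library alone. The following is a minimal sketch, not reltools code: the generator sorted_rows and the consumed list are hypothetical names introduced only to record how many rows have been read.

```python
import itertools
import operator

consumed = []  # records every row actually pulled from the source


def sorted_rows():
    """Yield rows sorted by their first item, logging each one as it is read."""
    for row in [(1, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]:
        consumed.append(row)
        yield row


groups = itertools.groupby(sorted_rows(), key=operator.itemgetter(0))
rows_read_before = len(consumed)  # groupby has not pulled anything yet

key, group = next(groups)
first_group = list(group)
rows_read_after = len(consumed)  # the first group plus one look-ahead row
```

Only the rows of the first group, plus the one row needed to detect the group boundary, have been read at this point. The rest of the input stays untouched until it is requested.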

Installation

Install with pip.

pip install reltools

Samples

One-To-Many

Here is a sample of a one-to-many relation using relate_one_to_many. Both input collections are sorted by their 1st and 2nd items.

>>> lhs = [
...     (1, 'a', 's'),
...     (2, 'a', 't'),
...     (3, 'b', 'u'),
... ]
>>> rhs = [
...     (1, 'a', 'v'),
...     (1, 'b', 'w'),
...     (3, 'b', 'x'),
... ]
>>> from reltools import relate_one_to_many
>>> one_to_many_related = relate_one_to_many(lhs, rhs)
>>> for left, right in one_to_many_related:
...     left, list(right)
((1, 'a', 's'), [(1, 'a', 'v'), (1, 'b', 'w')])
((2, 'a', 't'), [])
((3, 'b', 'u'), [(3, 'b', 'x')])

You can pass custom keys to all API functions. Here, rows are matched on their first two items.

>>> import operator
>>> custom_key = operator.itemgetter(0, 1)
>>> one_to_many_related = relate_one_to_many(
...     lhs, rhs, lhs_key=custom_key, rhs_key=custom_key)
>>> for left, right in one_to_many_related:
...     left, list(right)
((1, 'a', 's'), [(1, 'a', 'v')])
((2, 'a', 't'), [])
((3, 'b', 'u'), [(3, 'b', 'x')])
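The samples above assume that the inputs are already sorted by the join key. When they are not, they have to be sorted first. A minimal sketch using only the standard library follows; unsorted_rows and is_sorted are hypothetical names, and sorted() loads everything into memory, so truly large inputs would need to be sorted some other way before joining.

```python
import operator

# Hypothetical input that arrives in arbitrary order.
unsorted_rows = [(3, 'b', 'x'), (1, 'b', 'w'), (1, 'a', 'v')]

# Sort by the same key that will later be used for relating.
by_first_two = operator.itemgetter(0, 1)
rhs = sorted(unsorted_rows, key=by_first_two)


def is_sorted(rows, key):
    """Return True if rows are in non-decreasing order of key."""
    keys = [key(row) for row in rows]
    return all(a <= b for a, b in zip(keys, keys[1:]))


was_sorted = is_sorted(unsorted_rows, by_first_two)
now_sorted = is_sorted(rhs, by_first_two)
```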

Left Outer Join

Here is a sample of an SQL-style left outer join using left_join. Whereas an SQL left join returns every combination of matching rows, left_join returns one pair of item groups per key. Note that the right group can be empty, as in an SQL left join.

>>> from reltools import left_join
>>> lhs = [(1, 'a'), (1, 'b'), (2, 'c'), (4, 'd')]
>>> rhs = [(1, 's'), (1, 't'), (3, 'u'), (4, 'v')]
>>> relations = left_join(lhs, rhs)
>>> for left, right in relations:
...     list(left), list(right)
([(1, 'a'), (1, 'b')], [(1, 's'), (1, 't')])
([(2, 'c')], [])
([(4, 'd')], [(4, 'v')])

Right Outer Join

Right outer join is not provided because it is a left join with the two sides swapped. Use left_join(rhs, lhs, rhs_key, lhs_key) instead.
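The argument swap can be illustrated without reltools itself. The following is a minimal sketch of a sorted-merge left join built on itertools.groupby. The function sketch_left_join is hypothetical and is not the library's implementation: it assumes both inputs are sorted by key, defaults to the first item as the key, and returns lists rather than lazy groups.

```python
import itertools
import operator

_first = operator.itemgetter(0)
_END = object()  # sentinel marking an exhausted right-hand side


def sketch_left_join(lhs, rhs, lhs_key=_first, rhs_key=_first):
    """Hypothetical merge-style left join over inputs sorted by key."""
    rhs_groups = itertools.groupby(rhs, key=rhs_key)
    current = next(rhs_groups, _END)
    for l_key, l_rows in itertools.groupby(lhs, key=lhs_key):
        # Skip right groups whose key is smaller than the current left key.
        while current is not _END and current[0] < l_key:
            current = next(rhs_groups, _END)
        if current is not _END and current[0] == l_key:
            matched = list(current[1])
            current = next(rhs_groups, _END)
        else:
            matched = []  # no right rows for this key
        yield list(l_rows), matched


lhs = [(1, 'a'), (1, 'b'), (2, 'c'), (4, 'd')]
rhs = [(1, 's'), (1, 't'), (3, 'u'), (4, 'v')]

left_joined = list(sketch_left_join(lhs, rhs))

# Right join: swap the inputs, then swap each pair back to (left, right).
right_joined = [(left, right) for right, left in sketch_left_join(rhs, lhs)]
```

With the sides swapped, the key that exists only in rhs (3) is preserved and the key that exists only in lhs (2) is dropped, which is what a right outer join is expected to do.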

Full Outer Join

Full outer join is provided by outer_join. In contrast to left_join, a full outer join also preserves keys that appear only in rhs.

>>> from reltools import outer_join
>>> lhs = [(1, 'a'), (1, 'b'), (2, 'c'), (4, 'd')]
>>> rhs = [(1, 's'), (1, 't'), (3, 'u'), (4, 'v')]
>>> relations = outer_join(lhs, rhs)
>>> for left, right in relations:
...     list(left), list(right)
([(1, 'a'), (1, 'b')], [(1, 's'), (1, 't')])
([(2, 'c')], [])
([], [(3, 'u')])
([(4, 'd')], [(4, 'v')])

Inner Join

Here is a sample of an SQL-style inner join using inner_join. In contrast to left_join, the right group is never empty, as in an SQL inner join.

>>> from reltools import inner_join
>>> relations = inner_join(lhs, rhs)
>>> for left, right in relations:
...     list(left), list(right)
([(1, 'a'), (1, 'b')], [(1, 's'), (1, 't')])
([(4, 'd')], [(4, 'v')])

Many-To-Many

SQL-like many-to-many relations through an intermediate table are not supported, because reltools handles only sorted data and avoids random access. To relate data many-to-many, denormalize the data during preprocessing and then use an outer or inner join.

License

MIT License

Copyright (c) 2018 Yu MOCHIZUKI
