
Language Integrated Query for Python



Install

pip install -U Linq

Additional: Some magic here: Mix Linq with Pipe

Here is an example that gets the 10 most frequent pixel values in a picture.

from linq import Flow
import numpy as np

def most_frequent(arr: np.ndarray) -> np.ndarray:
    return  Flow(arr.flatten())                     \
                    .GroupBy(lambda _: _)           \
                    .Then(lambda x: x.items())      \
                    .Map(lambda k, v: (k, len(v)))  \
                    .Sorted(by=lambda _, v: -v)     \
                    .Take(10)                       \
                    .Map(lambda k, _: k)            \
                    .ToList()                       \
                    .Then(np.array).Unboxed()
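
For comparison, the same idea can be sketched with only the standard library. This is not part of Linq: `most_frequent_plain` is a name made up for this illustration, it takes any iterable of hashable values (for a NumPy array, pass `arr.flatten().tolist()`), and its ordering of equally frequent values may differ from the Flow version.

```python
from collections import Counter


def most_frequent_plain(values, n=10):
    # Count each distinct value, then keep the n values with the highest counts.
    counts = Counter(values)
    return [value for value, _ in counts.most_common(n)]


pixels = [1, 1, 2, 3, 1, 2]
top = most_frequent_plain(pixels, n=2)
```

The Flow pipeline expresses the same steps (group, count, sort, take) as one chain instead of a helper function.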

About Linq

Language Integrated Query (LINQ), the well-known EDSL in .NET, is in my opinion one of the best designs in the .NET environment.
Here is an example of C# Linq.
// Calculate MSE loss.
/// <param name="Prediction"> the prediction of the neural network</param>
/// <param name="Expected"> the expected target of the neural network</param>

Prediction.Zip(Expected, (pred, expected) => Math.Pow(pred - expected, 2)).Average()

It is very readable, and it costs little to write.

Many everyday tasks are awkward to write in plain Python, and Linq can help a lot with them.

Awkward Scenarios in Python

seq1 = range(100)
seq2 = range(100, 200)
zipped = zip(seq1, seq2)
mapped = map(lambda ab: ab[0] / ab[1], zipped)
grouped = dict()
group_fn = lambda x: x // 0.2
for e in mapped:
    group_id = group_fn(e)
    if group_id not in grouped:
        grouped[group_id] = [e]
        continue
    grouped[group_id].append(e)
for e in grouped.items():
    print(e)

The code seems too long…

Now we extract the function group_by:

def group_by(f, container):
    grouped = dict()
    for e in container:
        group_id = f(e)
        if group_id not in grouped:
            grouped[group_id] = [e]
            continue
        grouped[group_id].append(e)
    return grouped
res = group_by(lambda x: x // 0.2, map(lambda ab: ab[0] / ab[1], zip(seq1, seq2)))

Okay, nothing is wrong with it, but it still bothers me: why do I have to write such ugly code?

Now, let us try Linq!

from linq import Flow, extension_std
seq = Flow(range(100))
res = seq.Zip(range(100, 200)).Map(lambda fst, snd : fst/snd).GroupBy(lambda num: num//0.2).Unboxed()

How does Linq.py work?

There is a core class, linq.core.flow.Flow, which has a single member, stream.
When you request an extension method from a Flow object, the type of its stream member is used to look up whether that extension method exists. The lookup walks that type's method resolution order, so methods registered for a base class are found as well.
In other words, extension methods are bound to the type (precisely, to the name {type.__module__}.{type.__name__}).
class Flow:
    __slots__ = ['stream']

    def __init__(self, sequence):
        self.stream = sequence

    def __getattr__(self, k):
        for cls in self.stream.__class__.__mro__:
            namespace = Extension['{}.{}'.format(cls.__module__, cls.__name__)]
            if k in namespace:
                return partial(namespace[k], self)
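        # Report the actual type of the stream, not the builtin `object`.
        stream_type = type(self.stream)
        raise NameError(
            "No extension method named `{}` for {}.".format(
                k, '{}.{}'.format(stream_type.__module__, stream_type.__name__)))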

    def __str__(self):
        return self.stream.__str__()

    def __repr__(self):
        return self.__str__()
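
The lookup described above can be tried in isolation with a small self-contained sketch. This is not the library's code: `MiniFlow`, `REGISTRY`, `type_key`, and the two registered methods are hypothetical stand-ins, and the registry is assumed to be a plain dict keyed by `module.TypeName`.

```python
from functools import partial


def type_key(cls):
    # Build the "module.TypeName" string used as the registry key.
    return '{}.{}'.format(cls.__module__, cls.__name__)


# Hypothetical registry: "module.TypeName" -> {method name: function}.
REGISTRY = {
    'builtins.object': {'Then': lambda self, f: MiniFlow(f(self.stream))},
    'builtins.list': {'Head': lambda self: self.stream[0]},
}


class MiniFlow:
    __slots__ = ['stream']

    def __init__(self, sequence):
        self.stream = sequence

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so `stream` is unaffected.
        # Walk the MRO of the wrapped value's type, most specific class first.
        for cls in type(self.stream).__mro__:
            namespace = REGISTRY.get(type_key(cls), {})
            if name in namespace:
                return partial(namespace[name], self)
        raise AttributeError(name)


flow = MiniFlow([3, 1, 2])
head = flow.Head()             # registered for builtins.list
total = flow.Then(sum).stream  # registered for builtins.object, found via the MRO
```

Because `Head` is registered only for lists in this sketch, `MiniFlow("abc").Head` raises an error, while `Then` is available for any wrapped value.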

Extension Method

There are three ways to add an extension method.

  • Firstly, you can use extension_std to add extension methods for all Flow objects.

  • Next, you can use extension_class(cls: type) to add extension methods for all Flow objects whose stream member's type is named {cls.__module__}.{cls.__name__}.

  • Finally, you can use extension_class_name(cls_name: str, of_module='builtins') to add extension methods for all Flow objects whose stream member's type is named {of_module}.{cls_name}.

(The last way exists for Python's implicit types, which can only be obtained through the __class__ attribute of their instances.)

@extension_std  # For all Flow objects
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))

@extension_class(int) # Just for type `int`
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))

@extension_class_name('int',  of_module=int.__module__) # Also for type `int`.
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))
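
To see why registering by name is useful for implicit types, the following sketch prints nothing and uses only the standard library. It is an illustration under assumptions, not Linq's implementation: `REGISTRY`, `type_key`, `register_by_name`, and `Keys` are hypothetical names that only mimic the idea of a name-based decorator.

```python
def type_key(cls):
    # Build the "module.TypeName" string for a type.
    return '{}.{}'.format(cls.__module__, cls.__name__)


# These types have no importable builtin name; they are reached through instances.
items_key = type_key(type({}.items()))
generator_key = type_key(type(x for x in ()))

# Hypothetical registry and name-based decorator.
REGISTRY = {}


def register_by_name(cls_name, of_module='builtins'):
    def decorator(func):
        key = '{}.{}'.format(of_module, cls_name)
        REGISTRY.setdefault(key, {})[func.__name__] = func
        return func
    return decorator


@register_by_name('dict_items')
def Keys(items):
    # Collect the key of every (key, value) pair.
    return [k for k, _ in items]


pairs = {'a': 1, 'b': 2}.items()
keys = REGISTRY[type_key(type(pairs))]['Keys'](pairs)
```

Here the function is registered under the string `builtins.dict_items` without ever referring to the type object itself, which is the situation the name-based decorator is meant for.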

Documents of Standard Extension Methods

Note: Docs haven’t been finished yet.

How to Contribute

Feel free to open pull requests here.
