
Language Integrated Query for Python


Install

pip install -U Linq

Additional: Some magic here: Mix Linq with Pipe

Here is an example that finds the 10 most frequent pixel values in a picture.

from linq import Flow
import numpy as np

def most_frequent(arr: np.ndarray) -> np.ndarray:
    return  Flow(arr.flatten())                     \
                    .GroupBy(lambda _: _)           \
                    .Then(lambda x: x.items())      \
                    .Map(lambda k, v: (k, len(v)))  \
                    .Sorted(by=lambda _, v: -v)     \
                    .Take(10)                       \
                    .Map(lambda k, _: k)            \
                    .ToList()                       \
                    .Then(np.array).Unboxed()
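
For comparison, the same computation can be sketched with the standard library alone. This is only an illustration of what the chain above computes, not part of Linq; the name most_frequent_plain is made up for this example, and it works on any iterable of hashable values (for a NumPy array, something like arr.flatten().tolist() could be passed in).

```python
from collections import Counter
from typing import Hashable, Iterable, List


def most_frequent_plain(values: Iterable[Hashable], n: int = 10) -> List[Hashable]:
    """Return up to n distinct values, most frequent first (hypothetical helper)."""
    counts = Counter(values)  # value -> number of occurrences
    # most_common(n) yields (value, count) pairs sorted by descending count.
    return [value for value, _ in counts.most_common(n)]


pixels = [0, 255, 255, 17, 255, 0, 3]
print(most_frequent_plain(pixels, n=2))  # [255, 0]
```

The Flow version expresses the same group, count, sort and take steps as one left-to-right chain.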

About Linq

Language Integrated Query (LINQ), the well-known EDSL in .NET, is in my opinion one of the best designs in the .NET environment.
Here is an example of LINQ in C#.
// Calculate the MSE loss.
/// <param name="Prediction">the prediction of the neural network</param>
/// <param name="Expected">the expected target of the neural network</param>

Prediction.Zip(Expected, (pred, expected) => Math.Pow(pred - expected, 2)).Average()
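
As a point of reference, here is a rough plain-Python counterpart of the C# expression above. It is a minimal sketch using only built-ins; the function name mse and the length check are choices made for this example, not something taken from Linq or from the C# code.

```python
from typing import Sequence


def mse(prediction: Sequence[float], expected: Sequence[float]) -> float:
    """Mean of the squared element-wise differences."""
    if len(prediction) != len(expected) or not prediction:
        raise ValueError("inputs must be non-empty and of equal length")
    # zip pairs the elements, like Zip in the C# version.
    squared = [(p - e) ** 2 for p, e in zip(prediction, expected)]
    return sum(squared) / len(squared)


print(mse([1.0, 2.0], [0.0, 4.0]))  # 2.5
```

Unlike Zip, which silently stops at the shorter sequence, this sketch rejects inputs of different lengths.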

It’s very readable, and it doesn’t cost much.

Many situations are awkward for Python programmers, and Linq might help a lot there.

Awkward Scenarios in Python

seq1 = range(100)
seq2 = range(100, 200)
zipped = zip(seq1, seq2)
mapped = map(lambda ab: ab[0] / ab[1], zipped)
grouped = dict()
group_fn = lambda x: x // 0.2
for e in mapped:
    group_id = group_fn(e)
    if group_id not in grouped:
        grouped[group_id] = [e]
        continue
    grouped[group_id].append(e)
for e in grouped.items():
    print(e)

This code seems too long…

Now we extract the function group_by:

def group_by(f, container):
    grouped = dict()
    for e in container:
        group_id = f(e)
        if group_id not in grouped:
            grouped[group_id] = [e]
            continue
        grouped[group_id].append(e)
    return grouped
res = group_by(lambda x: x // 0.2, map(lambda ab: ab[0] / ab[1], zip(seq1, seq2)))

Okay, there is nothing wrong with it, but it still bothers me: why do I have to write such ugly code?

Now, let us try Linq!

from linq import Flow, extension_std
seq = Flow(range(100))
res = seq.Zip(range(100, 200)).Map(lambda fst, snd : fst/snd).GroupBy(lambda num: num//0.2).Unboxed()
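
To check what this chain computes without installing anything, the same zip, map and group steps can be sketched with the standard library. This assumes nothing about Linq’s internals, and the exact container type that Unboxed() returns is not stated here; the sketch simply collects the groups in a defaultdict.

```python
from collections import defaultdict

seq1 = range(100)
seq2 = range(100, 200)

# group id -> list of ratios that fall into that bucket of width 0.2
grouped = defaultdict(list)
for fst, snd in zip(seq1, seq2):
    ratio = fst / snd
    grouped[ratio // 0.2].append(ratio)

for group_id, members in sorted(grouped.items()):
    print(group_id, len(members))
```

Every ratio lies between 0 and 0.5, so only the group ids 0.0, 1.0 and 2.0 appear.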

How does Linq.py work?

The core class is linq.core.flow.Flow, which has a single member, stream.
When you look up an extension method on a Flow object, the type of its stream member determines whether that method exists. The lookup walks the type’s method resolution order, so methods registered for a base class are found too.
In other words, extension methods are bound to the type (more precisely, to the name {type.__module__}.{type.__name__}).
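from functools import partial

class Flow:
    __slots__ = ['stream']

    def __init__(self, sequence):
        self.stream = sequence

    def __getattr__(self, k):
        # Extension maps '{module}.{name}' to a dict of extension methods.
        # Walk the MRO so that the most specific registered type wins.
        for cls in self.stream.__class__.__mro__:
            namespace = Extension['{}.{}'.format(cls.__module__, cls.__name__)]
            if k in namespace:
                return partial(namespace[k], self)
        stream_cls = self.stream.__class__
        raise NameError(
            "No extension method named `{}` for {}.".format(
                k, '{}.{}'.format(stream_cls.__module__, stream_cls.__name__)))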

    def __str__(self):
        return self.stream.__str__()

    def __repr__(self):
        return self.__str__()
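
The lookup can be tried out in isolation with a small self-contained sketch. Everything here is hypothetical and only mimics the idea: REGISTRY, type_key, register_for and MiniFlow are names invented for this example, registering under object to reach every stream type is an assumption about how a "for all objects" extension could work, and the sketch raises AttributeError instead of NameError so that hasattr behaves normally.

```python
from collections import defaultdict
from functools import partial

# Hypothetical registry: "module.ClassName" -> {method name -> function}.
REGISTRY = defaultdict(dict)


def type_key(cls):
    return "{}.{}".format(cls.__module__, cls.__name__)


def register_for(cls):
    """Hypothetical decorator: attach a function to streams of type `cls`."""
    def decorator(func):
        REGISTRY[type_key(cls)][func.__name__] = func
        return func
    return decorator


class MiniFlow:
    __slots__ = ["stream"]

    def __init__(self, sequence):
        self.stream = sequence

    def __getattr__(self, name):
        # Walk the MRO so the most specific registered type wins.
        for cls in type(self.stream).__mro__:
            namespace = REGISTRY.get(type_key(cls), {})
            if name in namespace:
                return partial(namespace[name], self)
        raise AttributeError(
            "No extension method named `{}` for {}.".format(
                name, type_key(type(self.stream))))


@register_for(object)  # every class has `object` at the end of its MRO
def Describe(self):
    return "generic {}".format(type(self.stream).__name__)


@register_for(list)  # more specific: found first when the stream is a list
def Describe(self):
    return "list of {} items".format(len(self.stream))


print(MiniFlow([1, 2, 3]).Describe())  # list of 3 items
print(MiniFlow((1, 2, 3)).Describe())  # generic tuple
```

Because list comes before object in the MRO of a list, the list-specific function shadows the generic one, while a tuple stream falls through to the generic version.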

Extension Method

There are three ways to add extension methods.

  • Firstly, you can use extension_std to add extension methods for all Flow objects.
  • Next, you can use extension_class(cls: type) to add extension methods for all Flow objects whose stream member’s type is named {cls.__module__}.{cls.__name__}.
  • Finally, you can use extension_class_name(cls_name: str, of_module='builtins') to add extension methods for all Flow objects whose stream member’s type is named {of_module}.{cls_name}.

(The last way is meant for Python’s implicit types, each of which can only be obtained from the __class__ attribute of one of its instances.)
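
To make the idea of implicit types concrete, the snippet below prints the {module}.{name} string for a few built-in types that have no importable name and can only be reached through an instance. The helper qualified_name is made up for this illustration; it just applies the naming rule described above.

```python
def qualified_name(obj):
    """Return '{module}.{name}' for the type of obj (hypothetical helper)."""
    cls = type(obj)
    return "{}.{}".format(cls.__module__, cls.__name__)


print(qualified_name({}.items()))       # builtins.dict_items
print(qualified_name(x for x in ()))    # builtins.generator
print(qualified_name(iter(range(3))))   # builtins.range_iterator
```

Under the naming rule above, a string such as 'dict_items' together with the module 'builtins' is what would identify such a type by name.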

@extension_std  # For all Flow objects
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))

@extension_class(int)  # Just for type `int`
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))

@extension_class_name('int', of_module=int.__module__)  # Also for type `int`
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))

Documents of Standard Extension Methods

Note: Docs haven’t been finished yet.

How to Contribute

Feel free to open pull requests.
