Language Integrated Query for Python

Install

pip install -U Linq

Additional: Some magic here: Mix Linq with Pipe

Here is an example that finds the 10 most frequent pixel values in a picture.

from linq import Flow
import numpy as np

def most_frequent(arr: np.ndarray) -> np.ndarray:
    return  Flow(arr.flatten())                     \
                    .GroupBy(lambda _: _)           \
                    .Then(lambda x: x.items())      \
                    .Map(lambda k, v: (k, len(v)))  \
                    .Sorted(by=lambda _, v: -v)     \
                    .Take(10)                       \
                    .Map(lambda k, _: k)            \
                    .ToList()                       \
                    .Then(np.array).Unboxed()
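
To sanity-check what the pipeline above computes, here is a minimal plain-Python sketch of the same idea using only the standard library. The name most_frequent_plain is made up for this illustration and is not part of Linq; it assumes the values are hashable (for example, the integers of a flattened image).

```python
from collections import Counter


def most_frequent_plain(values, n=10):
    """Return the n most common values, most frequent first."""
    # Counter tallies how often each value occurs.
    counts = Counter(values)
    # most_common(n) yields (value, count) pairs sorted by descending count.
    return [value for value, _ in counts.most_common(n)]


print(most_frequent_plain([3, 1, 3, 2, 3, 1], n=2))
```

For a NumPy image, arr.flatten().tolist() could be passed as values.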

About Linq

The well-known EDSL in .NET, Language Integrated Query, is in my opinion one of the best designs in the .NET environment.
Here is an example of C# Linq.
// Calculate MSE loss.
/// <param name="Prediction"> the prediction of the neural network</param>
/// <param name="Expected"> the expected target of the neural network</param>

Prediction.Zip(Expected, (pred, expected) => Math.Pow(pred - expected, 2)).Average()

It’s so human readable and it doesn’t cost much.
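
For readers who do not know C#, a rough plain-Python counterpart of the C# expression above might look like the sketch below. The function name mse is chosen only for this illustration, and both inputs are assumed to be equal-length sequences of numbers.

```python
from statistics import mean


def mse(prediction, expected):
    """Mean squared error of two equal-length sequences of numbers."""
    # zip pairs the elements up, the generator squares each difference,
    # and mean averages the squared differences.
    return mean((p - e) ** 2 for p, e in zip(prediction, expected))


print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))
```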

Many common tasks are awkward to express in Python, and Linq can help a lot with them.

Awkward Scenarios in Python

seq1 = range(100)
seq2 = range(100, 200)
zipped = zip(seq1, seq2)
mapped = map(lambda ab: ab[0] / ab[1], zipped)
grouped = dict()
group_fn = lambda x: x // 0.2
for e in mapped:
    group_id = group_fn(e)
    if group_id not in grouped:
        grouped[group_id] = [e]
        continue
    grouped[group_id].append(e)
for e in grouped.items():
    print(e)

The code seems too long…

Now we extract the function group_by:

def group_by(f, container):
    grouped = dict()
    for e in container:
        group_id = f(e)
        if group_id not in grouped:
            grouped[group_id] = [e]
            continue
        grouped[group_id].append(e)
    return grouped
res = group_by(lambda x: x // 0.2, map(lambda ab: ab[0] / ab[1], zip(seq1, seq2)))

Okay, it works, but it annoys me: why do I have to write such ugly code?

Now, let us try Linq!

from linq import Flow, extension_std
seq = Flow(range(100))
res = seq.Zip(range(100, 200)).Map(lambda fst, snd : fst/snd).GroupBy(lambda num: num//0.2).Unboxed()

How does Linq.py work?

There is a core class, linq.core.flow.Flow, which has just one member, stream.
When you request an extension method from a Flow object, the type of its stream member is used to look up whether that extension method exists.
In other words, extension methods are bound to the type (more precisely, to the name {type.__module__}.{type.__name__}).
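from functools import partial

# `Extension` is the library's registry, indexed by '{module}.{name}';
# its definition is not shown here.

class Flow:
    __slots__ = ['stream']

    def __init__(self, sequence):
        self.stream = sequence

    def __getattr__(self, k):
        # Walk the MRO of the wrapped value so that methods registered
        # for a base class are also found for its subclasses.
        for cls in self.stream.__class__.__mro__:
            namespace = Extension['{}.{}'.format(cls.__module__, cls.__name__)]
            if k in namespace:
                return partial(namespace[k], self)
        # Report the type of the wrapped value in the error message.
        stream_type = self.stream.__class__
        raise NameError(
            "No extension method named `{}` for {}.".format(
                k, '{}.{}'.format(stream_type.__module__, stream_type.__name__)))

    def __str__(self):
        return self.stream.__str__()

    def __repr__(self):
        return self.__str__()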
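
The lookup described above can be tried out with a small self-contained sketch. Everything here is hypothetical and written only for illustration: MiniFlow stands in for Flow, and REGISTRY is a plain dict standing in for the library's Extension registry, whose real structure is not shown in this document.

```python
from functools import partial

# Hypothetical registry for this sketch: "module.name" -> {method name: function}.
REGISTRY = {
    "builtins.list": {"Total": lambda flow: sum(flow.stream)},
    "builtins.int": {"Double": lambda flow: flow.stream * 2},
    "builtins.object": {"Unboxed": lambda flow: flow.stream},
}


class MiniFlow:
    """Toy stand-in for Flow, written only to illustrate the lookup."""

    __slots__ = ["stream"]

    def __init__(self, sequence):
        self.stream = sequence

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails,
        # so `stream` itself never reaches this method once it is set.
        for cls in type(self.stream).__mro__:
            key = "{}.{}".format(cls.__module__, cls.__name__)
            namespace = REGISTRY.get(key, {})
            if name in namespace:
                # Bind the wrapper as the first argument, like a method.
                return partial(namespace[name], self)
        raise AttributeError(
            "No extension method named `{}` for {}.".format(
                name, type(self.stream).__name__))


print(MiniFlow([1, 2, 3]).Total())   # found under builtins.list
print(MiniFlow(True).Double())       # bool inherits from int: found under builtins.int
print(MiniFlow("abc").Unboxed())     # falls through to builtins.object
```

Because the search follows the MRO, a method registered for a base type is also available for its subclasses, and the most specific registration wins.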

Extension Method

There are three ways to define an extension method.

  • Firstly, you can use extension_std to add extension methods for all Flow objects.

  • Next, you can use extension_class(cls: type) to add extension methods for all Flow objects whose stream member has the type named {cls.__module__}.{cls.__name__}.

  • Finally, you can use extension_class_name(cls_name: str, of_module='builtins') to add extension methods for all Flow objects whose stream member has the type named {of_module}.{cls_name}.

(This last form is meant for Python's implicit types, which cannot be referenced by name and can only be obtained from an instance's __class__ attribute.)

@extension_std  # For all Flow objects
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))

@extension_class(int)  # Just for type `int`
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))

@extension_class_name('int', of_module=int.__module__)  # Also for type `int`
def Add(self, i):
    return Flow(self.stream + (i.stream if isinstance(i, Flow) else i))
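
To make the difference between the three decorators above more concrete, here is a toy re-creation of the registration idea. This is a sketch under stated assumptions, not the library's implementation: ext_std, ext_class, ext_class_name, REGISTRY and lookup are all hypothetical names, and modelling "all Flow objects" by registering under builtins.object is a choice made only for this sketch.

```python
# Hypothetical registry for this sketch: "module.name" -> {method name: function}.
REGISTRY = {}


def _register(key, func):
    REGISTRY.setdefault(key, {})[func.__name__] = func
    return func


def ext_std(func):
    # Assumption of this sketch: every MRO ends in builtins.object,
    # so registering there makes the method reachable for any value.
    return _register("builtins.object", func)


def ext_class(cls):
    def decorator(func):
        return _register("{}.{}".format(cls.__module__, cls.__name__), func)
    return decorator


def ext_class_name(cls_name, of_module="builtins"):
    # Useful for types that have no importable name, such as dict_keys.
    def decorator(func):
        return _register("{}.{}".format(of_module, cls_name), func)
    return decorator


def lookup(value, name):
    """Find the most specific function registered for type(value)."""
    for cls in type(value).__mro__:
        namespace = REGISTRY.get("{}.{}".format(cls.__module__, cls.__name__), {})
        if name in namespace:
            return namespace[name]
    raise AttributeError(name)


@ext_std
def Describe(value):
    return "some object"


@ext_class(int)
def Describe(value):
    return "an int"


@ext_class_name("dict_keys")
def Describe(value):
    return "a dict_keys view"


for sample in (3, "text", {}.keys()):
    print(lookup(sample, "Describe")(sample))
```

The dict_keys case shows why a name-based decorator is handy: that type can be obtained only from an instance, yet it still has a module and a name to register under.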

Documentation for Standard Extension Methods

Note: Docs haven’t been finished yet.

How to Contribute

Feel free to open pull requests here.
