datalookup

Deep nested data filtering library

https://img.shields.io/badge/python-3.9-blue.svg

https://github.com/pyshare/datalookup/actions/workflows/tests.yml/badge.svg

https://github.com/pyshare/datalookup/actions/workflows/linters.yml/badge.svg

https://img.shields.io/badge/code%20style-black-000000.svg

The Datalookup library makes it easier to filter and manipulate your data. The module is inspired by the Django QuerySet API and its lookups.

Installation

$ pip install datalookup

Example

Throughout the examples below, we’ll refer to the following data, which comprises a list of authors and the books they wrote.

data = [
    {
        "id": 1,
        "author": "J. K. Rowling",
        "books": [
            {
                "name": "Harry Potter and the Chamber of Secrets",
                "genre": "Fantasy",
                "published": "1998"
            },
            {
                "name": "Harry Potter and the Prisoner of Azkaban",
                "genre": "Fantasy",
                "published": "1999"
            }
        ]
    },
    {
        "id": 2,
        "author": "Agatha Christie",
        "books": [
            {
                "name": "And Then There Were None",
                "genre": "Mystery",
                "published": "1939"
            }
        ]
    }
]

Datalookup makes it easy to find an author by calling one of the methods of the Dataset class, such as filter() or exclude(). There are multiple ways to retrieve an author.

Basic filtering

Use one of the fields of your author dictionary to filter your data.

from datalookup import Dataset

# Use Dataset to manipulate and filter your data
books = Dataset(data)

# Retrieve an author using the author name
authors = books.filter(author="J. K. Rowling")
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"

# Retrieve an author using '__in' lookup
authors = books.filter(id__in=[2, 3])
assert len(authors) == 1
assert authors[0].author == "Agatha Christie"

# Retrieve an author using 'exclude' and '__contains' lookup
authors = books.exclude(author__contains="Christie")
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"

Related field filtering

To filter on a related field such as books, join the related field name and one of its fields with a double underscore (__), for example books__name.

# Retrieve an author using the date when the book was published
authors = books.filter(books__published="1939")
assert len(authors) == 1
assert authors[0].author == "Agatha Christie"

# Retrieve an author using '__regex' lookup
authors = books.filter(books__name__regex=".*Potter.*")
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
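
To make the double-underscore path concrete, here is a minimal pure-Python sketch of how a path such as books__published could be resolved against nested dictionaries. It is not Datalookup's implementation; the resolve helper is a hypothetical name used only for illustration.

```python
# Illustrative only: this is not Datalookup's code, and resolve() is a
# hypothetical helper name.
def resolve(node, path):
    """Collect every value reached by following a '__'-separated path."""
    values = [node]
    for key in path.split("__"):
        next_values = []
        for value in values:
            # A list of related nodes fans out into its individual items.
            items = value if isinstance(value, list) else [value]
            next_values.extend(
                item[key]
                for item in items
                if isinstance(item, dict) and key in item
            )
        values = next_values
    return values


author = {
    "author": "Agatha Christie",
    "books": [{"name": "And Then There Were None", "published": "1939"}],
}

print(resolve(author, "books__published"))  # ['1939']
print("1939" in resolve(author, "books__published"))  # True
```

An author matches when any of the resolved values satisfies the condition, which is why one matching book is enough to return the author.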

AND, OR - filtering

Keyword argument queries - in filter(), etc. - are “AND”ed together. If you need to execute more complex queries (for example, queries with OR statements), you can combine two filter requests with “|”.

# Retrieve an author using multiple filters in a single request (AND). This
# filter uses the '__icontains' lookup: same as '__contains' but case-insensitive
authors = books.filter(books__name__icontains="and", books__genre="Fantasy")
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"

# Retrieve an author by combining filters (OR)
authors = books.filter(author="Stephane Capponi") | books.filter(
    author="J. K. Rowling"
)
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
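
For readers who want the AND/OR semantics spelled out, the following plain-Python sketch mimics them on a list of dictionaries. It does not use Datalookup; select and union are hypothetical helpers, and removing duplicates by id is an assumption made for this example.

```python
# Illustrative only: plain Python, not Datalookup's implementation.
authors = [
    {"id": 1, "author": "J. K. Rowling"},
    {"id": 2, "author": "Agatha Christie"},
]


def select(rows, **conditions):
    # AND: a row is kept only if every keyword condition holds.
    return [
        row
        for row in rows
        if all(row.get(field) == value for field, value in conditions.items())
    ]


def union(left, right):
    # OR: keep rows from either result, without duplicates, in first-seen order.
    seen, merged = set(), []
    for row in left + right:
        if row["id"] not in seen:
            seen.add(row["id"])
            merged.append(row)
    return merged


both = select(authors, id=1, author="J. K. Rowling")
either = union(
    select(authors, author="Stephane Capponi"),
    select(authors, author="J. K. Rowling"),
)
print(len(both), len(either))  # 1 1
```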

Filter nested related field

The library also provides a way to filter nested relationships. You can make requests that retrieve only the books in the author collection, or use that output to filter the authors.

# filter_related is the method to use to filter all related nodes
related_books = books.filter_related('books', genre="Mystery")
assert len(related_books) == 1
assert related_books[0].name == "And Then There Were None"

# You can also use filter_related to filter authors.
authors = books.filter(
    books=books.filter_related('books', name__regex=".*Potter.*")
)
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
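
As a rough mental model, filtering related nodes amounts to flattening the nested lists and filtering the flattened items. The sketch below shows that idea in plain Python; related is a hypothetical helper, not the library's filter_related, and it only supports exact matches.

```python
# Illustrative only: a plain-Python analogue, not Datalookup's filter_related.
library = [
    {"author": "J. K. Rowling", "books": [
        {"name": "Harry Potter and the Chamber of Secrets", "genre": "Fantasy"},
    ]},
    {"author": "Agatha Christie", "books": [
        {"name": "And Then There Were None", "genre": "Mystery"},
    ]},
]


def related(rows, field, **conditions):
    # Walk every row, then every node under `field`, keeping exact matches.
    return [
        node
        for row in rows
        for node in row.get(field, [])
        if all(node.get(key) == value for key, value in conditions.items())
    ]


mystery = related(library, "books", genre="Mystery")
print([book["name"] for book in mystery])  # ['And Then There Were None']
```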

Cascade filtering

Sometimes you will want to filter the author but also the related books. It is possible to do that by calling the on_cascade() method before filtering.

# Filter the author but also the books of the author
authors = books.on_cascade().filter(
    books__name="Harry Potter and the Chamber of Secrets"
)
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"

# The books are also filtered
assert len(authors[0].books) == 1
assert authors[0].books[0].name == "Harry Potter and the Chamber of Secrets"
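
The difference between a regular filter and a cascading one can be sketched in plain Python: the cascading version also prunes the nested list. This is only an analogue under the assumption that cascading keeps the matching books; filter_with_cascade is a hypothetical name, not part of Datalookup.

```python
# Illustrative only: a plain-Python analogue of cascade filtering.
library = [
    {"author": "J. K. Rowling", "books": [
        {"name": "Harry Potter and the Chamber of Secrets"},
        {"name": "Harry Potter and the Prisoner of Azkaban"},
    ]},
    {"author": "Agatha Christie", "books": [
        {"name": "And Then There Were None"},
    ]},
]


def filter_with_cascade(rows, book_name):
    result = []
    for row in rows:
        kept = [book for book in row["books"] if book["name"] == book_name]
        if kept:
            # Copy the author and replace its books with the matching ones,
            # leaving the input rows untouched.
            result.append({**row, "books": kept})
    return result


pruned = filter_with_cascade(library, "Harry Potter and the Chamber of Secrets")
print(len(pruned), len(pruned[0]["books"]))  # 1 1
print(len(library[0]["books"]))  # 2
```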

List of available lookups

Field lookups specify how the dataset should match the results it returns. They are specified as keyword arguments to the Dataset methods filter() and exclude(). Basic lookup keyword arguments take the form “field__lookuptype=value” (that’s a double underscore).

As a convenience, when no lookup type is provided (as in books.filter(id=1)), the lookup type is assumed to be exact.

# author is one of the fields of the dictionary
# '__contains' is the lookup
books.filter(author__contains="Row")

Lookup	Case-insensitive lookup	Description
exact	iexact	Exact match
contains	icontains	Containment test
startswith	istartswith	Starts with a specific string
endswith	iendswith	Ends with a specific string
regex	iregex	Regular expression match
in		In a given iterable; strings (being iterables) are accepted
gt		Greater than
gte		Greater than or equal to
lt		Lower than
lte		Lower than or equal to
range		Range between two values. Integer only
isnull		Check that a field is null. Takes either True or False
contained_by		Check data is a subset of the passed values. ArrayField only
overlap		Data shares any results with the passed values. ArrayField only
len		Check length of the array. ArrayField only
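
The “field__lookuptype” convention, including the default to exact, can be sketched with a small dictionary of comparison functions. This is a simplified illustration and not the library's internals: LOOKUPS, split_lookup and matches are hypothetical names, only a handful of lookups are covered, and nested fields are ignored.

```python
import re

# Hypothetical lookup table: each entry compares a stored value with the
# value passed to the filter. Only a few lookups are included here.
LOOKUPS = {
    "exact": lambda value, expected: value == expected,
    "iexact": lambda value, expected: value.lower() == expected.lower(),
    "contains": lambda value, expected: expected in value,
    "icontains": lambda value, expected: expected.lower() in value.lower(),
    "startswith": lambda value, expected: value.startswith(expected),
    "regex": lambda value, expected: re.search(expected, value) is not None,
    "in": lambda value, expected: value in expected,
    "gt": lambda value, expected: value > expected,
}


def split_lookup(keyword):
    # Treat the last '__' segment as the lookup type only if it is a known
    # lookup; otherwise the whole keyword is a field and 'exact' is assumed.
    field, separator, lookup = keyword.rpartition("__")
    if separator and lookup in LOOKUPS:
        return field, lookup
    return keyword, "exact"


def matches(row, **conditions):
    # All keyword conditions must hold (AND).
    for keyword, expected in conditions.items():
        field, lookup = split_lookup(keyword)
        if not LOOKUPS[lookup](row.get(field), expected):
            return False
    return True


row = {"id": 1, "author": "J. K. Rowling"}
print(split_lookup("author__contains"))  # ('author', 'contains')
print(split_lookup("id"))  # ('id', 'exact')
print(matches(row, author__icontains="row", id__in=[1, 2]))  # True
```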

Documentation

Datalookup does not stop here. The full documentation is in the docs directory or online at https://datalookup.readthedocs.io/en/latest/

Contribution

Anyone can contribute to Datalookup’s development. Check out our documentation on how to get involved: https://datalookup.readthedocs.io/en/latest/internals/contributing.html

License

Copyright Stephane Capponi and others, 2023. Distributed under the terms of the MIT license, Datalookup is free and open source software.

Datalookup was inspired by Django and only the RegisterLookupMixin was copied. Everything else was inspired and re-interpreted. You can find the license of Django in the licenses folder.

Release history

1.0.1 (this version): Apr 19, 2024
1.0.0: Jun 12, 2023
1.0.0rc3 (pre-release): Jun 12, 2023
1.0.0rc1 (pre-release): Jun 10, 2023

Download files

Download the file for your platform.

Source Distribution

datalookup-1.0.1.tar.gz (41.3 kB), uploaded Apr 19, 2024

Built Distribution

datalookup-1.0.1-py3-none-any.whl (14.7 kB), uploaded Apr 19, 2024, Python 3

Hashes for datalookup-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`85119cc0bbefca1c85b33185bdd580b2d099e5e918a7e92c1df2b47d00cc94c4`
MD5	`0ce3282be8ee2db138f57e3ce26daa1e`
BLAKE2b-256	`d4ef3f64222107848557885d77507d3eec2cf278b049e0610a37363abff71920`

Hashes for datalookup-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6c454dc23a482947e632380c5f022574565c65059cd718fa417fa2fb9de01c80`
MD5	`266c12df8f49661167ca1b0f4be54913`
BLAKE2b-256	`45f3d00588353777b9308acbb384219e8f905a48f58aa26a87ec473875757b70`