swipe

Swipe through data once, with a comb that will pick up the points you're looking for.

Think single pass search.

Think k nearest neighbors, but brute force, but with multiple targets.

Extremely lightweight: pure Python, no other dependencies.

Example

The main function of swipe is highest_score_swipe. The other objects are there to support you in making your own kind of swiping function.

What highest_score_swipe does is get the k items from the iterable it that score the highest with score_of. (Think of score_of as the key argument of Python's sorted function, but with the order reversed so that the highest scores come first.)

As far as the output is concerned, you can achieve much the same with

highest_score_swipe(it, score_of, k, output)

as with

output(sorted(it, key=score_of, reverse=True)[:k])

(With slightly different output functions)

The difference is that with the sorted version,

  • you have to fit all of the data in memory
  • you have to sort all of the data

But to get the top k elements you don't have to do either. You just have to scan through the data once while maintaining a list of the top items. So when there's a lot of data, highest_score_swipe will save you both memory and computation.
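To make the single-pass idea concrete, here is a minimal sketch built on the standard library's heapq module. It is not swipe's actual implementation; the function name top_k_single_pass and its tie-breaking counter are assumptions made for illustration only.

```python
import heapq
from itertools import count


def top_k_single_pass(it, score_of=lambda x: x, k=1):
    """Return the k highest-scoring (score, item) pairs, lowest score first.

    Hypothetical helper for illustration; not part of the swipe package.
    """
    if k <= 0:
        return []
    heap = []  # min-heap of (score, tie_breaker, item), never longer than k
    tie_breaker = count()  # avoids comparing the items themselves on equal scores
    for item in it:
        entry = (score_of(item), next(tie_breaker), item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            # The new score beats the current minimum: swap it in.
            heapq.heapreplace(heap, entry)
    return [(score, item) for score, _, item in sorted(heap)]


people = [('Christian', 12), ('Seb', 88), ('Thor', 27), ('Sylvain', 42)]
top_two = top_k_single_pass(iter(people), score_of=lambda x: x[1], k=2)
print(top_two)
```

Only k entries are held at any time, so memory stays proportional to k rather than to the size of the data, and each item is scored exactly once.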

>>> from swipe import highest_score_swipe
>>>
>>> data = [('Christian', 12), ('Seb', 88), ('Thor', 27), ('Sylvain', 42)]

Let's see what you get out of the box (i.e. only specifying what's required, using defaults for all the rest). We'll pass iter(data) just this once, to show that data only has to be iterable.

>>> highest_score_swipe(iter(data))
[(('Thor', 27), ('Thor', 27))]

Now, out of the box, you don't get much, and the result looks a bit strange. The reason is that if you don't specify k, you just get the top item, and if you don't specify what score should be used to measure the "top", it'll just use Python's default comparison. Here that brings ('Thor', 27) to the top, because 'Thor' is lexicographically the last of the names.

And why does ('Thor', 27) appear twice? Because it acts both as the score (the first element) and as the data item (the second).

Where it becomes interesting (and useful) is when you specify what score function it should use. So let's do that.
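The default ordering described here is ordinary Python tuple comparison, which you can check with the built-in max alone. This small sketch uses no swipe code at all; it only illustrates why ('Thor', 27) comes out on top when no score function is given.

```python
data = [('Christian', 12), ('Seb', 88), ('Thor', 27), ('Sylvain', 42)]

# Tuples are compared element by element, so the names decide the order here;
# the ages would only matter if two names were equal.
largest = max(data)
print(largest)

# String comparison is lexicographic: 'T' sorts after 'S', which sorts after 'C'.
names_in_order = sorted(name for name, _ in data)
print(names_in_order)
```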

>>> length_of_name = lambda x: len(x[0])
>>> by_age = lambda x: x[1]
>>> highest_score_swipe(data, by_age)
[(88, ('Seb', 88))]
>>> highest_score_swipe(data, length_of_name)
[(9, ('Christian', 12))]
>>> highest_score_swipe(data, length_of_name, k=2)
[(7, ('Sylvain', 42)), (9, ('Christian', 12))]

Now let's see about that output argument. It's used to specify how you want the result to be processed before returning.

>>> highest_score_swipe(data, length_of_name, k=2, output='top_tuples')
[(9, ('Christian', 12)), (7, ('Sylvain', 42))]
>>> highest_score_swipe(data, length_of_name, k=2, output='items')
[('Sylvain', 42), ('Christian', 12)]
>>> highest_score_swipe(data, length_of_name, k=2, output='scores')
[7, 9]
>>> highest_score_swipe(data, length_of_name, k=2, output='top_score_items')
[('Christian', 12), ('Sylvain', 42)]
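Judging only from the results printed above, each named output option appears to be a simple transformation of the default list of (score, item) pairs. The sketch below reproduces those results with plain functions; the dictionary output_sketches is hypothetical and is not how swipe itself necessarily implements these options.

```python
# The default result shown above for k=2 with length_of_name (lowest score first).
default_result = [(7, ('Sylvain', 42)), (9, ('Christian', 12))]

# Hypothetical stand-ins for the named options, inferred from the printed outputs.
output_sketches = {
    'top_tuples': lambda km: list(reversed(km)),  # highest score first
    'items': lambda km: [item for _, item in km],  # drop the scores
    'scores': lambda km: [score for score, _ in km],  # drop the items
    'top_score_items': lambda km: [item for _, item in reversed(km)],  # items, best first
}

for name, transform in output_sketches.items():
    print(name, transform(default_result))
```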

You can also specify a custom function:

>>> highest_score_swipe(
...     data, length_of_name, k=2,
...     output=lambda km: [f"{name} (whose name has {score} letters), is {age}" for score, (name, age) in km]
... )
['Sylvain (whose name has 7 letters), is 42', 'Christian (whose name has 9 letters), is 12']

What if you wanted the indices (that is, the integer indexing the data) of the top 2 as your output? Here's a recipe for that:

>>> highest_score_swipe(
...     enumerate(data),  # enumerate the data to get a (i, item) iterator
...     lambda x: length_of_name(x[1]),  # apply your scoring function to the item
...     k=2,
...     output=lambda km: [x[1][0] for x in km]  # extract the indices
... )
[3, 0]
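For comparison, the same enumerate trick works with the standard library's heapq.nlargest, which also avoids sorting all of the data. This is a sketch of the idea with stdlib calls only, not a swipe call, and heapq.nlargest returns the highest score first, so the order of the indices may differ from what swipe gives you.

```python
import heapq

data = [('Christian', 12), ('Seb', 88), ('Thor', 27), ('Sylvain', 42)]

# Pair every item with its position, then score on the name inside the pair.
top_pairs = heapq.nlargest(2, enumerate(data), key=lambda pair: len(pair[1][0]))

# Keep only the positions of the winning items.
indices = [i for i, _ in top_pairs]
print(indices)
```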
