Like sorted but using external sorting so that large data sets can be sorted.

xsorted

Like sorted, but uses external sorting so that data sets too large to fit in memory can be sorted. For example:

>>> from random import random
>>> from six.moves import xrange
>>> from xsorted import xsorted
>>> nums = (random() for _ in xrange(pow(10, 7)))
>>> for x in xsorted(nums): pass

The only restriction is that the items must be picklable. Alternatively, you can provide your own serializer for externalizing partitions of items.
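To check the picklability restriction before starting a long sort, something like the following sketch may help. It uses only the standard library pickle module; is_picklable is a hypothetical helper name and is not part of xsorted.

```python
import pickle


def is_picklable(item):
    """Return True if the standard pickle module can serialize item."""
    try:
        pickle.dumps(item)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions raise different exception types here.
        return False
    return True


# Tuples of built-in values pickle without trouble.
print(is_picklable(("row", 1, 2.5)))

# Lambdas and generators are typical examples of items that do not pickle.
print(is_picklable(lambda x: x))
print(is_picklable(n for n in range(3)))
```

Checking a small sample of items this way is cheap compared with discovering the problem partway through a large sort.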

Motivation

It is sometimes necessary to sort a dataset without loading the entire set into memory, for example when grouping a very large csv file by one of its columns. There are several ways to achieve this; a common solution is the unix command sort. However, unix sort does not offer the flexibility of the python csv module. xsorted attempts to generalize external sorting to any python iterable, in the same way that sorted generalizes sorting to any iterable.
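The csv grouping case mentioned above could look roughly like the sketch below. The helper group_rows_by_column, the column names and the sample data are all made up for illustration. The sort function is passed in as a parameter and defaults to the built-in sorted so that the example runs without xsorted installed.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def group_rows_by_column(rows, column, sort=sorted):
    """Yield (value, rows) pairs, grouping rows by the given column.

    `sort` is any callable with the same interface as the built-in sorted.
    """
    get_value = itemgetter(column)
    # groupby only groups adjacent rows, so the rows must be sorted first.
    for value, group in groupby(sort(rows, key=get_value), key=get_value):
        yield value, list(group)


# A tiny in-memory stand-in for a very large csv file.
sample = io.StringIO("city,amount\nOslo,3\nRome,5\nOslo,7\nRome,1\n")
reader = csv.DictReader(sample)

totals = {
    city: sum(int(row["amount"]) for row in group)
    for city, group in group_rows_by_column(reader, "city")
}
print(totals)
```

For a file that does not fit in memory, passing sort=xsorted should work the same way, since xsorted accepts a key argument just like sorted. Each group is still collected into a list here, so a single very large group would need to be consumed lazily instead.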

Installation

$ pip install xsorted

Usage

Just like sorted

>>> from xsorted import xsorted
>>> ''.join(xsorted('qwertyuiopasdfghjklzxcvbnm'))
'abcdefghijklmnopqrstuvwxyz'

With reverse

>>> ''.join(xsorted('qwertyuiopasdfghjklzxcvbnm', reverse=True))
'zyxwvutsrqponmlkjihgfedcba'

And a custom key

>>> list(xsorted(('qwerty', 'uiop', 'asdfg', 'hjkl', 'zxcv', 'bnm'), key=lambda x: x[1]))
['uiop', 'hjkl', 'bnm', 'asdfg', 'qwerty', 'zxcv']

The implementation details of xsorted can be customized using the xsorter factory. To keep the same interface as sorted, partition_size is treated as an implementation detail rather than as an argument to xsorted itself:

>>> from xsorted import xsorter
>>> xsorted_custom = xsorter(partition_size=4)
>>> ''.join(xsorted_custom('qwertyuiopasdfghjklzxcvbnm'))
'abcdefghijklmnopqrstuvwxyz'
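To give a feel for what a partition size means in an external sort, here is a minimal standard-library sketch of the general technique. It is not the xsorted source. It assumes that partition_size is the maximum number of items held in memory at once, and the names external_sorted, _dump_partition and _load_partition are hypothetical. It targets Python 3.5 or later, where heapq.merge accepts key and reverse.

```python
import heapq
import pickle
import tempfile
from itertools import islice


def _dump_partition(items):
    """Pickle an already sorted partition to a temporary file, item by item."""
    f = tempfile.TemporaryFile()
    for item in items:
        pickle.dump(item, f)
    f.seek(0)
    return f


def _load_partition(f):
    """Lazily yield the items of one partition back from its file."""
    try:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return
    finally:
        f.close()


def external_sorted(iterable, key=None, reverse=False, partition_size=10000):
    """Sort by spilling sorted partitions to disk and merging them lazily."""
    iterator = iter(iterable)
    files = []
    while True:
        # Only partition_size items are sorted in memory at any one time.
        chunk = sorted(islice(iterator, partition_size), key=key, reverse=reverse)
        if not chunk:
            break
        files.append(_dump_partition(chunk))
    # heapq.merge reads one item at a time from each sorted partition.
    return heapq.merge(*(_load_partition(f) for f in files), key=key, reverse=reverse)


print(''.join(external_sorted('qwertyuiopasdfghjklzxcvbnm', partition_size=4)))
```

With 26 letters and a partition size of 4, this sketch writes seven partitions (six of four items and one of two) before merging. A smaller partition size lowers peak memory use at the cost of more temporary files to merge.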
