reservoir-sampling-cli
======================
A command line tool to randomly sample k items from an input S containing n items.
> Reservoir sampling is a family of randomized algorithms for randomly choosing a sample of k items from a list S containing n items, where n is either a very large or unknown number.
> --<cite><http://en.wikipedia.org/wiki/Reservoir_sampling></cite>
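
To make the quoted definition concrete, here is a minimal sketch of the classic single-pass variant often called Algorithm R. It is an illustration of the general technique, not code taken from this tool; the function name `reservoir_sample` and its `rng` parameter are assumptions made for the example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    items: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items chosen uniformly at random from items.

    Hypothetical helper for illustration; n does not need to be known
    in advance because the input is consumed in a single pass.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

Because only the reservoir is held in memory, the same function works on a generator or a file object read line by line. If the input has fewer than k items, all of them are returned.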
Installation
------------

    pip install -e git+ssh://git@github.com/RyanBalfanz/preservoir-sampling-cli.git#egg=resamp

Usage
-----
Show the help message:

    $ resamp -h
    usage: Randomly sample k items from an input S containing n items.
           [-h] [-k NUM_ITEMS] [--preserve-order]
           [infile] [outfile]

    positional arguments:
      infile
      outfile

    optional arguments:
      -h, --help            show this help message and exit
      -k NUM_ITEMS, --num-items NUM_ITEMS
                            An integer number giving the size of the reservoir
      --preserve-order      Preserve input ordering

Sample 10 words from /usr/share/dict/words, preserving the original order:

    $ cat /usr/share/dict/words | resamp -k10 --preserve-order
    Paralipomenon
    frankalmoign
    hauntingly
    hellion
    laniiform
    lithify
    semicollapsible
    sniveled
    stolkjaerre
    unaloud

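
One plausible way to get order-preserving output like the above is to store each item's input position alongside it and sort the reservoir by position once the input is exhausted. The sketch below shows that idea only; how `resamp` actually implements `--preserve-order` is not documented here, and the name `ordered_reservoir_sample` is hypothetical.

```python
import random
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def ordered_reservoir_sample(
    items: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Sample up to k items uniformly and return them in input order.

    Hypothetical helper; an assumption about how order could be kept,
    not the tool's confirmed implementation.
    """
    rng = rng or random.Random()
    reservoir: List[Tuple[int, T]] = []
    for i, item in enumerate(items):
        if i < k:
            reservoir.append((i, item))
        else:
            j = rng.randint(0, i)
            if j < k:
                # Remember the input position together with the item.
                reservoir[j] = (i, item)
    # Restore the original ordering by sorting on the stored position.
    reservoir.sort(key=lambda pair: pair[0])
    return [item for _, item in reservoir]
```

The extra cost is one integer per reservoir slot plus a final sort of k entries, so memory use stays independent of the input size n.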