reservoir-sampling-cli
======================
A command line tool to randomly sample k items from an input S containing n items.
> Reservoir sampling is a family of randomized algorithms for randomly choosing a sample of k items from a list S containing n items, where n is either a very large or unknown number.
> --<cite><http://en.wikipedia.org/wiki/Reservoir_sampling></cite>
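
As an illustration of the idea quoted above, here is a minimal sketch of the simplest reservoir sampling variant (often called Algorithm R). It is not taken from this tool's source; the function name `reservoir_sample` and its `rng` parameter are assumptions made for the example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    items: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items chosen uniformly at random from items.

    Hypothetical helper for illustration; reads the input once and
    never holds more than k items in memory.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces a reservoir
            # entry only if the slot falls inside the reservoir, which
            # happens with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

Because the input is consumed one item at a time, the same function works on a file object or on `sys.stdin` without knowing n in advance. If the input has fewer than k items, all of them are returned.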
Installation
------------

    pip install -e git+ssh://git@github.com/RyanBalfanz/preservoir-sampling-cli.git#egg=resamp

Usage
-----
Show the help message:

    $ resamp -h
    usage: Randomly sample k items from an input S containing n items.
           [-h] [-k NUM_ITEMS] [--preserve-order]
           [infile] [outfile]

    positional arguments:
      infile
      outfile

    optional arguments:
      -h, --help            show this help message and exit
      -k NUM_ITEMS, --num-items NUM_ITEMS
                            An integer number giving the size of the reservoir
      --preserve-order      Preserve input ordering

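
An interface like the one in the help text could be declared with `argparse` roughly as follows. This is a sketch inferred from the help output only, not the tool's actual source: the `build_parser` name, the `None` defaults for `infile` and `outfile` (standing in for stdin and stdout), and the default of 10 for `-k` are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options listed in the help text."""
    parser = argparse.ArgumentParser(
        description="Randomly sample k items from an input S containing n items."
    )
    # Optional positionals; None is assumed here to mean stdin / stdout.
    parser.add_argument("infile", nargs="?", default=None)
    parser.add_argument("outfile", nargs="?", default=None)
    parser.add_argument(
        "-k",
        "--num-items",
        type=int,
        default=10,  # assumed default, not documented in the help text
        help="An integer number giving the size of the reservoir",
    )
    parser.add_argument(
        "--preserve-order",
        action="store_true",
        help="Preserve input ordering",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["-k", "3", "--preserve-order"])` yields `num_items == 3` and `preserve_order == True`, with both file arguments left at `None`.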
Sample 10 words from `/usr/share/dict/words`, preserving the original order:

    $ cat /usr/share/dict/words | resamp -k10 --preserve-order
    Paralipomenon
    frankalmoign
    hauntingly
    hellion
    laniiform
    lithify
    semicollapsible
    sniveled
    stolkjaerre
    unaloud

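
One way an order-preserving sample can be produced is to store each item's input position alongside it and sort by that position at the end. The sketch below shows this approach under stated assumptions; it is not necessarily how `--preserve-order` is implemented in this tool, and the name `reservoir_sample_ordered` is hypothetical.

```python
import random
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def reservoir_sample_ordered(
    items: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Sample up to k items and return them in their input order.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    reservoir: List[Tuple[int, T]] = []
    for i, item in enumerate(items):
        if i < k:
            # Remember the input position together with the item.
            reservoir.append((i, item))
        else:
            # Standard reservoir replacement step (Algorithm R).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = (i, item)
    # Restore input order by sorting on the stored position.
    reservoir.sort(key=lambda pair: pair[0])
    return [item for _, item in reservoir]
```

Sorting costs O(k log k) and touches only the k retained items, so the input is still read in a single pass.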