Sampling Utils
Python tools for random sampling under a "don't pick the closest n elements" constraint.
Also contains a batch generator that applies the same constraint, sampling with replacement and with repeats if necessary.
Installation
Simply install using pip
pip install sampling_utils
Usage
Don't Pick Closest
from sampling_utils import sample_from_list
sample_from_list([1,2,3,4,5,6,7,8], dont_pick_closest=2)
You are guaranteed to get samples that are more than dont_pick_closest apart# (in value, not in index).
Here, the absolute difference between any sample and any other sample is always greater than 2.
For example, if 2 is picked, no other item in the range [2 - dont_pick_closest, 2 + dont_pick_closest] will be picked.
Another example looped 5 times:
for _ in range(5):
    print(sample_from_list([1,2,3,4,5,6,8,9,10,12,14], dont_pick_closest=2))
# Output
# [5, 10, 2, 14]
# [9, 6, 14, 1]
# [3, 8, 12]
# [10, 3, 6, 14]
# [2, 5, 8, 12]
If 12 is sampled, sampling 10 and 14 is not allowed.
# This is referred to as the dont_pick_closest rule hereafter.
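The rule can be checked by hand with a few lines of plain Python. This is a minimal sketch, not part of sampling_utils: obeys_rule is a hypothetical helper name, and it assumes the rule means "every pair of samples differs by more than dont_pick_closest", which matches the outputs listed above.

```python
from itertools import combinations


def obeys_rule(samples, dont_pick_closest):
    """Hypothetical helper: True if every pair differs by more than dont_pick_closest."""
    return all(abs(a - b) > dont_pick_closest for a, b in combinations(samples, 2))


# The five outputs shown above
outputs = [
    [5, 10, 2, 14],
    [9, 6, 14, 1],
    [3, 8, 12],
    [10, 3, 6, 14],
    [2, 5, 8, 12],
]

print(all(obeys_rule(o, 2) for o in outputs))  # True
print(obeys_rule([12, 10], 2))                 # False: 12 and 10 are only 2 apart
```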
Number of samples
You can also specify how many samples you want from the list using the num_samples parameter.
By default, you get as many samples as possible (without replacement).
for _ in range(5):
    print(sample_from_list([1,2,3,4,5,6,8,9,10,12,14], dont_pick_closest=2, num_samples=2))
# Output
# [8, 2]
# [6, 3]
# [12, 1]
# [4, 10]
# [9, 1]
If you try to sample more than what's possible, you will get an error saying that it's not possible.
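To make the behaviour concrete, here is a rough sketch of how a sampler with a fixed sample count could work. It is not the sampling_utils implementation: sample_with_gap and max_spaced are hypothetical names, and raising ValueError is an assumption (the library's actual error type is not stated here). It takes one random greedy pass, and falls back to the largest possible set if that pass comes up short.

```python
import random


def max_spaced(items, gap):
    """Largest subset whose values all differ by more than gap (greedy over sorted values)."""
    chosen = []
    for value in sorted(items):
        if not chosen or value - chosen[-1] > gap:
            chosen.append(value)
    return chosen


def sample_with_gap(items, gap, num_samples=None, rng=random):
    """Hypothetical sampler: random picks that differ pairwise by more than gap."""
    pool = list(items)
    rng.shuffle(pool)
    picked = []
    for value in pool:
        # Keep a value only if it is far enough from everything picked so far
        if all(abs(value - other) > gap for other in picked):
            picked.append(value)
    if num_samples is None:
        return picked
    if num_samples > len(picked):
        # The random pass fell short, so try the largest possible set instead
        picked = max_spaced(items, gap)
        if num_samples > len(picked):
            raise ValueError(
                f"cannot draw {num_samples} samples; at most {len(picked)} are possible"
            )
        rng.shuffle(picked)
    # Any subset of a valid selection is still valid
    return picked[:num_samples]


print(sample_with_gap([1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 14], gap=2, num_samples=2))
```

Asking this sketch for 5 samples from the same list raises ValueError, because at most 4 values can be more than 2 apart.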
Min and max samples
You may just want to know how many samples you can draw from a given list while obeying the dont_pick_closest rule:
from sampling_utils import get_min_samples, get_max_samples
print("Min", get_min_samples([1,2,3,4,5,6,8,9,10,12,14], dont_pick_closest=2))
print("Max", get_max_samples([1,2,3,4,5,6,8,9,10,12,14], dont_pick_closest=2))
# Output
# Min 3
# Max 4
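One plausible reading of these numbers: a selection is "complete" when no further element can be added without breaking the rule, and min and max are the smallest and largest sizes of such a selection. The brute-force sketch below checks that reading against the list above. It is not the library's algorithm, min_max_sample_counts is a hypothetical name, and enumerating all subsets is only practical for short lists.

```python
from itertools import combinations


def is_valid(subset, gap):
    """Every pair in the subset differs by more than gap."""
    return all(abs(a - b) > gap for a, b in combinations(subset, 2))


def is_complete(subset, items, gap):
    """Every item is within gap of some chosen value, so nothing more can be added."""
    return all(any(abs(x - s) <= gap for s in subset) for x in items)


def min_max_sample_counts(items, gap):
    """Smallest and largest size of a valid selection that cannot be extended."""
    sizes = [
        size
        for size in range(1, len(items) + 1)
        for subset in combinations(items, size)
        if is_valid(subset, gap) and is_complete(subset, items, gap)
    ]
    return min(sizes), max(sizes)


print(min_max_sample_counts([1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 14], gap=2))  # (3, 4)
```

For example, [3, 8, 12] is a complete selection of size 3, and [1, 4, 8, 12] is one of size 4.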
Sampling without replacement successively / Generating batches of samples for one epoch
To sample successively without replacement, i.e. to draw as many samples from the list as possible without repeating any element, use batch_rand_generator as shown below.
This is particularly useful for generating batches of data until no more batches can be generated (equivalent to one epoch).
from sampling_utils import batch_rand_generator
from sampling_utils import get_batch_generator_elements
batch_size = 2
brg = batch_rand_generator([1,2,3,4,5,6,8,9,10,12,14], batch_size=batch_size, dont_pick_closest=2)
print(get_batch_generator_elements(brg, batch_size=batch_size))
# Output
# [[1, 4], [8, 5], [14, 3], [2, 6]]
Notice that the elements
- within each batch obey the dont_pick_closest rule (e.g. 1 and 4 from batch 1)
- from different batches need not obey the rule (e.g. 4 and 5 from batch 1 and 2 respectively).
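The two points above can be illustrated with a small generator. This is a simplified sketch rather than the batch_rand_generator implementation: batch_sketch is a hypothetical name, and it simply stops as soon as one random pass over the remaining elements cannot fill a batch, which may be earlier than strictly necessary.

```python
import random


def batch_sketch(items, batch_size, gap, rng=random):
    """Hypothetical generator: yield batches without reusing elements across batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    remaining = list(items)
    while True:
        rng.shuffle(remaining)
        batch = []
        for value in remaining:
            # The rule is enforced only against values in the current batch
            if all(abs(value - other) > gap for other in batch):
                batch.append(value)
                if len(batch) == batch_size:
                    break
        if len(batch) < batch_size:
            return  # this pass could not fill a batch, so the epoch ends
        for value in batch:
            remaining.remove(value)  # never reuse an element in a later batch
        yield batch


for batch in batch_sketch([1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 14], batch_size=2, gap=2):
    print(batch)
```

Because the check looks only at the current batch, values from different batches may be close to each other, as with 4 and 5 in the output above.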
Contributing
Pull requests are very welcome.
- Fork the repo
- Create a new branch named after the feature
- Check that things work in a Jupyter notebook
- Raise a pull request
Licence
Please see attached Licence