This is a sampling simulator
Project description
samplingsimulatorpy
samplingsimulatorpy is a Python package intended to assist those teaching or learning basic statistical inference.
Authors
| Name | GitHub |
|---|---|
| Holly Williams | hwilliams10 |
| Lise Braaten | lisebraaten |
| Tao Guo | tguo9 |
| Yue (Alex) Jiang | YueJiangMDSV |
Overview
This package lets users generate a virtual population and draw repeated samples from it in order to compare sample distributions with sampling distributions across different sample sizes. Users can sample from the generated virtual population (or from any other population), plot the resulting distributions, and view summaries of the parameters of interest.
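To make the distinction between a sample distribution and a sampling distribution concrete, here is a minimal sketch that uses only the Python standard library rather than this package. The names `population`, `sample_means`, and the seed are illustrative assumptions and are not part of samplingsimulatorpy.

```python
import random
import statistics

rng = random.Random(42)  # seeded so the run is repeatable

# Hypothetical virtual population: 10,000 draws from a normal(0, 1).
population = [rng.gauss(0, 1) for _ in range(10_000)]


def sample_means(pop, n, reps):
    """Return the mean of each of `reps` samples of size `n`, drawn without replacement."""
    return [statistics.mean(rng.sample(pop, n)) for _ in range(reps)]


# The values in one sample of size 20 form a sample distribution.
one_sample = rng.sample(population, 20)

# The means of many samples form a sampling distribution of the mean.
means_n5 = sample_means(population, 5, 500)
means_n50 = sample_means(population, 50, 500)

spread_n5 = statistics.stdev(means_n5)
spread_n50 = statistics.stdev(means_n50)
print(f"spread of sample means, n=5:  {spread_n5:.3f}")
print(f"spread of sample means, n=50: {spread_n50:.3f}")
```

The spread of the sample means shrinks as the sample size grows, which is the comparison the package is meant to help visualize.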
Installation:
pip install -i https://test.pypi.org/simple/ samplingsimulatorpy
Function Descriptions
generate_virtual_pop creates a virtual population.
- Inputs: distribution function (e.g. np.random.lognormal, np.random.binomial, etc.), the parameters required by the distribution function, and the size of the population.
- Outputs: the virtual population as a data frame
draw_samples generates samples of different sizes.
- Inputs: population to sample from, the sample size, and the number of samples
- Outputs: returns a data frame with the sample number in one column and the value in a second column.
plot_sample_hist creates sample distributions for different sample sizes.
- Inputs: population to sample from, the samples to plot, and a vector of the sample sizes
- Outputs: returns a grid of sample distribution plots
plot_sampling_hist creates sampling distributions for different sample sizes.
- Inputs: population to sample from, the samples to plot, and a vector of the sample sizes
- Outputs: returns a grid of sampling distribution plots
stat_summary returns a summary of the statistical parameters of interest.
- Inputs: population, samples, parameter(s) of interest
- Outputs: summary data frame
How do these fit into the Python ecosystem?
To the best of our knowledge, there is currently no existing Python package with the specific functionality to create virtual populations and make the specific sample and sampling distributions described above. We do make use of many existing Python packages and expand on them to make very specific functions. These include:
- scipy.stats to get distribution functions
- np.random to generate random samples
- Altair to create plots
pandas already includes summary statistics functions such as .describe(); however, our package is intended to be more customizable. Our summary includes only the statistical parameters of interest and provides a comparison between the sample, sampling, and true population parameters.
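As a rough illustration of that difference, the sketch below contrasts the fixed output of .describe() with a summary restricted to chosen statistics. It uses only pandas and NumPy; the series `values` and the chosen statistics are assumptions made for the example, and the package's own summary format may differ.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
values = pd.Series(rng.normal(0, 1, 200), name="value")

# describe() always reports the same fixed set of statistics.
full_summary = values.describe()  # count, mean, std, min, quartiles, max

# agg() reports only the statistics that were asked for.
targeted = values.agg(["mean", "std"])

print(full_summary)
print(targeted)
```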
Dependencies
- python = "^3.7"
- pandas = "^1.0.1"
- numpy = "^1.18.1"
- altair = "^4.0.1"
Usage
generate_virtual_pop
from samplingsimulatorpy import generate_virtual_pop
generate_virtual_pop(size, distribution_func, *para)
Arguments:
- size: the number of values to generate (the size of the population)
- distribution_func: the distribution function used to generate the values
- *para: the arguments required by the distribution function
Example:
pop = generate_virtual_pop(100, np.random.normal, 0, 1)
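For readers curious how a function with this signature could work, here is a minimal sketch. It is not the package's implementation: `generate_virtual_pop_sketch` and the column name "value" are hypothetical, and the sketch assumes the distribution function accepts the output size as its last positional argument, as NumPy's samplers do.

```python
import numpy as np
import pandas as pd


def generate_virtual_pop_sketch(size, distribution_func, *para):
    """Hypothetical stand-in: draw `size` values and wrap them in a data frame."""
    # NumPy samplers such as np.random.normal take the size as the last positional argument.
    values = distribution_func(*para, size)
    # The column name "value" is an assumption made for this sketch.
    return pd.DataFrame({"value": values})


np.random.seed(123)  # seeded so the run is repeatable
pop = generate_virtual_pop_sketch(100, np.random.normal, 0, 1)
coin = generate_virtual_pop_sketch(50, np.random.binomial, 1, 0.5)
print(pop.head())
```

Passing the distribution as a function keeps the sketch independent of any one distribution: the same call pattern works for a normal and a binomial population.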
draw_samples
from samplingsimulatorpy import draw_samples
draw_samples(pop, reps, n_s)
Arguments:
- pop: the virtual population as a data frame
- reps: the number of replications for each sample size, as an integer
- n_s: the sample sizes to draw, as a list
Example:
samples = draw_samples(pop, 3, [5, 10, 15, 20])
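The following sketch shows one way repeated sampling of this kind could be carried out with NumPy and pandas. It is an illustration only: `draw_samples_sketch`, the `seed` argument, the column names, and the choice to sample without replacement are all assumptions, not the package's documented behaviour.

```python
import numpy as np
import pandas as pd


def draw_samples_sketch(pop, reps, n_s, seed=None):
    """Hypothetical stand-in: draw `reps` samples of each size in `n_s` from `pop`."""
    rng = np.random.default_rng(seed)
    values = pop["value"].to_numpy()
    frames = []
    for size in n_s:
        for rep in range(1, reps + 1):
            # Sampling without replacement is an assumption of this sketch.
            drawn = rng.choice(values, size=size, replace=False)
            frames.append(pd.DataFrame({"replicate": rep, "size": size, "value": drawn}))
    return pd.concat(frames, ignore_index=True)


pop = pd.DataFrame({"value": np.random.default_rng(1).normal(0, 1, 100)})
samples = draw_samples_sketch(pop, 3, [5, 10, 15, 20], seed=2)
print(samples.head())
```

With 3 replications of sizes 5, 10, 15, and 20, the result has 3 × (5 + 10 + 15 + 20) = 150 rows in long format, one row per drawn value.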
plot_sample_hist
from samplingsimulatorpy import plot_sample_hist
plot_sample_hist(pop, samples)
Arguments:
- pop: the virtual population as a data frame
- samples: the samples as a data frame
Example:
plot_sample_hist(pop, samples)
plot_sampling_hist
from samplingsimulatorpy import plot_sampling_hist
plot_sampling_hist(samples)
Arguments:
- samples: the samples as a data frame
Example:
plot_sampling_hist(samples)
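A sampling histogram bins one statistic per sample rather than the raw values. The sketch below computes those per-sample means with pandas and stops short of plotting. The layout of the `samples` frame (columns "size", "replicate", "value") is an assumption made for this example and may not match the frame the package produces.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Hypothetical samples frame: 30 replicates for each of two sample sizes.
rows = [
    {"size": size, "replicate": rep, "value": value}
    for size in (5, 50)
    for rep in range(30)
    for value in rng.normal(0, 1, size)
]
samples = pd.DataFrame(rows)

# One mean per (size, replicate) pair: the values a sampling histogram would bin.
sample_means = (
    samples.groupby(["size", "replicate"])["value"].mean().reset_index(name="mean")
)

# Spread of the sample means for each sample size.
spread = sample_means.groupby("size")["mean"].std()
print(spread)
```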
stat_summary
from samplingsimulatorpy import stat_summary
stat_summary(pop, samples, parameter)
Arguments
- pop: the virtual population
- samples: the drawn samples
- parameter: the parameter(s) of interest
Example
stat_summary(pop, samples, ['np.mean', 'np.std'])
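To show the kind of comparison such a summary can make, here is a small sketch that applies the same statistics to a population and to a sample drawn from it. It is not the package's implementation: `stat_summary_sketch` is hypothetical, it takes the statistics as callables rather than the strings shown in the example above, and the output layout is an assumption.

```python
import numpy as np
import pandas as pd


def stat_summary_sketch(pop, samples, parameter):
    """Hypothetical stand-in: apply each function in `parameter` to both data sets."""
    pop_values = pop["value"].to_numpy()
    sample_values = samples["value"].to_numpy()
    columns = {
        "population": [func(pop_values) for func in parameter],
        "samples": [func(sample_values) for func in parameter],
    }
    # Rows are labelled with each function's name, e.g. "mean" and "std".
    return pd.DataFrame(columns, index=[func.__name__ for func in parameter])


rng = np.random.default_rng(3)
pop = pd.DataFrame({"value": rng.normal(0, 1, 1000)})
drawn = rng.choice(pop["value"].to_numpy(), size=50, replace=False)
samples = pd.DataFrame({"value": drawn})

summary = stat_summary_sketch(pop, samples, [np.mean, np.std])
print(summary)
```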
Example Usage Scenario
import numpy as np
from samplingsimulatorpy import (
    generate_virtual_pop,
    draw_samples,
    plot_sample_hist,
    plot_sampling_hist,
    stat_summary,
)
# create virtual population
pop = generate_virtual_pop(100, np.random.normal, 0, 1)
# take samples
samples = draw_samples(pop, 3, [10, 20])
# plot sample histogram
plot_sample_hist(pop, samples)
# plot sampling distribution
plot_sampling_hist(samples)
# compare mean and standard deviation
stat_summary(pop, samples, ['np.mean', 'np.std'])
Documentation
The official documentation is hosted on Read the Docs: https://samplingsimulatorpy.readthedocs.io/en/latest/
Credits
This package was created with Cookiecutter and the UBC-MDS/cookiecutter-ubc-mds project template, modified from the pyOpenSci/cookiecutter-pyopensci project template and the audreyr/cookiecutter-pypackage.
File details
Details for the file samplingsimulatorpy-0.1.0.tar.gz.
File metadata
- Download URL: samplingsimulatorpy-0.1.0.tar.gz
- Upload date:
- Size: 8.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0.post20191030 requests-toolbelt/0.8.0 tqdm/4.36.1 CPython/3.7.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b4099c717507edabf8cd96b1bc3f5f929548488e1e82003fb3eefe7b80db86f1 |
| MD5 | c50393420e16cedf90fccf9b85580f0d |
| BLAKE2b-256 | c737b2b5865760c3a254f21f587122f591976c37344d981c1da95b013626e346 |
File details
Details for the file samplingsimulatorpy-0.1.0-py3-none-any.whl.
File metadata
- Download URL: samplingsimulatorpy-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0.post20191030 requests-toolbelt/0.8.0 tqdm/4.36.1 CPython/3.7.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c176593ea7ac6d8e112bc8de15284ceb3dfeefd52cd1b0945b12ec59af3f6f64 |
| MD5 | a137447e19da687a927fca0c3f21f66b |
| BLAKE2b-256 | 227c20052b9bbc794a0d7013cb6f669a9a4bafea884866ea5f3fb0c08f1e1540 |