Dolvin's Math and Stats Library

These details have not been verified by PyPI

Project description

Dolvins

This project provides a set of functions and classes for optimization, probability, and statistical analysis, with a focus on handling multi-dimensional data, hyperplanes, and distribution analysis.

Installation
Usage
Examples
License

Installation

Dolvins is built on the following packages:

psutil
numpy
pandas
tqdm
scipy

To install Dolvins automatically with all its dependencies, please run:


pip install dolvins

Usage

General Math Functions

`next_power_of_two(x: int) -> int`

Returns the next power of two greater than or equal to x.

Arguments:

x (int): The input number.

Returns:

int: The next power of two.

Example:


x = 5

next_power = next_power_of_two(x)

print(next_power)



>> 8

`round_down_to_nearest_power_of_two(x: int) -> int`

Rounds x down to the largest power of two less than or equal to x.

Arguments:

x (int): The input number.

Returns:

int: The largest power of two that does not exceed x.

Example:


x = 10

nearest_power = round_down_to_nearest_power_of_two(x)

print(nearest_power)



>> 8
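The two power-of-two helpers above can be sketched with integer bit operations. This is a minimal illustration and not necessarily how Dolvins implements them; the `_sketch` names are hypothetical, and rejecting x < 1 is an assumption the source does not specify.

```python
def next_power_of_two_sketch(x: int) -> int:
    """Smallest power of two >= x (hypothetical stand-in, assumes x >= 1)."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    # (x - 1).bit_length() is the number of bits needed to write x - 1,
    # so shifting 1 by that amount gives the first power of two >= x.
    return 1 << (x - 1).bit_length()


def round_down_to_power_of_two_sketch(x: int) -> int:
    """Largest power of two <= x (hypothetical stand-in, assumes x >= 1)."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    # The highest set bit of x is the largest power of two not exceeding x.
    return 1 << (x.bit_length() - 1)


print(next_power_of_two_sketch(5))            # 8
print(round_down_to_power_of_two_sketch(10))  # 8
print(next_power_of_two_sketch(8))            # 8 (already a power of two)
```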

`gcd_of_list(numbers: list) -> int`

Returns the greatest common divisor (GCD) of a list of integers.

Arguments:

numbers (list): A list of integers.

Returns:

int: The GCD of the list.

Example:


numbers = [12, 15, 21]

gcd_result = gcd_of_list(numbers)

print(gcd_result)



>> 3
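One way to reproduce this result with the standard library is to fold `math.gcd` over the list. The function name below is hypothetical, and returning 0 for an empty list is an assumption rather than documented Dolvins behaviour.

```python
from functools import reduce
from math import gcd


def gcd_of_list_sketch(numbers: list) -> int:
    """GCD of all integers in the list (hypothetical stand-in)."""
    # reduce applies gcd pairwise: gcd(gcd(a, b), c), and so on.
    # Starting from 0 works because gcd(0, n) == n.
    return reduce(gcd, numbers, 0)


print(gcd_of_list_sketch([12, 15, 21]))  # 3
```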

Mathematical Objects

`Hyperplane`

A class representing a hyperplane.

Methods:

`__init__(self, normal: np.array, coef: float)`

Initializes a Hyperplane object with a normal vector and coefficient.

Arguments:
- normal (np.array): The normal vector to the hyperplane.
- coef (float): The coefficient of the hyperplane.

`project_point(self, *point: float) -> np.array`

Projects a point onto the hyperplane.

Arguments:
- point (float): The coordinates of the point to project, passed as separate positional arguments.

Returns:
- np.array: The projected point.

Example:


normal = np.array([1, 1, 1])

coef = 3

hyperplane = Hyperplane(normal, coef)

projected_point = hyperplane.project_point(2, 4, 0)

print(projected_point)



>> np.array([1, 2, 0])
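The output above is consistent with the hyperplane being the set of points p with normal · p = coef, and with the point being rescaled along the ray from the origin until it lies on that hyperplane. Both readings are inferred from the example and are not confirmed by the source. A perpendicular projection would give a different point, as the sketch below shows. It uses plain lists instead of `np.array` to stay dependency-free, and all function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def scale_onto_hyperplane(normal, coef, point):
    """Rescale `point` along the ray from the origin so that normal . p == coef."""
    d = dot(normal, point)
    if d == 0:
        raise ValueError("the ray through this point never meets the hyperplane")
    factor = coef / d
    return [factor * x for x in point]


def orthogonal_projection(normal, coef, point):
    """Move `point` along the normal direction until normal . p == coef."""
    step = (dot(normal, point) - coef) / dot(normal, normal)
    return [x - step * n for x, n in zip(point, normal)]


normal, coef = [1, 1, 1], 3
print(scale_onto_hyperplane(normal, coef, [2, 4, 0]))  # [1.0, 2.0, 0.0]
print(orthogonal_projection(normal, coef, [2, 4, 0]))  # [1.0, 3.0, -1.0]
```

Both results satisfy x + y + z = 3; only the first matches the documented output.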

Probability and Random Variables Functions

`sterlings_approximation(n: int) -> float`

Returns an approximation of n! using Stirling's approximation.

Arguments:

n (int): The input number.

Returns:

float: The approximate factorial of n.

Example:


n = 10

approx_factorial = sterlings_approximation(n)

print(approx_factorial)



>> 3598695.6187410373
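The value above agrees with the usual leading term of Stirling's formula, n! ≈ sqrt(2πn) · (n/e)^n. The sketch below evaluates that formula and compares it with the exact factorial; the function name is hypothetical, and whether Dolvins adds correction terms is not stated in the source.

```python
import math


def stirling_sketch(n: int) -> float:
    """Leading term of Stirling's formula: sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


approx = stirling_sketch(10)
exact = math.factorial(10)
print(approx)                    # close to 3598695.62
print(exact)                     # 3628800
print((exact - approx) / exact)  # relative error of roughly 0.8 %
```

The relative error shrinks roughly like 1/(12n), so the approximation improves as n grows.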

`permutate(n: int, r: int) -> int`

Calculates the number of permutations of n objects taken r at a time, i.e. n! / (n - r)!. Stirling's approximation is used if n is too large.

Arguments:

n (int): Number of objects.
r (int): Number of objects chosen, where order matters.

Returns:

int: The number of permutations of n objects taken r at a time.

Example:


n = 5

r = 3

perm_result = permutate(n, r)

print(perm_result)



>> 60
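For small inputs the count can be computed exactly as the falling product n · (n - 1) · ... · (n - r + 1), which avoids building full factorials. This is an illustrative sketch with a hypothetical name; it does not reproduce the large-n fallback mentioned above, and returning 0 when r is out of range is an assumption.

```python
import math


def permutations_sketch(n: int, r: int) -> int:
    """n! / (n - r)! computed as n * (n - 1) * ... * (n - r + 1)."""
    if r < 0 or r > n:
        return 0
    result = 1
    for k in range(n, n - r, -1):
        result *= k
    return result


print(permutations_sketch(5, 3))  # 60 (5 * 4 * 3)
print(math.perm(5, 3))            # 60, standard-library equivalent (Python 3.8+)
```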

`combinate(n: int, r: int) -> int`

Calculates the number of combinations of n objects taken r at a time, where order does not matter, i.e. n! / (r! (n - r)!).

Arguments:

n (int): Number of objects.
r (int): Number of objects chosen.

Returns:

int: The number of combinations of n objects taken r at a time.

Example:


n = 5

r = 3

comb_result = combinate(n, r)

print(comb_result)



>> 10
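The binomial coefficient can be built up multiplicatively so that every intermediate value stays an integer. The sketch below is an illustration under that approach, with a hypothetical name, and is not taken from the Dolvins source.

```python
import math


def combinations_sketch(n: int, r: int) -> int:
    """C(n, r) computed without forming n! explicitly."""
    if r < 0 or r > n:
        return 0
    r = min(r, n - r)  # symmetry C(n, r) == C(n, n - r) keeps the loop short
    result = 1
    for k in range(1, r + 1):
        # After step k, result equals C(n - r + k, k), so the division is exact.
        result = result * (n - r + k) // k
    return result


print(combinations_sketch(5, 3))  # 10
print(math.comb(5, 3))            # 10, standard-library equivalent (Python 3.8+)
```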

`discrete_distribution_prob(exp: pd.Series, obs: pd.Series) -> float`

Calculates the exact probability of observing the observed distribution given the expected distribution. Note: the scale of exp does not matter. The sum of exp need not equal the sum of obs, because exp is converted to probabilities.

Arguments:

exp (pd.Series): The ground truth (expected) distribution.
obs (pd.Series): The observed distribution.

Returns:

float: The probability of observing the distribution.

Example:


exp = pd.Series([50, 50, 50])

obs = pd.Series([2, 1, 2])

prob = discrete_distribution_prob(exp, obs)

print(prob)



>> 0.1234
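The example value matches the multinomial probability 5! / (2! · 1! · 2!) · (1/3)^5 = 30/243 ≈ 0.1235. The sketch below computes that quantity directly, assuming the multinomial model is what the function evaluates. It uses plain lists in place of `pd.Series`, and the function name is hypothetical.

```python
import math


def multinomial_prob_sketch(exp, obs):
    """Probability of the counts `obs` under class probabilities exp / sum(exp)."""
    total = sum(exp)
    probs = [e / total for e in exp]  # the scale of exp drops out here
    # Multinomial coefficient n! / (o_1! * o_2! * ...)
    coef = math.factorial(sum(obs))
    for o in obs:
        coef //= math.factorial(o)
    p = float(coef)
    for prob, o in zip(probs, obs):
        p *= prob ** o
    return p


print(multinomial_prob_sketch([50, 50, 50], [2, 1, 2]))  # about 0.12346 (30/243)
print(multinomial_prob_sketch([1, 1, 1], [2, 1, 2]))     # same value, scale ignored
```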

`generate_combinations(num_classes: int, num_obs: int) -> set`

Returns the set of all tuples of num_classes non-negative integers that add up to num_obs.

Arguments:

num_classes (int): Number of classes (the length of each tuple).
num_obs (int): The total that the entries of each tuple must sum to.

Returns:

set: The set of all possible combinations.

Example:


num_classes = 2

num_obs = 4

combinations = generate_combinations(num_classes, num_obs)

print(combinations)



>> {(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)}
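A short recursive enumeration reproduces the set shown above: fix the first entry, then distribute the remainder over the remaining classes. This is a sketch with a hypothetical name and assumes num_classes >= 1; the Dolvins implementation may differ.

```python
def compositions_sketch(num_classes: int, num_obs: int) -> set:
    """All tuples of num_classes non-negative ints summing to num_obs."""
    if num_classes == 1:
        return {(num_obs,)}
    result = set()
    for first in range(num_obs + 1):
        # Fix the first entry, then split what is left over the other classes.
        for rest in compositions_sketch(num_classes - 1, num_obs - first):
            result.add((first,) + rest)
    return result


print(sorted(compositions_sketch(2, 4)))  # [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]
print(len(compositions_sketch(3, 5)))     # 21
```

The number of tuples is C(num_obs + num_classes - 1, num_classes - 1), which grows quickly with either argument.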

`generate_normal_exponent(mean: float, std_dev: float) -> Callable`

Generates a function representing the exponent of a normal distribution with the specified mean and standard deviation.

Arguments:

mean (float): Mean (mu) of the normal distribution.
std_dev (float): Standard deviation (sigma) of the normal distribution.

Returns:

Callable: A function representing the exponent.

Example:


mean = 0

std_dev = 1

normal_exp = generate_normal_exponent(mean, std_dev)

normal_exp is functionally equivalent to $-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2$, where $\mu$ = mean and $\sigma$ = std_dev.
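A closure is one natural way to return such a function. The sketch below is illustrative only, with a hypothetical name, and shows how the exponent combines with the usual normalising constant to give the normal density.

```python
import math


def normal_exponent_sketch(mean: float, std_dev: float):
    """Return a function computing -1/2 * ((x - mean) / std_dev)**2."""
    def exponent(x: float) -> float:
        return -0.5 * ((x - mean) / std_dev) ** 2
    return exponent


normal_exp = normal_exponent_sketch(0, 1)
print(normal_exp(2))  # -2.0
# Standard normal density at x = 1: exp(exponent) / (sigma * sqrt(2 * pi))
print(math.exp(normal_exp(1)) / math.sqrt(2 * math.pi))  # about 0.24197
```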

`generate_joint_pdf(exp: pd.Series, num_obs: int) -> Callable`

Generates a joint probability density function (PDF) for all possible outcomes based on the expected distribution and the total number of observations.

Arguments:

exp (pd.Series): The ground truth (expected) distribution.
num_obs (int): The number of observations.

Returns:

Callable: The joint PDF function.

Explanation:

Approximates each class's distribution with a normal PDF.
Multiplies the per-class approximations together to get a joint PDF.

Example:


exp = pd.Series([4, 6])

num_obs = 100

joint_pdf = generate_joint_pdf(exp, num_obs)

joint_pdf is functionally equivalent to $\frac{1}{\sqrt{2\pi\cdot 40\cdot\frac{6}{10}}\,\sqrt{2\pi\cdot 60\cdot\frac{4}{10}}} \cdot e^{-\frac{1}{2}\left(\frac{x - 40}{\sqrt{40\cdot\frac{6}{10}}}\right)^2 - \frac{1}{2}\left(\frac{y - 60}{\sqrt{60\cdot\frac{4}{10}}}\right)^2}$
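In the formula above each class has mean num_obs · p and variance num_obs · p · (1 - p), where p is that class's share of exp (here 40 and 60 for the means, and 24 for both variances). The sketch below builds the product of normal densities under that reading. It uses plain lists instead of `pd.Series`, the function name is hypothetical, and it does not handle classes with p equal to 0 or 1, where the variance is zero.

```python
import math


def joint_pdf_sketch(exp, num_obs):
    """Product of per-class normal densities (illustrative, not the library code)."""
    total = sum(exp)
    probs = [e / total for e in exp]
    means = [num_obs * p for p in probs]
    variances = [num_obs * p * (1 - p) for p in probs]  # zero if p is 0 or 1

    def pdf(*counts):
        value = 1.0
        for x, mu, var in zip(counts, means, variances):
            value *= math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)
        return value

    return pdf


joint_pdf = joint_pdf_sketch([4, 6], 100)
print(joint_pdf(40, 60))  # peak value 1 / (2 * pi * 24), about 0.00663
print(joint_pdf(45, 55))  # smaller, further from the expected counts
```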

Calculus Functions

`hyperplane_integration(f: Callable, hyperplane: list, max_val: float = None, chunk_size: int = "auto", num_samples: int = "auto", random_state: int = 42, pbar: Callable = None) -> float`

Integrates a function over an N-dimensional hyperplane using quasi-Monte Carlo integration (Sobol sampling). Currently only supports integration in the positive orthant (the region where every coordinate is non-negative).

Arguments:

f (Callable): The function to integrate.
hyperplane (object): The hyperplane over which to integrate.
max_val (float): The maximum function value to include in the integration (defaults to None); any region where the function exceeds this value is not counted.
chunk_size (int): The number of samples to handle at one time (defaults to "auto").
num_samples (int): The total number of samples to draw (defaults to "auto").
random_state (int): Random state to use to ensure the integration is deterministic.
pbar (tqdm): Progress bar to update with every chunk completed (defaults to None).

Returns:

float: The result of integration.

Example:


f = lambda x, y, z: x + y + z

hyperplane = Hyperplane(normal=np.array([1, 1, 1]), coef=3)

result = hyperplane_integration(f, hyperplane)

print(result)



>> 13.5
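The value 13.5 is what one obtains by integrating f over x, y >= 0 with z = 3 - x - y >= 0, measured in the (x, y) coordinates: f equals 3 everywhere on the plane and the triangle x + y <= 3 has area 4.5. Weighting by true surface area would instead give 3 · 9√3/2 ≈ 23.4. This reading is inferred from the example and is not confirmed by the source. The sketch below estimates the same integral for the 3-D case only, using a Halton sequence in place of Sobol sampling to stay dependency-free; all names are hypothetical.

```python
def van_der_corput(index: int, base: int) -> float:
    """Radical inverse of `index` in the given base, a value in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def integrate_on_plane_sketch(f, coef: float, num_samples: int = 20000) -> float:
    """Quasi-Monte Carlo estimate of the integral of f(x, y, coef - x - y)
    over x >= 0, y >= 0, x + y <= coef (hypothetical helper, 3-D only)."""
    total = 0.0
    for i in range(1, num_samples + 1):
        # Halton point in the square [0, coef] x [0, coef]
        x = coef * van_der_corput(i, 2)
        y = coef * van_der_corput(i, 3)
        z = coef - x - y
        if z >= 0:  # keep only points whose z coordinate is non-negative
            total += f(x, y, z)
    # Mean over the square times the area of the square
    return total / num_samples * coef * coef


f = lambda x, y, z: x + y + z
print(integrate_on_plane_sketch(f, 3))  # close to 13.5
```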

Distribution Analysis Functions

`E(exp: pd.Series, obs: pd.Series, approximate: bool, chunk_size: int = "auto", num_samples: int = "auto", random_state: int = None) -> float`

Performs an E-test on an expected distribution and observed distribution.

Arguments:

exp (pd.Series): The expected (ground-truth) distribution.
obs (pd.Series): The observed distribution.
approximate (bool): If False, the exact discrete probability is calculated; if True, an approximation based on continuous probability is calculated.
chunk_size (int): The number of samples to process at once (defaults to "auto").
num_samples (int): The total number of samples to calculate; lower is faster but less precise.
random_state (int): If specified, leads to deterministic results.

Returns:

float: The E-value.

Explanation:

The E-test seeks to generate a more interpretable and accurate probability value (p-value) for testing the statistical difference between two distributions.
The E-test assumes the expected and observed distributions are identical and, under that assumption, calculates an E-value: the probability of receiving a distribution as extreme as, or more extreme than, the one observed.
Thus, the lower the E-value (i.e., the lower the chance of receiving a distribution that extreme if the distributions were in fact identical), the stronger the indication that the distributions are different.
The exact E-value can be calculated using discrete probability; however, a continuous probability estimate must be used when there are many observations.
Note: time complexity in either case is exponential, so while the continuous approach can handle larger numbers of observations, it may still take a significant amount of time for massive samples without some method of scaling them down (to be researched).

Example:


exp = pd.Series([50, 50, 50])

obs = pd.Series([300, 300, 300])

e_value = E(exp, obs, approximate=True)

print(e_value)



>> 1.0





exp = pd.Series([50, 0, 0])

obs = pd.Series([100, 0, 0])

e_value = E(exp, obs, approximate=True)

print(e_value)



>> 0





exp = pd.Series([15, 15, 15])

obs = pd.Series([155, 145, 150])

e_value = E(exp, obs, approximate=True)

print(e_value)



>> 0.77743
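For small samples, the exact (discrete) calculation described in the explanation can be sketched by enumerating every possible outcome and summing the multinomial probabilities of those that are no more likely than the observed one. Treating "as extreme or more extreme" as "probability less than or equal to that of obs" is an assumption made here; the source does not define it. The sketch uses plain lists instead of `pd.Series`, and all names are hypothetical.

```python
import math


def multinomial_prob(probs, counts):
    """Multinomial probability of `counts` under class probabilities `probs`."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    p = float(coef)
    for prob, c in zip(probs, counts):
        p *= prob ** c
    return p


def outcomes(num_classes, num_obs):
    """Yield every tuple of num_classes non-negative ints summing to num_obs."""
    if num_classes == 1:
        yield (num_obs,)
        return
    for first in range(num_obs + 1):
        for rest in outcomes(num_classes - 1, num_obs - first):
            yield (first,) + rest


def exact_e_value_sketch(exp, obs):
    """Sum the probability of every outcome that is no more likely than obs."""
    total = sum(exp)
    probs = [e / total for e in exp]
    p_obs = multinomial_prob(probs, obs)
    e_value = 0.0
    for outcome in outcomes(len(exp), sum(obs)):
        p = multinomial_prob(probs, outcome)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for floating-point ties
            e_value += p
    return min(e_value, 1.0)


print(exact_e_value_sketch([50, 50, 50], [2, 2, 2]))  # approximately 1.0
print(exact_e_value_sketch([50, 50, 50], [6, 0, 0]))  # about 0.0041 (3/729)
```

The loop visits C(num_obs + num_classes - 1, num_classes - 1) outcomes, which is why an exact calculation becomes impractical for large samples.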

License

This project is licensed under the MIT License.


Project details


Development Status
- 4 - Beta
Intended Audience
- Science/Research
Operating System
Programming Language
- Python :: 3

Release history

- 0.0.6 (yanked): Aug 8, 2024
- 0.0.5 (this version): Aug 6, 2024
- 0.0.4: Aug 6, 2024
- 0.0.3: Aug 5, 2024

Download files


Source Distribution

dolvins-0.0.5.tar.gz (12.7 kB)

Uploaded Aug 6, 2024 Source

Built Distribution

dolvins-0.0.5-py3-none-any.whl (10.0 kB)

Uploaded Aug 6, 2024 Python 3

File details

Details for the file dolvins-0.0.5.tar.gz.

File metadata

Download URL: dolvins-0.0.5.tar.gz
Upload date: Aug 6, 2024
Size: 12.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for dolvins-0.0.5.tar.gz
Algorithm	Hash digest
SHA256	`181b55814fc89cd09737549802f2b7375e852360dda11d15b9b850e3975f1214`
MD5	`a1c0b708166c2f19bc87a3c3fa957dd5`
BLAKE2b-256	`ec40a1487b354e93c59759568416907b5938d24ff8703f190fdce0ec753e0e8a`


File details

Details for the file dolvins-0.0.5-py3-none-any.whl.

File metadata

Download URL: dolvins-0.0.5-py3-none-any.whl
Upload date: Aug 6, 2024
Size: 10.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for dolvins-0.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`56b797545052c72a587bd0ed8bdadd8c4891cf2bac20bddccfa23cf5702fce2b`
MD5	`ecbd90746bddb4184695185585bc6161`
BLAKE2b-256	`914968cf40c094c0af27a3143ecf2ee5faab27ab9925fb7f36c3d81441b45528`

