My Bayes algorithm, named after Thomas Bayes.

## Project description

# Bayes Classifier

## Principle

### Naive Bayes

$$p(c|x)=\frac{p(x|c)p(c)}{p(x)}\sim p(x|c)p(c)\\ \sim \prod_ip(x_i|c)p(c) = \prod_ip(x_i,c)p(c)^{1-n}~~~~~~~~~\text{(Naive condition)}\\ \sim\prod_i\frac{N(x_i,c)}{N}p(c)^{1-n}$$
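The count-based estimate above can be sketched as follows for discrete features. This is only an illustration, not the package's implementation: the function name `naive_bayes_scores` and the toy data are assumptions, and no smoothing is applied, so an unseen pair $(x_i, c)$ gives a score of zero.

```python
from collections import Counter

def naive_bayes_scores(X, labels, x):
    """Unnormalized scores prod_i N(x_i, c)/N * p(c)^(1-n) for every class c.

    X      : list of feature tuples (discrete values), hypothetical training data
    labels : list of class labels, one per row of X
    x      : feature tuple to score
    """
    N = len(labels)                      # total number of samples
    n = len(x)                           # number of features
    class_counts = Counter(labels)       # N(c)
    joint = Counter()                    # N(x_i, c), keyed by (i, value, c)
    for row, c in zip(X, labels):
        for i, v in enumerate(row):
            joint[(i, v, c)] += 1

    scores = {}
    for c, n_c in class_counts.items():
        score = (n_c / N) ** (1 - n)     # p(c)^(1-n)
        for i, v in enumerate(x):
            score *= joint[(i, v, c)] / N    # p(x_i, c) ~ N(x_i, c)/N
        scores[c] = score
    return scores

# Toy data (made up for illustration only).
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "hot")]
labels = ["no", "yes", "yes", "no"]
scores = naive_bayes_scores(X, labels, ("sunny", "mild"))
best = max(scores, key=scores.get)
```

Because the scores are only proportional to $p(c|x)$, they are meant to be compared across classes rather than read as probabilities.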

### Semi Naive Bayes

$$p(c|x,y)\sim p(x|c)p(c|y)\\ \sim \prod_ip(x_i|c)p(c|y) ~~~~~~~~~\text{(Semi-Naive condition)}$$

where $p(c|y)$ is estimated by, say, a neural network.

### Hemi Naive Bayes, in more general form

When $y$ is empty ($m=0$), the general form below is equivalent to the naive one.

$$p(c|x,y_1,\cdots, y_m) \sim \prod_ip(x_i|c)\prod_ip(c|y_i)p(c)^{1-m} ~~~~~~~~~\text{(Hemi-condition)}\\ \sim \prod_ip(x_i|c)\prod_if_c(y_i)p(c)^{1-m}\\ \sim \prod_ip(x_i,c)\prod_if_c(y_i)p(c)^{1-m-n}$$
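A minimal sketch of the Hemi score for a single class, computed in log space to avoid underflow when many factors are multiplied. The function name `hemi_log_score` and its arguments are hypothetical; the values $p(x_i|c)$ and $f_c(y_i)$ are assumed to be already estimated and strictly positive.

```python
import math

def hemi_log_score(px_given_c, f_c_of_y, p_c):
    """log of prod_i p(x_i|c) * prod_i f_c(y_i) * p(c)^(1-m) for one class c.

    px_given_c : list of p(x_i|c), length n
    f_c_of_y   : list of f_c(y_i) ~ p(c|y_i), length m (may be empty)
    p_c        : class prior p(c)
    All inputs must be > 0, otherwise math.log raises ValueError.
    """
    m = len(f_c_of_y)
    return (sum(math.log(p) for p in px_given_c)
            + sum(math.log(f) for f in f_c_of_y)
            + (1 - m) * math.log(p_c))

# m = 0: reduces to the naive score prod_i p(x_i|c) * p(c)
naive = hemi_log_score([0.5, 0.25], [], 0.4)
# m = 1: the semi-naive score prod_i p(x_i|c) * p(c|y)
semi = hemi_log_score([0.5, 0.25], [0.8], 0.4)
# m = 2: two external estimates, prior exponent 1 - m = -1
hemi = hemi_log_score([0.5, 0.25], [0.8, 0.6], 0.4)
```

With $m=1$ the prior exponent $1-m$ vanishes, which is why the semi-naive formula contains no explicit $p(c)$ factor.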

## Predict

$$\frac{p(c|x,y)}{p(c'|x,y)}= \prod_i(\frac{p(x_i|c)}{p(x_i|c')})\frac{p(c|y)}{p(c'|y)}\\ = \prod_i(\frac{p(x_i,c)}{p(x_i,c')})\frac{p(c|y)}{p(c'|y)}(\frac{p(c')}{p(c)})^n ~~~~~~~~~\text{(Semi-Naive condition)}\\ \sim \prod_i(\frac{N(x_i,c)}{N(x_i,c')})\frac{p(c|y)}{p(c'|y)}(\frac{N(c')}{N(c)})^n ~~~~~~~~~~~~~~~~~~~~~\text{(estimate)}$$

$$\frac{p(c|x,y_1,\cdots, y_m)}{p(c'|x,y_1,...,y_m)}\sim ... (\frac{N(c')}{N(c)})^{n+m-1}\prod_i\frac{p(c|y_i)}{p(c'|y_i)} ~~~~~~~~~(\text{Hemi-condition})$$

### 0-1 cases

$$r = \frac{p(1|x,y)}{p(0|x,y)}\sim \prod_i(\frac{N(x_i,1)}{N(x_i,0)})\frac{p(1|y)}{1-p(1|y)}(\frac{N(0)}{N(1)})^n ~~~~~~~~~\text{(Semi)}\\ r \sim \prod_i(\frac{N(x_i,1)}{N(x_i,0)})\prod_i\frac{p(1|y_i)}{1-p(1|y_i)}(\frac{N(0)}{N(1)})^{n+m-1} ~~~~~~~~~\text{(Hemi)}$$

$(x,y)$ is assigned to class 1 if and only if $r\geq 1$; otherwise it is assigned to class 0.
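The 0-1 decision rule can be sketched as below. The names `binary_ratio` and `classify` are hypothetical, the counts are made-up numbers, and every $N(x_i,0)$ and $1-p(1|y_i)$ is assumed to be nonzero.

```python
def binary_ratio(counts_1, counts_0, n_1, n_0, p1_given_y):
    """r ~ prod_i N(x_i,1)/N(x_i,0) * prod_i p(1|y_i)/(1-p(1|y_i)) * (N(0)/N(1))^(n+m-1).

    counts_1, counts_0 : N(x_i, 1) and N(x_i, 0) for each of the n features
    n_1, n_0           : class counts N(1) and N(0)
    p1_given_y         : list of m estimates p(1|y_i), e.g. from external models
    """
    n = len(counts_1)
    m = len(p1_given_y)
    r = (n_0 / n_1) ** (n + m - 1)
    for c1, c0 in zip(counts_1, counts_0):
        r *= c1 / c0
    for p in p1_given_y:
        r *= p / (1 - p)
    return r

def classify(r):
    """Class 1 iff r >= 1, else class 0."""
    return 1 if r >= 1 else 0

# Semi case (m = 1): two features, balanced classes, one external estimate.
r = binary_ratio(counts_1=[2, 2], counts_0=[1, 1], n_1=3, n_0=3, p1_given_y=[0.3])
label = classify(r)
```

Here the count ratios favor class 1 by a factor of 4, which outweighs the odds $0.3/0.7$ from the external estimate, so $r = 12/7 \geq 1$ and the sample lands in class 1.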

## Estimate (for continuous rv)

$p(x)\sim \frac{N(x)}{N}$, where $N(x)$ is the number of samples in a neighborhood of $x$ and $N$ is the total number of samples.
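A minimal sketch of this neighborhood estimate for a scalar variable. The source does not say how the neighborhood is chosen; the fixed `radius` below is an assumption, as is the function name `neighborhood_estimate`.

```python
def neighborhood_estimate(samples, x, radius):
    """p(x) ~ N(x)/N, with N(x) = number of samples within `radius` of x."""
    n_near = sum(1 for s in samples if abs(s - x) <= radius)
    return n_near / len(samples)

# Made-up samples for illustration.
samples = [0.1, 0.2, 0.25, 0.9, 1.5]
p = neighborhood_estimate(samples, x=0.2, radius=0.15)  # 3 of 5 samples are near 0.2
```

The result is a relative frequency rather than a normalized density. In the ratios used for prediction this should not matter as long as the same neighborhood is used for both classes, since the neighborhood size cancels.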

## Project details

Versions: 0.1.1 (this version), 0.1.0