Leave one out encoding of categorical features
leave-one-out-encoder
Leave-one-out encoding for categorical features.
See the source for this project here: https://github.com/welfare520/leave-one-out-encoder.
Getting Started
Installing
$ pip install loo_encoder
Example
Fit the encoder on X and y, then transform X.
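from loo_encoder.encoder import LeaveOneOutEncoder
import pandas as pd
import numpy as np

# Encode the 'gender' and 'country' columns.
enc = LeaveOneOutEncoder(
    cols=['gender', 'country'],
    handle_unknown='impute',
    sigma=0.02,
    random_state=42,
)

# Training features: two categorical columns plus a numeric 'clicks' column.
X = pd.DataFrame(
    {
        "gender": ["male", "male", "female", "male"],
        "country": ["Germany", "USA", "USA", "UK"],
        "clicks": [10, 33, 47, 21],
    }
)

# Target values, one per training row.
y = pd.Series([150, 250, 300, 100], name="orders")

# Fit and transform in one step, passing the click counts as sample weights.
df_train = enc.fit_transform(X=X, y=y, sample_weight=X['clicks'])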
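To make the idea behind the call above concrete, here is a minimal pure-Python sketch of plain leave-one-out target encoding: each training row is encoded with the mean target of the other rows that share its category. This is an illustration of the general technique, not the loo_encoder implementation. The function name `leave_one_out_encode` is hypothetical, the global-mean fallback for single-row categories is an assumption, and the sketch ignores `sample_weight`, `sigma` and `random_state`, whose exact effect in the package is not described here.

```python
from collections import defaultdict


def leave_one_out_encode(categories, targets):
    """Encode each row with the mean target of the OTHER rows in its category.

    Hypothetical helper for illustration only; not part of loo_encoder.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1

    global_mean = sum(targets) / len(targets)

    encoded = []
    for cat, target in zip(categories, targets):
        if counts[cat] > 1:
            # Remove the row's own target before averaging.
            encoded.append((sums[cat] - target) / (counts[cat] - 1))
        else:
            # Assumption: a category seen only once falls back to the global mean.
            encoded.append(global_mean)
    return encoded


# Same 'gender' column and 'orders' target as in the example above.
gender = ["male", "male", "female", "male"]
orders = [150, 250, 300, 100]
encoded_gender = leave_one_out_encode(gender, orders)
print(encoded_gender)
```

Leaving the row's own target out of the average keeps the encoded feature from directly leaking that row's label into its own input, which is the usual motivation for this encoding.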
Apply the fitted encoder to new categorical data.
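# Validation data: "unknown" (gender) and "Japan" (country) do not appear in the training set.
X_val = pd.DataFrame(
    {
        "gender": ["unknown", "male", "female", "male"],
        "country": ["Germany", "USA", "Germany", "Japan"],
    }
)

# No target is passed when transforming new data.
df_test = enc.transform(X=X_val)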
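For new data there is no target to leave out, so a common approach is to encode each row with the full per-category mean learned during fitting and to substitute a fallback value for categories that were never seen. The sketch below shows that approach in pure Python. It is not the loo_encoder implementation: the helper names `fit_category_means` and `encode_new` are hypothetical, and using the global training mean as the fallback is only an assumption about what `handle_unknown='impute'` might do.

```python
from collections import defaultdict


def fit_category_means(categories, targets):
    """Return ({category: mean target}, global mean) learned from training rows.

    Hypothetical helper for illustration only; not part of loo_encoder.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    return means, sum(targets) / len(targets)


def encode_new(categories, means, fallback):
    """Look up each category's training mean; unseen categories get the fallback."""
    return [means.get(cat, fallback) for cat in categories]


# Same 'country' column and 'orders' target as in the training example above.
train_country = ["Germany", "USA", "USA", "UK"]
orders = [150, 250, 300, 100]
means, global_mean = fit_category_means(train_country, orders)

# "Japan" was not seen during fitting, so it receives the assumed fallback.
val_country = ["Germany", "USA", "Germany", "Japan"]
encoded_country = encode_new(val_country, means, global_mean)
print(encoded_country)
```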