Label encoder backed by pandas
Pandas-powered LabelEncoder
Performance benchmark
Results from the project's test, compared with scikit-learn's LabelEncoder:
Total rows: 24,123,464
Scikit-learn's LabelEncoder - 13.35 seconds
Pandas-powered LabelEncoder - 2.44 seconds
Usage
Installation
pip install pandas-label-encoder
Initialization and fitting
import pandas_label_encoder as ec
from pandas_label_encoder import EncoderCategoryError
categories = ['Cat', 'Dog', 'Bird'] # can be pd.Series, np.array, list
# Fit at initialization
animal_encoder = ec.Encoder(categories)
# Fit later
animal_encoder = ec.Encoder()
animal_encoder.fit(categories)
animal_encoder.categories # ['Cat', 'Dog', 'Bird'], read-only
# Calling transform or inverse_transform before categories are assigned raises EncoderCategoryError
ec.Encoder().transform() # raises EncoderCategoryError
ec.Encoder().inverse_transform() # raises EncoderCategoryError
Transform
- Unknown categories are encoded as -1.
- To raise an error instead, pass one of the 2 validation options:
- validation='all': raises EncoderValidationError if any result is -1
- validation='any': raises EncoderValidationError if all results are -1
from pandas_label_encoder import EncoderValidationError
animal_encoder.transform(['Cat']) # [2]
animal_encoder.transform(['Fish']) # [-1]
animal_encoder.transform(['Fish'], validation='all') # Raise EncoderValidationError
animal_encoder.transform(['Fish'], validation='any') # Raise EncoderValidationError
try:
    animal_encoder.transform(['Fish', 'Cat'], validation='all') # Raise EncoderValidationError
except EncoderValidationError:
    print('There is an unknown animal.')
animal_encoder.transform(['Fish', 'Cat'], validation='any') # [-1, 2]
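The two validation modes can be read as "all results must be known" and "at least one result must be known". The sketch below restates that rule in plain Python. The helper `check_codes` and its `validation=None` default are hypothetical names used for illustration only; this is not the library's implementation.

```python
from typing import Optional, Sequence

UNKNOWN = -1  # sentinel used for unknown categories, as described above


def check_codes(codes: Sequence[int], validation: Optional[str] = None) -> bool:
    """Return True if `codes` would pass the given validation mode.

    Hypothetical helper that mirrors the documented rules.
    """
    unknown = [code == UNKNOWN for code in codes]
    if validation == "all":
        # Every value must be known, so a single -1 fails
        return not any(unknown)
    if validation == "any":
        # At least one value must be known, so only an all -1 result fails
        return not all(unknown)
    return True  # no validation requested


print(check_codes([-1, 2], "all"))  # False: one result is unknown
print(check_codes([-1, 2], "any"))  # True: one result is known
print(check_codes([-1], "any"))     # False: every result is unknown
```

In the real encoder a failed check raises EncoderValidationError rather than returning False.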
Inverse transform
- Unknown codes are decoded as NaN.
- To raise an error instead, pass one of the 2 validation options:
- validation='all': raises EncoderValidationError if any result is NaN
- validation='any': raises EncoderValidationError if all results are NaN
from pandas_label_encoder import EncoderValidationError
animal_encoder.inverse_transform([2]) # ['Cat']
animal_encoder.inverse_transform([9]) # [NaN]
animal_encoder.inverse_transform([9], validation='all') # Raise EncoderValidationError
animal_encoder.inverse_transform([9], validation='any') # Raise EncoderValidationError
try:
    animal_encoder.inverse_transform([9, 2], validation='all') # Raise EncoderValidationError
except EncoderValidationError:
    print('There is an unknown animal.')
animal_encoder.inverse_transform([9, 2], validation='any') # [NaN, 'Cat']
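For intuition about the -1 and NaN conventions, here is a minimal sketch of how plain pandas can produce both. It assumes `pd.Categorical` for encoding and `Series.reindex` for decoding; whether the library works this way internally is an assumption, and the functions `encode` and `decode` are hypothetical. Codes here follow list order and need not match the library's numbering.

```python
import pandas as pd

categories = ["Cat", "Dog", "Bird"]


def encode(values):
    # pd.Categorical gives each value the position of its category;
    # values missing from `categories` get the code -1
    return pd.Categorical(values, categories=categories).codes.tolist()


def decode(codes):
    # Reindexing a Series of categories by code yields NaN
    # for any code that has no matching position
    return pd.Series(categories).reindex(codes).tolist()


print(encode(["Dog", "Fish"]))  # [1, -1]
print(decode([1, 9]))           # second element is NaN
```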
Save and load the encoder
The load_encoder function and the encoder.Encoder.load method load an encoder and check its version.
An encoder saved with a different version may contain changes that cause errors.
To check the current encoder version, use encoder.Encoder.__version__.
from pandas_label_encoder import save_encoder, load_encoder
# Save or load an encoder through the encoder instance
animal_encoder.save(path) # save the current encoder
animal_encoder.load(path) # load another encoder and assign it to the current encoder
# Save or load an encoder with the module-level functions
animal_encoder = load_encoder(path)
save_encoder(path)
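The version check on load can be pictured with a small standard-library sketch. Everything in it is an assumption made for illustration: the pickle file format, the `ENCODER_VERSION` constant, the `save_sketch` and `load_sketch` helpers, and the use of ValueError are all hypothetical and do not describe the library's actual storage format or error types.

```python
import os
import pickle
import tempfile

ENCODER_VERSION = "1.0"  # hypothetical version tag


def save_sketch(categories, path):
    # Store the version next to the data so a loader can verify it later
    payload = {"version": ENCODER_VERSION, "categories": list(categories)}
    with open(path, "wb") as fh:
        pickle.dump(payload, fh)


def load_sketch(path):
    with open(path, "rb") as fh:
        payload = pickle.load(fh)
    # Refuse files written by a different encoder version
    if payload["version"] != ENCODER_VERSION:
        raise ValueError(
            f"encoder version mismatch: {payload['version']!r} != {ENCODER_VERSION!r}"
        )
    return payload["categories"]


path = os.path.join(tempfile.mkdtemp(), "animals.pkl")
save_sketch(["Cat", "Dog", "Bird"], path)
restored = load_sketch(path)
print(restored)  # ['Cat', 'Dog', 'Bird']
```

Failing fast on a version mismatch avoids silently decoding with an incompatible category layout.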