A Python package to optimally encode a list

Description

This is an easy-to-use library that encodes the categorical values of a feature into a compact set of binary features, with each distinct value mapped to a unique bitstring. Three distinct values, for example, fit into two binary features.

>>> import optiEncoder
>>> enc = optiEncoder.Encoder(["France","Canada","England"])
>>> enc.getMappings()
{'France': [0, 0], 'Canada': [0, 1], 'England': [1, 0]}
>>> enc.getEncodedList()
[[0, 0], [0, 1], [1, 0]]
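
The mapping above is consistent with giving each distinct value an index in first-seen order and writing that index in binary with just enough bits. The following is a minimal plain-Python sketch of that idea, not the library's code: `bits_needed`, `binary_mappings` and `encode_list` are hypothetical helper names, and optiEncoder's actual assignment order and internals may differ.

```python
def bits_needed(n_categories):
    """Smallest number of bits that yields n_categories distinct codes."""
    return max(1, (n_categories - 1).bit_length())


def binary_mappings(values):
    """Map each distinct value to a fixed-width list of bits.

    Assumption: codes are handed out in first-seen order; the real
    library may order them differently.
    """
    distinct = list(dict.fromkeys(values))  # de-duplicate, keep order
    width = bits_needed(len(distinct))
    return {
        value: [int(bit) for bit in format(index, "0{}b".format(width))]
        for index, value in enumerate(distinct)
    }


def encode_list(values):
    """Replace every value in the list with its bitstring."""
    mappings = binary_mappings(values)
    return [mappings[value] for value in values]


mappings = binary_mappings(["France", "Canada", "England"])
encoded = encode_list(["France", "Canada", "England", "Canada"])
print(mappings)
print(encoded)
```

Under this scheme the number of generated features grows with the logarithm of the number of categories, so 14 distinct values need 4 bits where a one-hot encoding would need 14 columns.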

Usage in Data Preprocessing

>>> import optiEncoder
>>> import pandas as pd
>>> d = pd.read_csv('data.csv').dropna()
>>> d
        Performance Measure  BRATS 2018  
0          Dice Coefficient       90  
1       Jaccard Coefficient       80  
2            Area under ROC       90  
4        Hausdorff Distance       10  
5               Sensitivity       90  
6               Specificity       90  
7                 F-Measure       90  
8                 Precision       80  
9   Vol Similarity Distance       90  
10                  Fallout        7  
12                       TP     1900  
13                       FP      200  
14                       TN     2500  
15                       FN      600  

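>>> # list('Performance Measure') splits the column name into its 19 characters,
>>> # so the codes that follow are per character. To encode the column's values,
>>> # pass list(d['Performance Measure']) instead.
>>> enc = optiEncoder.Encoder(list('Performance Measure'))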
>>> enc.getEncodedList()
[[1, 0, 0, 1], [1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 1], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [1, 0, 1, 0], [0, 0, 1, 1], [1, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
>>> d = d.iloc[:,1:]
>>> d
    BRATS 2018  
0        90  
1        80  
2        90  
4        10  
5        90  
6        90  
7        90  
8        80  
9        90  
10        7  
12     1900  
13      200  
14     2500  
15      600  

>>> encodedList = enc.getEncodedList()
>>> for i in range(0,len(encodedList[0])):
...     d[str(i)]=pd.DataFrame(encodedList).iloc[:,i]
...
>>> d
    BRATS 2018  0  1  2  3
0        90  1  0  0  1
1        80  1  1  0  0
2        90  1  0  0  0
4        10  0  1  0  0
5        90  1  0  0  0
6        90  0  1  1  0
7        90  1  0  1  0
8        80  0  0  1  1
9        90  1  0  1  1
10        7  1  1  0  0
12     1900  0  0  1  0
13      200  1  1  0  0
14     2500  1  0  1  0
15      600  0  1  0  1
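
Two details of the session above are worth spelling out. `list('Performance Measure')` yields the 19 characters of the column name rather than the 14 values in the column, and pandas aligns an assigned column on index labels, which have gaps at 3 and 11 after `dropna()`. The plain-Python sketch below encodes the column values themselves and attaches the codes by position. `encode_column` is a hypothetical stand-in for the library's encoder, so the specific bit patterns are illustrative only and are not output from optiEncoder.

```python
# Column values copied from the table above.
measures = [
    "Dice Coefficient", "Jaccard Coefficient", "Area under ROC",
    "Hausdorff Distance", "Sensitivity", "Specificity", "F-Measure",
    "Precision", "Vol Similarity Distance", "Fallout",
    "TP", "FP", "TN", "FN",
]
scores = [90, 80, 90, 10, 90, 90, 90, 80, 90, 7, 1900, 200, 2500, 600]


def encode_column(values):
    """Hypothetical stand-in encoder: first-seen index written in binary."""
    distinct = list(dict.fromkeys(values))
    width = max(1, (len(distinct) - 1).bit_length())
    codes = {
        value: [int(bit) for bit in format(index, "0{}b".format(width))]
        for index, value in enumerate(distinct)
    }
    return [codes[value] for value in values]


encoded = encode_column(measures)

# Attach the bits to each row by position, not by index label.
header = ["BRATS 2018"] + [str(i) for i in range(len(encoded[0]))]
rows = [[score] + bits for score, bits in zip(scores, encoded)]

print(header)
for row in rows:
    print(row)
```

With pandas, one way to get the same positional behaviour is to call `d.reset_index(drop=True)` before assigning the new columns, so that the frame and the encoded list share the labels 0 to 13.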

License

MIT

Author

Sahil Ahuja

Source Distribution

python-optiEncoder-2.0.1.tar.gz (2.9 kB)
