A Python package to optimally encode a list

Description

This is an easy-to-use library that encodes the categorical values of a feature into an optimized (smaller) set of binary features, with each distinct categorical value mapped to a unique bitstring.

>>> import optiEncoder
>>> enc = optiEncoder.Encoder(["France","Canada","England"])
>>> print("Mappings : ", enc.getMappings())
Mappings :  {'France': [0, 0], 'Canada': [0, 1], 'England': [1, 0]}
>>> print("Encoded List : ", enc.getEncodedList())
Encoded List :  [[0, 0], [0, 1], [1, 0]]
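
To make the idea of a compact bitstring encoding concrete, here is a minimal sketch that gives each distinct value a fixed-width binary code, numbered in order of first appearance. It is an illustration only: `encode_compact` is a hypothetical helper, not part of optiEncoder, and the library's actual bit assignment may differ.

```python
def encode_compact(values):
    """Map each distinct value to a fixed-width list of bits (hypothetical helper)."""
    # Keep the distinct values in order of first appearance.
    distinct = list(dict.fromkeys(values))
    # Smallest width that gives every distinct value its own code (at least 1 bit).
    width = max(1, (len(distinct) - 1).bit_length())
    mappings = {
        value: [int(bit) for bit in format(index, f"0{width}b")]
        for index, value in enumerate(distinct)
    }
    # Encode the original list, repeating codes for repeated values.
    encoded = [mappings[value] for value in values]
    return mappings, encoded


mappings, encoded = encode_compact(["France", "Canada", "England"])
print(mappings)  # {'France': [0, 0], 'Canada': [0, 1], 'England': [1, 0]}
print(encoded)   # [[0, 0], [0, 1], [1, 0]]
```

With this scheme, n distinct values need about log2(n) binary columns, whereas one-hot encoding needs n columns: 14 categories fit in 4 columns instead of 14.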

Usage in Data Preprocessing

>>> import optiEncoder
>>> import pandas as pd
>>> d = pd.read_csv('data.csv').dropna()
>>> d
        Performance Measure  BRATS 2018  
0          Dice Coefficient       90  
1       Jaccard Coefficient       80  
2            Area under ROC       90  
4        Hausdorff Distance       10  
5               Sensitivity       90  
6               Specificity       90  
7                 F-Measure       90  
8                 Precision       80  
9   Vol Similarity Distance       90  
10                  Fallout        7  
12                       TP     1900  
13                       FP      200  
14                       TN     2500  
15                       FN      600  

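>>> # Note: list('Performance Measure') splits the column name into its 19 characters,
>>> # so the output below has one code per character. To encode the column's values,
>>> # pass list(d['Performance Measure']) instead.
>>> enc = optiEncoder.Encoder(list('Performance Measure'))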
>>> enc.getEncodedList()
[[1, 0, 0, 1], [1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 1], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [1, 0, 1, 0], [0, 0, 1, 1], [1, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
>>> d = d.iloc[:,1:]
>>> d
    BRATS 2018  
0        90  
1        80  
2        90  
4        10  
5        90  
6        90  
7        90  
8        80  
9        90  
10        7  
12     1900  
13      200  
14     2500  
15      600  

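>>> encodedList = enc.getEncodedList()
>>> # Column assignment aligns on index labels, so each row of d receives the
>>> # entry of encodedList whose position equals that row's index label.
>>> for i in range(len(encodedList[0])):
...     d[str(i)] = pd.DataFrame(encodedList).iloc[:, i]
...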
>>> d
    BRATS 2018  0  1  2  3
0        90  1  0  0  1
1        80  1  1  0  0
2        90  1  0  0  0
4        10  0  1  0  0
5        90  1  0  0  0
6        90  0  1  1  0
7        90  1  0  1  0
8        80  0  0  1  1
9        90  1  0  1  1
10        7  1  1  0  0
12     1900  0  0  1  0
13      200  1  1  0  0
14     2500  1  0  1  0
15      600  0  1  0  1
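
When the frame has gaps in its index, as it does after `dropna()`, one way to attach the encoded bits is to build the bit columns on the frame's own index so that every row lines up with its source row. The sketch below uses a small inline frame and a hypothetical stand-in, `encode_compact`, in place of optiEncoder so that it runs without the package; the `bit_` column prefix is likewise an arbitrary choice. With the library, the list passed to `optiEncoder.Encoder` would be `list(d['Performance Measure'])`.

```python
import pandas as pd


def encode_compact(values):
    """Hypothetical stand-in for an encoder: one fixed-width bit list per value."""
    distinct = list(dict.fromkeys(values))
    width = max(1, (len(distinct) - 1).bit_length())
    codes = {v: [int(b) for b in format(i, f"0{width}b")] for i, v in enumerate(distinct)}
    return [codes[v] for v in values]


# Small frame whose index has a gap, as left behind by dropna().
d = pd.DataFrame(
    {
        "Performance Measure": ["Dice Coefficient", "Jaccard Coefficient", "Sensitivity"],
        "BRATS 2018": [90, 80, 90],
    },
    index=[0, 1, 5],
)

# Encode the column's values (not the characters of the column name).
encoded = encode_compact(list(d["Performance Measure"]))

# Reuse d's index so each encoded row lines up with the row it came from.
bits = pd.DataFrame(encoded, index=d.index).add_prefix("bit_")
result = d.drop(columns="Performance Measure").join(bits)
print(result)
```

Because `bits` shares the index of `d`, the join cannot introduce missing values or shift codes onto the wrong rows, whichever rows were dropped earlier.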

License

MIT

Author

Sahil Ahuja

Source Distribution

python-optiEncoder-2.0.1.tar.gz (2.9 kB)

File hashes

Hashes for python-optiEncoder-2.0.1.tar.gz

Algorithm    Hash digest
SHA256       260ae84df80984b01843c18e6f4397c304fd171352b7c766f392308c125ce817
MD5          cfe3e25a8f991e827667d12447984088
BLAKE2b-256  2908bbb5fc574a7711c09262e49435cdaef9471e1286ee5946b5d1a0a3b096ed
