Sklearn transformers that work with Pandas dataframes

Project description

Installation:

$ pip install pdtransform

A small package with a few transformers for working with Pandas DataFrames inside an Sklearn pipeline. I found myself writing these quite frequently. Example usage:

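from sklearn.pipeline import Pipeline

from pdtransform import DFTransform, DFFeatureUnion

# Helpers such as _ordinal_to_nums and _one_hot_encode are not shown here; they are
# assumed to be user-defined functions that take a DataFrame and return a DataFrame.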

pipeline = Pipeline([
    ('ordinal_to_nums', DFTransform(_ordinal_to_nums, copy=True)),
    ('union', DFFeatureUnion([
        ('categorical', Pipeline([
            ('select', DFTransform(lambda X: X.select_dtypes(include=['object']))),
            ('fill_na', DFTransform(lambda X: X.fillna('NA'))),
            ('one_hot', DFTransform(_one_hot_encode)),
        ])),
        ('numerical', Pipeline([
            ('select', DFTransform(lambda X: X.select_dtypes(exclude=['object']))),
            ('fill_median', DFTransform(lambda X: X.fillna(X.median()))),
            ('add_features', DFTransform(_add_features, copy=True)),
            ('remove_skew', DFTransform(_remove_skew, copy=True)),
            ('find_outliers', DFTransform(_find_outliers, copy=True)),
            ('normalize', DFTransform(lambda X: X.div(X.max())))
        ])),
    ])),
])
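To make the idea behind the example concrete, here is a minimal sketch of how a function-wrapping transformer and a column-wise union could be written. The classes `FunctionDFTransformer` and `ColumnwiseDFUnion` are hypothetical stand-ins written for illustration; they are not the package's actual implementation of `DFTransform` and `DFFeatureUnion`, and `pd.get_dummies` is used in place of the `_one_hot_encode` helper, which is not shown above.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class FunctionDFTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for DFTransform: applies func to a DataFrame."""

    def __init__(self, func, copy=False):
        self.func = func
        self.copy = copy

    def fit(self, X, y=None):
        # Stateless: nothing is learned. The trailing-underscore attribute
        # lets scikit-learn's fitted checks recognise the estimator.
        self.is_fitted_ = True
        return self

    def transform(self, X):
        # Optionally copy so that func may mutate its argument safely.
        X_ = X.copy() if self.copy else X
        return self.func(X_)


class ColumnwiseDFUnion(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for DFFeatureUnion: joins outputs column-wise."""

    def __init__(self, transformer_list):
        self.transformer_list = transformer_list

    def fit(self, X, y=None):
        for _, transformer in self.transformer_list:
            transformer.fit(X, y)
        self.is_fitted_ = True
        return self

    def transform(self, X):
        # Each branch returns a DataFrame; concatenate them side by side.
        frames = [t.transform(X) for _, t in self.transformer_list]
        return pd.concat(frames, axis=1)


# Small illustrative frame with one object column and one numeric column.
df = pd.DataFrame({
    "color": pd.Series(["red", None, "blue"], dtype=object),
    "size": [1.0, None, 3.0],
})

pipeline = Pipeline([
    ("union", ColumnwiseDFUnion([
        ("categorical", Pipeline([
            ("select", FunctionDFTransformer(lambda X: X.select_dtypes(include=["object"]))),
            ("fill_na", FunctionDFTransformer(lambda X: X.fillna("NA"))),
            ("one_hot", FunctionDFTransformer(pd.get_dummies)),
        ])),
        ("numerical", Pipeline([
            ("select", FunctionDFTransformer(lambda X: X.select_dtypes(exclude=["object"]))),
            ("fill_median", FunctionDFTransformer(lambda X: X.fillna(X.median()))),
        ])),
    ])),
])

result = pipeline.fit_transform(df)
print(result)
```

Because every step takes and returns a DataFrame, column names and the index survive the whole pipeline, which is the main convenience over transformers that return plain NumPy arrays.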

For more information read this blog post.

Download files

Source Distribution

pdtransform-0.2.tar.gz (2.2 kB)

Uploaded Source

Built Distribution

pdtransform-0.2-py2.py3-none-any.whl (3.8 kB)

Uploaded Python 2 Python 3

File details

Details for the file pdtransform-0.2.tar.gz.

File metadata

  • Download URL: pdtransform-0.2.tar.gz
  • Upload date:
  • Size: 2.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for pdtransform-0.2.tar.gz

Algorithm    Hash digest
SHA256       1ae101f7bfdaead85269b1dec5fd796f054bec491910a6d0430504645c44c628
MD5          6a3e19899bf5ce22544a30dcbd322ac6
BLAKE2b-256  4e6ccf8b761f13811f6ba52ff41cd74730a1219c26c7b44130db5770c23cd5f5
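
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the function names and the local file path in the usage comment are assumptions made for illustration.

```python
import hashlib

# SHA256 digest listed above for pdtransform-0.2.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "1ae101f7bfdaead85269b1dec5fd796f054bec491910a6d0430504645c44c628"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with the expected one, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was saved to the current directory:
# matches_expected("pdtransform-0.2.tar.gz", EXPECTED_SDIST_SHA256)
```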

File details

Details for the file pdtransform-0.2-py2.py3-none-any.whl.

File hashes

Hashes for pdtransform-0.2-py2.py3-none-any.whl

Algorithm    Hash digest
SHA256       88359e9649cc7d530b209731b97c3120d2d303f3f122e1593e94bdbe5468dfdc
MD5          55b032f24af3eb172a8ab2bcc23093b0
BLAKE2b-256  763787df6568d24234b9d258d7a93bd549b01ef7572987dfa50a44c1c7466575
