Sklearn transformers that work with Pandas dataframes

sklearn-pdtransform
-------------------
A small module with a few transformers for using Pandas dataframes in a
scikit-learn ``Pipeline``. Example usage:

.. code:: python

    from sklearn.pipeline import Pipeline

    # DFTransform and DFFeatureUnion are provided by this package. The
    # underscore-prefixed helpers (_ordinal_to_nums, _one_hot_encode, ...)
    # are user-defined functions that are not shown here.
    pipeline = Pipeline([
        ('ordinal_to_nums', DFTransform(_ordinal_to_nums, copy=True)),
        ('union', DFFeatureUnion([
            ('categorical', Pipeline([
                ('select', DFTransform(lambda X: X.select_dtypes(include=['object']))),
                ('fill_na', DFTransform(lambda X: X.fillna('NA'))),
                ('one_hot', DFTransform(_one_hot_encode)),
            ])),
            ('numerical', Pipeline([
                ('select', DFTransform(lambda X: X.select_dtypes(exclude=['object']))),
                ('fill_median', DFTransform(lambda X: X.fillna(X.median()))),
                ('add_features', DFTransform(_add_features, copy=True)),
                ('remove_skew', DFTransform(_remove_skew, copy=True)),
                ('find_outliers', DFTransform(_find_outliers, copy=True)),
                ('normalize', DFTransform(lambda X: X.div(X.max())))
            ])),
        ])),
    ])

For more information, read `this blog post <http://signal-to-noise.xyz/why-you-should-use-scikit-learns-pipeline-object.html>`_.
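
To make the idea behind the example concrete, here is a minimal sketch of how transformers of this kind could be written. It is not this package's source code: ``FunctionDFTransformer`` and ``ConcatDFUnion`` are hypothetical stand-ins for ``DFTransform`` and ``DFFeatureUnion``, and the assumptions are that the former simply applies a function to a dataframe (optionally on a copy) and the latter joins the outputs of its branches column-wise. The columns are selected with ``"number"`` rather than ``"object"`` purely to keep the toy data simple.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class FunctionDFTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for DFTransform: applies func to a DataFrame."""

    def __init__(self, func, copy=False):
        self.func = func
        self.copy = copy

    def fit(self, X, y=None):
        # Stateless: nothing is learned; the flag only marks the object as fitted.
        self.is_fitted_ = True
        return self

    def transform(self, X):
        # Work on a copy when requested so func cannot mutate the caller's frame.
        X_ = X.copy() if self.copy else X
        return self.func(X_)


class ConcatDFUnion(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for DFFeatureUnion: joins outputs column-wise."""

    def __init__(self, transformer_list):
        self.transformer_list = transformer_list

    def fit(self, X, y=None):
        # Fit every branch on the same input frame.
        for _, transformer in self.transformer_list:
            transformer.fit(X, y)
        self.is_fitted_ = True
        return self

    def transform(self, X):
        # Each branch returns a DataFrame; concatenate them side by side.
        frames = [t.transform(X) for _, t in self.transformer_list]
        return pd.concat(frames, axis=1)


df = pd.DataFrame({"color": ["red", None, "blue"], "size": [1.0, None, 4.0]})

pipeline = Pipeline([
    ("union", ConcatDFUnion([
        ("categorical", Pipeline([
            ("select", FunctionDFTransformer(lambda X: X.select_dtypes(exclude=["number"]))),
            ("fill_na", FunctionDFTransformer(lambda X: X.fillna("NA"))),
        ])),
        ("numerical", Pipeline([
            ("select", FunctionDFTransformer(lambda X: X.select_dtypes(include=["number"]))),
            ("fill_median", FunctionDFTransformer(lambda X: X.fillna(X.median()))),
        ])),
    ])),
])

result = pipeline.fit_transform(df)
print(result)
```

Because every step receives and returns a dataframe, column names and dtypes stay available throughout the pipeline, which is what lets steps such as ``select_dtypes`` work at any depth.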