
Impute missing values in your data science projects.

Project description

🎯 What is Imputr?

Imputr is an open-source library that allows users to stably impute tabular data sets with ML-based and conventional techniques. It is designed to have an extremely simple, yet extensive API, making it possible for users of all levels and tasks to deploy the library in their workflows.

🚀 Getting started

Install Imputr with pip:

pip install imputr

AutoImputer

Here is the simplest usage of the AutoImputer, our recommended workflow for beginner and intermediate users. By default, it automatically imputes the missing values in all columns with a modern version of the missForest algorithm.

from imputr.autoimputer import AutoImputer
import pandas as pd

# Import dataset into Pandas DataFrame
df = pd.read_csv("example.csv")

# Initialize AutoImputer with data - set exec_now=False to delay imputation 
imputer = AutoImputer(data=df)

# Retrieve imputed dataset from AutoImputer object
imputed_df = imputer.get_result()

Here you can see an example of how the AutoImputer works internally.

To see what else can be done with the AutoImputer API to customise its behaviour, refer to our documentation.
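For intuition about the missForest-style approach mentioned above, here is a minimal sketch built on scikit-learn rather than on Imputr. It is not Imputr's implementation: it assumes numeric columns only, uses scikit-learn's `IterativeImputer` with a `RandomForestRegressor` as a stand-in for missForest, and the helper name `forest_impute` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer


def forest_impute(df: pd.DataFrame, max_iter: int = 5, seed: int = 0) -> pd.DataFrame:
    """Hypothetical helper: missForest-style imputation for numeric columns only."""
    imputer = IterativeImputer(
        # Each column with gaps is predicted from the other columns by a random forest
        estimator=RandomForestRegressor(n_estimators=50, random_state=seed),
        max_iter=max_iter,
        random_state=seed,
    )
    values = imputer.fit_transform(df)
    # fit_transform returns a NumPy array, so restore the column names and index
    return pd.DataFrame(values, columns=df.columns, index=df.index)


# Toy numeric data: y depends on x, then 10% of y is removed
rng = np.random.default_rng(0)
x = rng.normal(size=200)
toy = pd.DataFrame({"x": x, "y": 2 * x + rng.normal(scale=0.1, size=200)})
toy.loc[toy.sample(frac=0.1, random_state=0).index, "y"] = np.nan

filled = forest_impute(toy)
print(toy.isna().sum().sum(), "missing before;", filled.isna().sum().sum(), "missing after")
```

Observed values are left untouched and only the gaps are filled. Categorical columns would need encoding first, which this sketch does not attempt.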

📕 Documentation

Multiple links to documentation:

👨🏽‍💻 Contribution

Imputr is an ever-evolving open source library and can always use contributors who want to help build with the community.

See the Contribution Jumpstart page to get started with your first contribution!


Imputr is distributed under an Apache License Version 2.0. A complete version can be found here. All future contributions will continue to be distributed under this license.

