
Tools that make working with scikit-learn and pandas easier.


# Otto Group BI Data Science Toolbox

NOTE: THIS IS NOT YET RELEASE READY, PLEASE BE PATIENT.

This repository contains tools that make working with [scikit-learn](http://scikit-learn.org/) and [pandas](http://pandas.pydata.org/) easier.

[![Build Status](https://travis-ci.org/ottogroup/dstoolbox.svg?branch=master)](https://travis-ci.org/ottogroup/dstoolbox)

## What is this?

dstoolbox is not one big tool but rather a collection of small, re-usable tools. They are intended to work well with scikit-learn and pandas and to make the integration of those libraries easier.

The best way to get started is to have a look at the [notebooks folder](https://github.com/ottogroup/dstoolbox/tree/master/notebooks), especially at the [showcase notebook](https://github.com/ottogroup/dstoolbox/blob/master/notebooks/Showcase.ipynb).

The tools included here are used by us at Otto Group BI in our production services, and by individual team members for machine learning work such as participating in Kaggle competitions.

## Installation instructions

TODO.

## Development

  • Python 3 only.

  • Code should be re-usable and succinct.

  • Where applicable, it should be compatible with [scikit-learn](http://scikit-learn.org/), [pandas](http://pandas.pydata.org/), and [Palladium](https://github.com/ottogroup/palladium).

  • It should be documented and unit-tested using pytest (100% code coverage desired).

  • It should conform to the coding standards prescribed by pylint (where it makes sense).

  • There should be usage examples that cover the most common use cases (the best place would be an IPython/Jupyter notebook).

  • Keep dependencies down.
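As an illustration of the guidelines above, here is a minimal sketch of what a small, scikit-learn-compatible tool with a pytest-style test might look like. The `ColumnSelector` class and the test function are hypothetical names made up for this example; they are not taken from dstoolbox and do not describe its actual API.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class ColumnSelector(BaseEstimator, TransformerMixin):
    """Select a fixed set of columns from a pandas DataFrame.

    Hypothetical example class, not part of dstoolbox.
    """

    def __init__(self, columns):
        # scikit-learn convention: store constructor arguments unchanged
        # so that get_params/set_params and clone work as expected.
        self.columns = columns

    def fit(self, X, y=None):
        # Stateless transformer: there is nothing to learn from the data.
        return self

    def transform(self, X):
        # Return a DataFrame restricted to the requested columns.
        return X[list(self.columns)]


def test_column_selector_in_pipeline():
    # pytest collects functions whose names start with "test_".
    df = pd.DataFrame({
        "a": [1.0, 2.0, 3.0],
        "b": [0.0, 1.0, 0.0],
        "c": ["x", "y", "z"],  # non-numeric column that must be dropped
    })
    pipe = Pipeline([
        ("select", ColumnSelector(columns=["a", "b"])),
        ("scale", StandardScaler()),
    ])
    Xt = pipe.fit_transform(df)
    assert Xt.shape == (3, 2)
```

Keeping the transformer stateless and storing `columns` unmodified in `__init__` is what lets it work inside a `Pipeline` and be cloned during cross-validation or grid search.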


Source distribution: dstoolbox-0.5.1.tar.gz (298.4 kB)
