Compose

"Build better training examples in a fraction of the time."


Compose is a machine learning tool for automated prediction engineering. It allows you to structure prediction problems and generate labels for supervised learning. An end user defines an outcome of interest by writing a labeling function, then runs a search to automatically extract training examples from historical data. The resulting labels are then passed to Featuretools for automated feature engineering and subsequently to EvalML for automated machine learning. The workflow of an applied machine learning engineer then becomes: prediction engineering with Compose, feature engineering with Featuretools, and automated machine learning with EvalML.


By automating the early stage of the machine learning pipeline, our end user can easily define a task and solve it. See the documentation for more information.

Install

Compose is available on PyPI and Conda-forge for Python 3.6 or later.

pip

To install from PyPI, run the command:

pip install composeml

conda

To install from Conda-forge, run the command:

conda install -c conda-forge composeml
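
To confirm which version ended up in the environment, something like the following can help. It is a small sketch that uses only the standard library's importlib.metadata module, which requires Python 3.8 or later (newer than the minimum version listed above). The helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("composeml")
if version is None:
    print("composeml is not installed in this environment")
else:
    print(f"composeml {version}")
```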

Example

Will a customer spend more than 300 in the next hour of transactions?

In this example, we automatically generate new training examples from a historical dataset of transactions.

import composeml as cp

# Load the demo transactions and keep only the first seven columns.
df = cp.demos.load_transactions()
df = df[df.columns[:7]]
df.head()
transaction_id  session_id  transaction_time     product_id  amount  customer_id  device
298             1           2014-01-01 00:00:00  5           127.64  2            desktop
10              1           2014-01-01 00:09:45  5           57.39   2            desktop
495             1           2014-01-01 00:14:05  5           69.45   2            desktop
460             10          2014-01-01 02:33:50  5           123.19  2            tablet
302             10          2014-01-01 02:37:05  5           64.47   2            tablet

First, we represent the prediction problem with a labeling function and a label maker.

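def total_spent(ds):
    # ds is the slice of transactions that falls inside one window.
    return ds['amount'].sum()

label_maker = cp.LabelMaker(
    target_entity="customer_id",      # make labels for each customer
    time_index="transaction_time",    # column that orders the data in time
    labeling_function=total_spent,
    window_size="1h",                 # each slice covers one hour
)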
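
To make the roles of the labeling function and the window size concrete, here is a minimal plain-Python sketch of what happens to a single one-hour slice of data. It is not how Compose is implemented. The slice_window helper is hypothetical, and the rows are the customer 2 transactions from the preview above, retyped by hand (the full dataset may contain more rows for that hour).

```python
from datetime import datetime, timedelta

# Hand-copied (transaction_time, amount) pairs for customer 2; illustrative only.
transactions = [
    (datetime(2014, 1, 1, 0, 0, 0), 127.64),
    (datetime(2014, 1, 1, 0, 9, 45), 57.39),
    (datetime(2014, 1, 1, 0, 14, 5), 69.45),
    (datetime(2014, 1, 1, 2, 33, 50), 123.19),
    (datetime(2014, 1, 1, 2, 37, 5), 64.47),
]


def slice_window(rows, start, size):
    """Hypothetical helper: keep rows with start <= time < start + size."""
    end = start + size
    return [row for row in rows if start <= row[0] < end]


def total_spent(window):
    """Plain-Python counterpart of the labeling function: sum the amounts."""
    return sum(amount for _, amount in window)


window = slice_window(transactions, datetime(2014, 1, 1), timedelta(hours=1))
first_hour_total = total_spent(window)
print(len(window), round(first_hour_total, 2))
```

Only the three transactions before 01:00 fall inside the window, so the two tablet transactions after 02:30 do not contribute to this label.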

Then, we run a search to automatically generate the training examples.

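# Search the time-ordered data for training examples.
label_times = label_maker.search(
    df.sort_values('transaction_time'),
    num_examples_per_instance=2,  # at most two examples per customer
    minimum_data='2014-01-01',    # start the search at this time
    drop_empty=False,             # keep windows that contain no transactions
    verbose=False,                # do not show search progress
)

# Convert the totals into binary labels: did spending exceed 300?
label_times = label_times.threshold(300)
label_times.head()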
customer_id  time                 total_spent
1            2014-01-01 00:00:00  True
1            2014-01-01 01:00:00  True
2            2014-01-01 00:00:00  False
2            2014-01-01 01:00:00  False
3            2014-01-01 00:00:00  False
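
The search and threshold steps can be pictured with a simplified, standard-library-only sketch. This is an illustration under stated assumptions, not Compose's actual algorithm: the search and threshold functions below are hypothetical stand-ins, the toy rows are invented, windows advance in fixed one-hour steps from the start time, empty windows are kept (mirroring drop_empty=False), and "more than 300" is taken to mean strictly greater than 300.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def total_spent(amounts):
    # Labeling function: total amount inside one window.
    return sum(amounts)


def search(rows, labeling_function, window_size, minimum_data, num_examples_per_instance):
    # Group (time, amount) pairs by customer, in time order.
    by_customer = defaultdict(list)
    for customer_id, time, amount in sorted(rows, key=lambda row: row[1]):
        by_customer[customer_id].append((time, amount))

    labels = []
    for customer_id in sorted(by_customer):
        start = minimum_data
        for _ in range(num_examples_per_instance):
            end = start + window_size
            window = [a for t, a in by_customer[customer_id] if start <= t < end]
            # Empty windows are kept and labeled like any other window.
            labels.append((customer_id, start, labeling_function(window)))
            start = end
    return labels


def threshold(labels, value):
    # Assumption: a label is True only when it is strictly above the value.
    return [(customer_id, time, label > value) for customer_id, time, label in labels]


# Hypothetical toy data: (customer_id, transaction_time, amount).
rows = [
    (1, datetime(2014, 1, 1, 0, 5), 210.00),
    (1, datetime(2014, 1, 1, 0, 40), 120.50),
    (1, datetime(2014, 1, 1, 1, 10), 45.25),
    (2, datetime(2014, 1, 1, 0, 0), 127.64),
    (2, datetime(2014, 1, 1, 0, 9, 45), 57.39),
    (2, datetime(2014, 1, 1, 0, 14, 5), 69.45),
]

raw_labels = search(rows, total_spent, timedelta(hours=1), datetime(2014, 1, 1), 2)
binary_labels = threshold(raw_labels, 300)
for customer_id, time, label in binary_labels:
    print(customer_id, time, label)
```

In this toy data, customer 1 exceeds 300 only in the first hour, and customer 2's second window is empty, so its total is 0 and its label is False.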

We now have labels that are ready to use in Featuretools to generate features.

Support

The Innovation Labs open source community is happy to provide support to users of Compose. Project support can be found in three places depending on the type of question:

  1. For usage questions, use Stack Overflow with the composeml tag.
  2. For bugs, issues, or feature requests, start a GitHub issue.
  3. For discussion regarding development on the core library, use Slack.

Citing Compose

Compose is built upon a newly defined part of the machine learning process: prediction engineering. If you use Compose, please consider citing this paper: James Max Kanter, Owen Gillespie, and Kalyan Veeramachaneni. Label, Segment, Featurize: A Cross Domain Framework for Prediction Engineering. IEEE DSAA 2016.

BibTeX entry:

@inproceedings{kanter2016label,
  title={Label, segment, featurize: a cross domain framework for prediction engineering},
  author={Kanter, James Max and Gillespie, Owen and Veeramachaneni, Kalyan},
  booktitle={2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)},
  pages={430--439},
  year={2016},
  organization={IEEE}
}

Acknowledgements

The open source development has been supported in part by DARPA's Data-Driven Discovery of Models (D3M) program.

Innovation Labs

Compose has been developed and open sourced by Innovation Labs. We developed Compose to enable flexible definition of the machine learning task. To see the other open source projects we're working on, visit Innovation Labs.
