bigmler

A command-line tool for BigML.io, the public BigML API

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Natural Language
- English
Operating System
- OS Independent
Programming Language
Topic
- Software Development :: Libraries :: Python Modules

Project description

BigMLer - A command-line tool for BigML’s API

BigMLer makes BigML even easier.

BigMLer wraps BigML’s API Python bindings to offer a high-level command-line script to easily create and publish datasets and models, create ensembles, make local predictions from multiple models, and simplify many other machine learning tasks. For additional information, see the full documentation for BigMLer on Read the Docs.

BigMLer is open sourced under the Apache License, Version 2.0.

Support

Please report problems and bugs to our BigML.io issue tracker.

Discussions about the different bindings take place in the general BigML mailing list. You can also join us in our Campfire chatroom.

Requirements

Python 2.7 is currently supported by BigMLer.

BigMLer requires bigml 1.0.5 or higher.

BigMLer Installation

To install the latest stable release with pip:

$ pip install bigmler

You can also install the development version of bigmler directly from the Git repository:

$ pip install -e git+https://github.com/bigmlcom/bigmler.git#egg=bigmler

For a detailed description of install instructions on Windows see the BigMLer on Windows section.

BigML Authentication

All the requests to BigML.io must be authenticated using your username and API key and are always transmitted over HTTPS.

The BigML module looks for your username and API key in the environment variables BIGML_USERNAME and BIGML_API_KEY, respectively. You can add the following lines to your .bashrc or .bash_profile to set those variables automatically when you log in:

export BIGML_USERNAME=myusername
export BIGML_API_KEY=ae579e7e53fb9abd646a6ff8aa99d4afe83ac291

Otherwise, you can initialize directly when running the BigMLer script as follows:

bigmler --train data/iris.csv --username myusername --api_key ae579e7e53fb9abd646a6ff8aa99d4afe83ac291
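The lookup order described above (explicit values first, then the environment variables) can be sketched in Python as follows. This is only an illustration and not BigMLer's actual code: the function name resolve_credentials and its environ parameter are hypothetical.

```python
import os


def resolve_credentials(username=None, api_key=None, environ=None):
    """Return (username, api_key), preferring explicit values over the environment.

    Hypothetical helper: it mirrors the documented behaviour, where
    --username / --api_key override BIGML_USERNAME / BIGML_API_KEY.
    """
    env = os.environ if environ is None else environ
    username = username or env.get("BIGML_USERNAME")
    api_key = api_key or env.get("BIGML_API_KEY")
    if not username or not api_key:
        raise ValueError(
            "Set BIGML_USERNAME and BIGML_API_KEY, "
            "or pass --username and --api_key."
        )
    return username, api_key
```

Passing a plain dictionary as environ makes the behaviour easy to check without touching the real environment.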

For a detailed description of authentication instructions on Windows see the BigMLer on Windows section.

BigMLer on Windows

To install BigMLer on Windows environments, you’ll need Python for Windows (v.2.7.x) installed.

In addition to that, you’ll need the pip tool to install BigMLer. To install pip, first open a command line window (type cmd in the input field that appears when you click on Start, then press Enter), then download this python file and execute it:

c:\Python27\python.exe distribute_setup.py

After that, you’ll be able to install pip by typing the following command:

c:\Python27\Scripts\easy_install.exe pip

And finally, to install BigMLer, just type:

c:\Python27\Scripts\pip.exe install bigmler

and BigMLer should be installed on your computer. Then issuing:

bigmler --version

should show BigMLer version information.

Finally, to start using BigMLer to handle your BigML resources, you need to set your credentials in BigML for authentication. If you want them to be permanently stored in your system, use:

setx BIGML_USERNAME myusername
setx BIGML_API_KEY ae579e7e53fb9abd646a6ff8aa99d4afe83ac291

BigML Development Mode

Also, you can instruct BigMLer to work in BigML’s Sandbox environment by using the parameter --dev:

bigmler --train data/iris.csv --dev

Using the development flag you can run tasks under 1 MB without spending any of your BigML credits.

Using BigMLer

To run BigMLer you can use the console script directly. The --help option will describe all the available options:

bigmler --help

Alternatively you can just call bigmler as follows:

python bigmler.py --help

This will display the full list of optional arguments. You can read a brief explanation for each option below.

Quick Start

Let’s see some basic usage examples. Check the installation and authentication sections in BigMLer on Read the Docs if you are not familiar with BigML.

Basics

You can create a new model with just:

bigmler --train data/iris.csv

If you check your dashboard at BigML, you will see a new source, dataset, and model. Isn’t it magic?

You can generate predictions for a test set using:

bigmler --train data/iris.csv --test data/test_iris.csv

You can also specify a file name to save the newly created predictions:

bigmler --train data/iris.csv --test data/test_iris.csv --output predictions

If you do not specify the path to an output file, BigMLer will auto-generate one for you under a new directory named after the current date and time (e.g., MonNov1212_174715/predictions.csv). With the --prediction-info flag set to brief, only the prediction result will be stored (the default is normal, which includes confidence information).
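The directory names shown in the examples are consistent with a date pattern of abbreviated weekday, abbreviated month, day, two-digit year, and time. A minimal Python sketch of that naming scheme follows. The format string is inferred from the example names, not taken from BigMLer's source, and the function name default_output_dir is hypothetical.

```python
from datetime import datetime

# Assumed pattern, inferred from names such as MonNov1212_174715.
# %a and %b produce English abbreviations under Python's default locale.
DIRECTORY_PATTERN = "%a%b%d%y_%H%M%S"


def default_output_dir(now=None):
    """Return a directory name built from the given (or current) date and time."""
    now = now or datetime.now()
    return now.strftime(DIRECTORY_PATTERN)


# November 12, 2012 at 17:47:15 gives the name used in the example above.
print(default_output_dir(datetime(2012, 11, 12, 17, 47, 15)))
```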

A different objective field (the field that you want to predict) can be selected using:

bigmler --train data/iris.csv --test data/test_iris.csv --objective 'sepal length'

If you do not explicitly specify an objective field, BigML will default to the last column in your dataset.
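If you drive BigMLer from a Python script, the calls above can be assembled as an argument list. The sketch below only uses the flags shown so far (--train, --test, --output and --objective). The helper name build_bigmler_command is hypothetical and not part of BigMLer.

```python
def build_bigmler_command(train, test=None, output=None, objective=None):
    """Build the argument list for a bigmler call (hypothetical helper)."""
    command = ["bigmler", "--train", train]
    if test:
        command += ["--test", test]
    if output:
        command += ["--output", output]
    if objective:
        # Passed as a single list item, so names containing spaces
        # such as 'sepal length' need no extra shell quoting.
        command += ["--objective", objective]
    return command


print(build_bigmler_command("data/iris.csv", test="data/test_iris.csv",
                            objective="sepal length"))
```

The resulting list can be handed to subprocess.run, which avoids quoting problems with field names that contain spaces.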

Also, if your test file uses a particular field separator for its data, you can tell BigMLer using --test-separator. For example, if your test file uses the tab character as field separator the call should be like:

bigmler --train data/iris.csv --test data/test_iris.tsv \
        --test-separator '\t'
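Note that inside single quotes the shell passes '\t' as two literal characters, a backslash followed by t. A program receiving it has to turn that escape into a real tab before splitting fields. The sketch below shows one way to do this in Python. It illustrates the idea under that assumption and is not BigMLer's implementation; both function names are hypothetical.

```python
import csv
import io


def decode_separator(raw):
    """Turn an escaped separator such as the two characters \\t into a real tab."""
    return raw.encode("utf-8").decode("unicode_escape")


def read_rows(text, raw_separator=","):
    """Parse delimited text using a separator given as typed on the command line."""
    separator = decode_separator(raw_separator)
    return list(csv.reader(io.StringIO(text), delimiter=separator))


# Two tab-separated rows, parsed with the separator as the shell would pass it.
print(read_rows("5.1\t3.5\n4.9\t3.0\n", "\\t"))
```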

If you don’t provide a file name for your training source, BigMLer will try to read it from the standard input:

cat data/iris.csv | bigmler --train
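The fallback to standard input can be pictured with a few lines of Python. This is a sketch of the general pattern only. The function name open_training_source and its stdin parameter are hypothetical and do not come from BigMLer.

```python
import io
import sys


def open_training_source(path=None, stdin=None):
    """Return a readable text stream: the named file if given, else standard input."""
    if path:
        return open(path, "r", newline="")
    # No file name was provided, so read whatever is piped in.
    return stdin if stdin is not None else sys.stdin


# An in-memory stream stands in for data piped through the shell.
piped = io.StringIO("sepal length,species\n5.1,Iris-setosa\n")
print(open_training_source(stdin=piped).read())
```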

BigMLer will try to use the locale of the model both to create a new source (if --train flag is used) and to interpret test data. In case it fails, it will try en_US.UTF-8 or English_United States.1252 and a warning message will be printed. If you want to change this behaviour you can specify your preferred locale:

bigmler --train data/iris.csv --test data/test_iris.csv \
--locale "English_United States.1252"
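The fallback behaviour described above can be sketched with Python's locale module. The candidate order (preferred locale, then en_US.UTF-8, then English_United States.1252) follows the text. The function name set_locale_with_fallback and its setter parameter are hypothetical, and BigMLer's real code may differ.

```python
import locale
import warnings

# Fallback locales named in the text above.
FALLBACK_LOCALES = ("en_US.UTF-8", "English_United States.1252")


def set_locale_with_fallback(preferred, setter=None):
    """Try the preferred locale, then the fallbacks; return the one that worked."""
    if setter is None:
        def setter(name):
            locale.setlocale(locale.LC_ALL, name)
    for candidate in (preferred,) + FALLBACK_LOCALES:
        try:
            setter(candidate)
        except locale.Error:
            continue  # Not available on this system, try the next one.
        if candidate != preferred:
            warnings.warn("Locale %s unavailable, using %s" % (preferred, candidate))
        return candidate
    raise locale.Error("No usable locale found for %s" % preferred)
```

The setter argument exists only so that the logic can be exercised without changing the locale of the running process.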

If you check your working directory you will see that BigMLer creates a file with the model ids that have been generated (e.g., FriNov0912_223645/models). This file is handy if you then want to use those model ids to generate local predictions. BigMLer also creates a file with the dataset id that has been generated (e.g., TueNov1312_003451/dataset) and another one, bigmler_sessions, summarizing the steps taken during the session. You can also store a copy of every created or retrieved resource in your output directory (e.g., TueNov1312_003451/model_50c23e5e035d07305a00004f) by setting the flag --store.
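To reuse the generated ids in your own scripts, the models file can be read back with a few lines of Python. The sketch assumes the file holds one resource id per line, which is how the --datasets file is described in the History section. The same layout for the models file is an assumption, and read_resource_ids is a hypothetical helper.

```python
def read_resource_ids(path):
    """Return the non-empty lines of an ids file, stripped of whitespace.

    Assumes one resource id per line (an assumption for the models file).
    """
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

For example, read_resource_ids("FriNov0912_223645/models") would return the list of model ids created in that session.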

Prior Versions Compatibility Issues

BigMLer will accept flags written with underscore as word separator like --clear_logs for compatibility with prior versions. Also --field-names is accepted, although the more complete --field-attributes flag is preferred. --stat_pruning and --no_stat_pruning are discontinued and their effects can be achieved by setting the actual --pruning flag to statistical or no-pruning values respectively.
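One way to picture the underscore compatibility is a normalization step applied to the argument list before parsing. The sketch below is an illustration of that idea, not BigMLer's actual code, and normalize_flags is a hypothetical name. It rewrites underscores only in flag names, leaving values such as file paths untouched.

```python
def normalize_flags(argv):
    """Replace underscores with hyphens in flag names, leaving values alone."""
    normalized = []
    for argument in argv:
        if argument.startswith("--"):
            # Only the flag name is rewritten; anything after '=' is a value.
            name, separator, value = argument.partition("=")
            argument = name.replace("_", "-") + separator + value
        normalized.append(argument)
    return normalized


print(normalize_flags(["--clear_logs", "--train", "data/my_file.csv"]))
```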

Running the Tests

To run the tests you will need to install lettuce:

$ pip install lettuce

and set up your authentication via environment variables, as explained above. With that in place, you can run the test suite simply by:

$ cd tests
$ lettuce

Additional Information

For additional information, see the full documentation for BigMLer on Read the Docs.

History

1.4.4 (2014-02-03)

Fix when using the combined method in --max-categories models. The combination function now uses confidence to choose the predicted category.
Allowing full content text fields to be also used as --max-categories objective fields.
Fix solving objective issues when its column number is zero.

1.4.3 (2014-01-28)

Adding the --objective-weights option to point to a CSV file containing the weights assigned to each class.
Adding the --label-aggregates option to create new aggregate fields on the multi label fields such as count, first or last.

1.4.2 (2014-01-24)

Fix in local random forests’ predictions. Sometimes the fields used in all the models were not correctly retrieved and some predictions could be erroneous.

1.4.1 (2014-01-23)

Fix to allow the input data for multi-label predictions to be expanded.
Fix to retrieve from the models definition info the labels that were given by the user in its creation in multi-label models.

1.4.0 (2014-01-20)

Adding new --balance option to automatically balance all the classes evenly.
Adding new --weight-field option to use the field contents as weights for the instances.

1.3.0 (2014-01-17)

Adding new --source-attributes, --ensemble-attributes, --evaluation-attributes and --batch-prediction-attributes options.
Refactoring --multi-label resources to include its related info in the user_metadata attribute.
Refactoring the main routine.
Adding --batch-prediction-tag for delete operations.

1.2.3 (2014-01-16)

Fix to transmit --training-separator when creating remote sources.

1.2.2 (2014-01-14)

Fix for multiple multi-label fields: headers did not match rows contents in some cases.

1.2.1 (2014-01-12)

Fix for datasets generated using the --new-fields option. The new dataset was not used in model generation.

1.2.0 (2014-01-09)

Adding --multi-label-fields to provide a comma-separated list of multi-label fields in a file.

1.1.0 (2014-01-08)

Fix for ensembles’ local predictions when order is used in tie break.
Fix for duplicated model ids in models file.
Adding new --node-threshold option to allow node limit in models.
Adding new --model-attributes option pointing to a JSON file containing model attributes for model creation.

1.0.1 (2014-01-06)

Fix for missing modules during installation.

1.0 (2014-01-02)

Adding the --max-categories option to handle datasets with a high number of categories.
Adding the --method combine option to produce predictions with the sets of datasets generated using --max-categories option.
Fixing problem with --max-categories when the categorical field is not a preferred field of the dataset.
Changing the --datasets option behaviour: it points to a file where dataset ids are stored, one per line, and now it reads all of them to be used in model and ensemble creation.

0.7.2 (2013-12-20)

Adding confidence to predictions output in full format

0.7.1 (2013-12-19)

Bug fixing: multi-label predictions failed when the --ensembles option is used to provide the ensemble information

0.7.0 (2013-11-24)

Bug fixing: --dataset-price could not be set.
Adding the threshold combination method to the local ensemble.

0.6.1 (2013-11-23)

Bug fixing: --model-fields option with absolute field names was not compatible with multi-label classification models.
Changing resource type checking function.
Bug fixing: evaluations did not use the given combination method.
Bug fixing: evaluation of an ensemble had turned into evaluations of its models.
Adding pruning to the ensemble creation configuration options

0.6.0 (2013-11-08)

Changing fields_map column order: previously mapped dataset column number to model column number, now maps model column number to dataset column number.
Adding evaluations to multi-label models.
Bug fixing: unicode characters greater than ascii-127 caused crash in multi-label classification

0.5.0 (2013-10-08)

Adapting to predictions issued by the high performance prediction server and the 0.9.0 version of the python bindings.
Support for shared models using the same version on python bindings.
Support for different server names using environment variables.

0.4.1 (2013-10-02)

Adding ensembles’ predictions for multi-label objective fields
Bug fixing: in evaluation mode, evaluation for --dataset and --number-of-models > 1 did not select the 20% hold out instances to test the generated ensemble.

0.4.0 (2013-08-15)

Adding text analysis through the corresponding bindings

0.3.7 (2013-09-17)

Adding support for multi-label objective fields
Adding --prediction-headers and --prediction-fields to improve --prediction-info formatting options for the predictions file
Adding the ability to read --test input data from stdin
Adding --seed option to generate different splits from a dataset

0.3.6 (2013-08-21)

Adding --test-separator flag

0.3.5 (2013-08-16)

Bug fixing: resume crash when remote predictions were not completed
Bug fixing: Fields object for input data dict building lacked fields
Bug fixing: test data was repeated in remote prediction function
Bug fixing: Adding replacement=True as default for ensembles’ creation

0.3.4 (2013-08-09)

Adding --max-parallel-evaluations flag
Bug fixing: matching seeds in models and evaluations for cross validation

0.3.3 (2013-08-09)

Changing --model-fields and --dataset-fields flag to allow adding/removing fields with +/- prefix
Refactoring local and remote prediction functions
Adding ‘full data’ option to the --prediction-info flag to join test input data with prediction results in predictions file
Fixing errors in documentation and adding install for windows info

0.3.2 (2013-07-04)

Adding new flag to control predictions file information
Bug fixing: using default sample-rate in ensemble evaluations
Adding standard deviation to evaluation measures in cross-validation
Bug fixing: using only-model argument to download fields in models

0.3.1 (2013-05-14)

Adding delete for ensembles
Creating ensembles when the number of models is greater than one
Remote predictions using ensembles

0.3.0 (2013-04-30)

Adding cross-validation feature
Using user locale to create new resources in BigML
Adding --ensemble flag to use ensembles in predictions and evaluations

0.2.1 (2013-03-03)

Deep refactoring of main resources management
Fixing bug in batch_predict for no headers test sets
Fixing bug for wide dataset’s models that need query-string to retrieve all fields
Fixing bug in test asserts to catch subprocess raise
Adding default missing tokens to models
Adding stdin input for --train flag
Fixing bug when reading descriptions in --field-attributes
Refactoring to get status from api function
Adding confidence to combined predictions

0.2.0 (2013-01-21)

Evaluations management
Console monitoring of process advance
Resume option
User defaults
Refactoring to improve readability

0.1.4 (2012-12-21)

Improved locale management.
Adds progressive handling for large numbers of models.
More options in field attributes update feature.
New flag to combine local existing predictions.
More methods in local predictions: plurality, confidence weighted.

0.1.3 (2012-12-06)

New flag for locale settings configuration.
Filtering only finished resources.

0.1.2 (2012-12-06)

Fix to ensure windows compatibility.

0.1.1 (2012-11-07)

Initial release.



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bigmler-1.4.4.tar.gz (129.4 kB view hashes)

Uploaded Feb 3, 2014 Source

Hashes for bigmler-1.4.4.tar.gz

Algorithm	Hash digest
SHA256	`9151f77542019e5df265a933c895a28b946f7111832bd64b87f2a4f434c5eeae`
MD5	`587bc09c8281cf67f76a5c4a1a99e2fc`
BLAKE2b-256	`2232ed195b6a73972004c6c32ac6131926ab3d6b088edb026044311f2962c918`
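After downloading the source distribution, you can compare its SHA256 digest with the value listed above. The following Python sketch uses the standard hashlib module. The helper names sha256_of and matches_expected are hypothetical, and the file path you pass in is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest listed above for bigmler-1.4.4.tar.gz.
EXPECTED_SHA256 = "9151f77542019e5df265a933c895a28b946f7111832bd64b87f2a4f434c5eeae"


def sha256_of(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest equals the expected value."""
    return sha256_of(path) == expected.lower()
```

Calling matches_expected("bigmler-1.4.4.tar.gz") returns True only when the downloaded file is byte-for-byte identical to the published one.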
