jubakit: Jubatus Toolkit

jubakit is a Python module that provides easy access to Jubatus features. It can be used together with scikit-learn, which lets you use powerful features such as cross validation and model evaluation. See the Jubakit Documentation for a detailed description.

Currently jubakit supports Classifier, Regression, Anomaly, Recommender, NearestNeighbor, Clustering, Burst, Bandit and Weight engines.


Installation

pip install jubakit


Requirements

  • Python 2.7, 3.3, 3.4 or 3.5.

  • Jubatus needs to be installed.

  • scikit-learn is optional, but it is required for some features such as K-fold cross validation.
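Because scikit-learn is an optional dependency, it can help to check what is available before running an example. The following is a minimal sketch, not part of jubakit: the helper names `has_module` and `environment_report` are hypothetical, and it assumes Python 3.4 or later, where `importlib.util.find_spec` is available.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def environment_report():
    """Summarize the interpreter version and which packages are installed."""
    return {
        "python": "{0}.{1}".format(*sys.version_info[:2]),
        "jubakit": has_module("jubakit"),
        # scikit-learn is imported under the module name `sklearn`.
        "sklearn": has_module("sklearn"),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("{0}: {1}".format(key, value))
```

This only checks the Python packages. It does not verify that Jubatus itself is installed.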

Quick Start

The following example shows how to train a classifier and classify data using a CSV dataset.

from jubakit.classifier import Classifier, Schema, Dataset, Config
from jubakit.loader.csv import CSVLoader

# Load a CSV file.
loader = CSVLoader('iris.csv')

# Define types for each column in the CSV file.
schema = Schema({
  'Species': Schema.LABEL,
}, Schema.NUMBER)

# Get the shuffled dataset.
dataset = Dataset(loader, schema).shuffle()

# Run the classifier service (`jubaclassifier` process).
classifier = Classifier.run(Config())

# Train the classifier.
for _ in classifier.train(dataset): pass

# Classify using the trained classifier.
for (idx, label, result) in classifier.classify(dataset):
  print("true label: {0}, estimated label: {1}".format(label, result[0][0]))
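The classification loop above prints one line per record. To summarize the results as a single number, the records can be reduced to an accuracy score. The sketch below is an illustration, not jubakit API: it assumes each record has the `(idx, label, result)` shape used in the loop, with `result` being a list of `(label, score)` pairs whose first entry is the top estimate. The helpers `top_label` and `accuracy` are hypothetical, and the sample records are hand-written stand-ins rather than real classifier output.

```python
def top_label(result):
    """Return the label of the first (label, score) pair, or None if empty."""
    return result[0][0] if result else None


def accuracy(records):
    """Fraction of records whose top estimated label equals the true label."""
    total = 0
    correct = 0
    for (_idx, label, result) in records:
        total += 1
        if top_label(result) == label:
            correct += 1
    return correct / total if total else 0.0


# Hypothetical records for illustration only (not real classifier output).
sample = [
    (0, "setosa", [("setosa", 0.9), ("virginica", 0.1)]),
    (1, "virginica", [("versicolor", 0.6), ("virginica", 0.4)]),
    (2, "versicolor", [("versicolor", 0.7), ("setosa", 0.3)]),
    (3, "setosa", [("setosa", 0.8), ("versicolor", 0.2)]),
]

print("accuracy: {0:.2f}".format(accuracy(sample)))
```

With a running classifier, the same function could be applied to `classifier.classify(dataset)` in place of `sample`. Note that the Quick Start classifies the data it was trained on, so a score obtained that way would not reflect performance on unseen data.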

Examples by Topics

See the example directory for working examples.




Handling CSV file and numeric features

Handling CSV file and string features

Handling toy dataset (digits)

Handling LIBSVM file

K-fold cross validation and metrics

Finding the best hyperparameters

Finding the best hyperparameters using hyperopt

Bulk Train-Test Classifier

Handling Twitter Streams

Extract contents of Classifier model file

Classification using scikit-learn wrapper

Grid Search example using scikit-learn wrapper

Visualize a training process using TensorBoard

Regression with toy dataset (boston)

Regression with CSV file

Regression using scikit-learn wrapper

Anomaly detection and metrics

Recommend similar items

Search neighbor items

Clustering 2-dimensional dataset

Burst detection with stream data

Multi-armed bandit with slot machine example

Tracing fv_converter behavior using Weight

Extract contents of Weight model file


License

MIT License

