clustermil - clustering based multiple instance learning

clustermil

A Python package for multiple instance learning (MIL) on datasets with a large number of instances (n_instance).

Features

  • supports count-based multiple instance assumptions (see Wikipedia)
  • supports multi-class settings
  • supports scikit-learn clustering algorithms (such as MiniBatchKMeans)
  • stays fast even when n_instance is large
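
To make the count-based assumption in the first bullet concrete, here is a minimal sketch of how bag-level count bounds could be derived from instance labels. The synthetic data, the `instance_labels` variable, and the `slack` value are all hypothetical and exist only for illustration; only the shapes of `bags`, `lower_threshold`, and `upper_threshold` follow the usage notes of this package.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bags, n_classes, n_features = 4, 3, 5

# Hypothetical synthetic data: each bag holds a different number of instances.
bag_sizes = rng.integers(low=10, high=20, size=n_bags)
bags = [rng.normal(size=(size, n_features)) for size in bag_sizes]

# Hypothetical instance labels, used here only to derive bag-level counts.
instance_labels = [rng.integers(low=0, high=n_classes, size=size) for size in bag_sizes]

# counts[i_bag, i_class] = number of i_class instances in bag i_bag
counts = np.stack(
    [np.bincount(labels, minlength=n_classes) for labels in instance_labels]
)

# Loosen the exact counts into intervals (arbitrary slack, for illustration only).
slack = 2
lower_threshold = np.maximum(counts - slack, 0)
upper_threshold = np.minimum(counts + slack, bag_sizes[:, None])
```

In a real MIL problem the instance labels are unknown and only the bounds are given; the sketch merely shows what the two threshold arrays encode.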

Installation

pip install clustermil

Usage

# Prepare the following dataset
#
# - bags ... list of np.ndarray
#            (num_instance_in_the_bag * num_features)
# - lower_threshold ... np.ndarray (num_bags * num_classes)
# - upper_threshold ... np.ndarray (num_bags * num_classes)
#
# bags[i_bag] contains at least lower_threshold[i_bag, i_class]
# instances of class i_class (and, by the same convention,
# presumably at most upper_threshold[i_bag, i_class]).

# Prepare a single-instance clustering algorithm
import numpy as np
from sklearn.cluster import MiniBatchKMeans
n_clusters = 100
clustering = MiniBatchKMeans(n_clusters=n_clusters)
clusters = clustering.fit_predict(np.vstack(bags)) # flatten bags into instances

# Prepare a one-hot encoder
from sklearn.preprocessing import OneHotEncoder
onehot_encoder = OneHotEncoder()
# OneHotEncoder expects a 2D array, so reshape the 1D cluster labels
onehot_encoder.fit(clusters.reshape(-1, 1))

# Generate a ClusterMilClassifier with the helper function
from clustermil import generate_mil_classifier

milclassifier = generate_mil_classifier(
            clustering,
            onehot_encoder,
            bags,
            lower_threshold,
            upper_threshold,
            n_clusters)

# After multiple instance learning,
# you can predict the class of an individual instance
milclassifier.predict([instance_feature])

See tests/test_classification.py for an example of a fully working test data generation process.
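
The usage above flattens all bags into one instance array before clustering. As a rough illustration of how cluster assignments can be mapped back to bags, the sketch below counts how many instances of each bag fall into each cluster. This is an assumption about the general clustering-based approach, not a description of clustermil's internals; the hard-coded `clusters` array and `bag_sizes` are hypothetical stand-ins for the output of `clustering.fit_predict(np.vstack(bags))`.

```python
import numpy as np

# Hypothetical flat cluster assignments for three bags of sizes 3, 2 and 4.
bag_sizes = [3, 2, 4]
n_clusters = 4
clusters = np.array([0, 1, 1, 3, 3, 2, 0, 0, 1])

# Split the flat array back into one array of cluster ids per bag.
split_points = np.cumsum(bag_sizes)[:-1]
clusters_per_bag = np.split(clusters, split_points)

# bag_histograms[i_bag, i_cluster] = number of instances of bag i_bag in cluster i_cluster
bag_histograms = np.stack(
    [np.bincount(c, minlength=n_clusters) for c in clusters_per_bag]
)
print(bag_histograms)
# [[1 2 0 0]
#  [0 0 0 2]
#  [2 1 1 0]]
```

Each row has a fixed length of n_clusters regardless of the bag size, which is one plausible reason a clustering step helps when n_instance is large.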

License

clustermil is available under the MIT License.
