clustermil - clustering based multiple instance learning
clustermil
A Python package for multiple instance learning (MIL) on datasets with a large number of instances (n_instance).
Features
- supports count-based multiple instance assumptions (see Wikipedia)
- supports multi-class settings
- supports scikit-learn clustering algorithms (such as MiniBatchKMeans)
- fast even if n_instance is large
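As a rough illustration of what a count-based assumption means, the sketch below checks whether the hidden instance labels of one bag fall inside per-class lower and upper bounds. The function names `class_counts` and `satisfies_count_assumption` are hypothetical and are not part of clustermil; this only shows the constraint itself, not how the package learns from it.

```python
import numpy as np


def class_counts(instance_labels, n_classes):
    """Count how many instances of each class a bag contains."""
    return np.bincount(np.asarray(instance_labels), minlength=n_classes)


def satisfies_count_assumption(instance_labels, lower, upper):
    """Return True if every per-class count lies within [lower, upper]."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    counts = class_counts(instance_labels, n_classes=len(lower))
    return bool(np.all((lower <= counts) & (counts <= upper)))


# A bag whose (normally unobserved) instance labels are
# three instances of class 0 and one instance of class 2.
labels = [0, 0, 2, 0]
print(class_counts(labels, n_classes=3))  # [3 0 1]
print(satisfies_count_assumption(labels, lower=[2, 0, 1], upper=[4, 0, 1]))  # True
print(satisfies_count_assumption(labels, lower=[0, 1, 0], upper=[4, 4, 4]))  # False
```

In a MIL setting only the bounds are known at training time; the instance labels are what the classifier has to recover.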
Installation
pip install clustermil
Usage
# Prepare the following dataset:
#
# - bags ... list of np.ndarray
#     (num_instance_in_the_bag * num_features)
# - lower_threshold ... np.ndarray (num_bags * num_classes)
# - upper_threshold ... np.ndarray (num_bags * num_classes)
#
# bags[i_bag] contains at least lower_threshold[i_bag, i_class]
# and at most upper_threshold[i_bag, i_class] instances of class i_class.
# Prepare a single-instance clustering algorithm
import numpy as np
from sklearn.cluster import MiniBatchKMeans
n_clusters = 100
clustering = MiniBatchKMeans(n_clusters=n_clusters)
clusters = clustering.fit_predict(np.vstack(bags))  # flatten bags into instances
# Prepare a one-hot encoder
from sklearn.preprocessing import OneHotEncoder
onehot_encoder = OneHotEncoder()
onehot_encoder.fit(clusters.reshape(-1, 1))  # OneHotEncoder expects a 2-D array
# Generate a ClusterMilClassifier with the helper function
from clustermil import generate_mil_classifier
milclassifier = generate_mil_classifier(
    clustering,
    onehot_encoder,
    bags,
    lower_threshold,
    upper_threshold,
    n_clusters)
# After multiple instance learning,
# you can predict the class of a single instance
milclassifier.predict([instance_feature])
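Because the clustering step above runs on the flattened instances, it can help to see how per-instance cluster ids map back to their bags. The following is a minimal sketch, assuming the flattened order matches the order of `bags`; `bag_cluster_histograms` is a hypothetical helper and not necessarily what clustermil does internally.

```python
import numpy as np


def bag_cluster_histograms(cluster_ids, bag_sizes, n_clusters):
    """Turn flat per-instance cluster ids into one histogram row per bag."""
    cluster_ids = np.asarray(cluster_ids)
    # Positions in the flat array where one bag ends and the next begins.
    boundaries = np.cumsum(bag_sizes)[:-1]
    per_bag = np.split(cluster_ids, boundaries)
    return np.stack([np.bincount(ids, minlength=n_clusters) for ids in per_bag])


# Two toy bags with 3 and 2 instances, 2 features each.
bags = [np.zeros((3, 2)), np.ones((2, 2))]
flat = np.vstack(bags)  # shape (5, 2), rows in the same order as the bags

# Stand-in for clustering.fit_predict(flat); the values are made up.
cluster_ids = np.array([0, 2, 2, 1, 1])

hist = bag_cluster_histograms(cluster_ids, [len(b) for b in bags], n_clusters=3)
print(hist)
```

Each row of `hist` counts how many instances of one bag fell into each cluster, which is the kind of bag-level summary that one-hot encoded cluster ids make easy to compute.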
See tests/test_classification.py for an example of a fully working test data generation process.
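For quick experiments, inputs in the shape described under Usage can also be generated synthetically. The sketch below is not the code from the test file; `make_toy_mil_data`, its parameters, and the Gaussian-blob geometry are all assumptions chosen for illustration. It draws hidden instance labels per bag and derives thresholds by widening the true counts by a small slack.

```python
import numpy as np


def make_toy_mil_data(n_bags=4, bag_size=10, n_classes=3, n_features=2,
                      slack=1, seed=0):
    """Build bags plus count thresholds from hidden instance labels."""
    rng = np.random.default_rng(seed)
    # One well-separated centre per class (hypothetical toy geometry).
    centres = 10.0 * np.arange(n_classes)[:, None] * np.ones(n_features)
    bags, counts = [], []
    for _ in range(n_bags):
        labels = rng.integers(0, n_classes, size=bag_size)
        # Instances are the class centre plus unit Gaussian noise.
        bags.append(centres[labels] + rng.normal(size=(bag_size, n_features)))
        counts.append(np.bincount(labels, minlength=n_classes))
    counts = np.array(counts)
    # Thresholds bracket the true per-class counts, clipped to valid values.
    lower_threshold = np.maximum(counts - slack, 0)
    upper_threshold = np.minimum(counts + slack, bag_size)
    return bags, lower_threshold, upper_threshold, counts


bags, lower_threshold, upper_threshold, counts = make_toy_mil_data()
print(len(bags), bags[0].shape)                      # 4 (10, 2)
print(lower_threshold.shape, upper_threshold.shape)  # (4, 3) (4, 3)
```

Setting `slack=0` gives exact counts per class, while a larger slack makes the supervision weaker.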
License
clustermil is available under the MIT License.