
Sparse Multiple-Instance Learning: SVM, NSK, sMIL and sAwMIL.


Sparse Multiple-Instance Learning in Python

Multiple-instance learning (MIL) models based on Support Vector Machines: NSK, sMIL, and sAwMIL. Inspired by the now-outdated misvm package.

Note: This is an alpha version.

Implemented Models

Normalized Set Kernels (NSK)

Gärtner, Thomas, Peter A. Flach, Adam Kowalczyk, and Alex J. Smola. Multi-instance kernels. Proceedings of the 19th International Conference on Machine Learning (2002).

Sparse MIL (sMIL)

Bunescu, Razvan C., and Raymond J. Mooney. Multiple instance learning for sparse positive bags. Proceedings of the 24th International Conference on Machine Learning (2007).
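
For orientation, the sMIL optimization problem can be sketched as below, following the formulation in the cited paper. Here \tilde{\mathcal{X}}_n is the set of all instances from negative bags, \mathcal{X}_p is the set of positive bags, and \phi(X) = \sum_{x \in X} \phi(x). This summarizes the published method; the objective that sawmil actually builds (for example, how scale_C rescales C) is not confirmed here and may differ.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad
  & \tfrac{1}{2}\lVert w\rVert^{2}
    + \frac{C}{|\tilde{\mathcal{X}}_n|}\sum_{x \in \tilde{\mathcal{X}}_n}\xi_x
    + \frac{C}{|\mathcal{X}_p|}\sum_{X \in \mathcal{X}_p}\xi_X \\
\text{s.t.}\quad
  & w^{\top}\phi(x) + b \le -1 + \xi_x,
    && x \in \tilde{\mathcal{X}}_n \\
  & w^{\top}\frac{\phi(X)}{|X|} + b \ge \frac{2-|X|}{|X|} - \xi_X,
    && X \in \mathcal{X}_p \\
  & \xi_x \ge 0,\quad \xi_X \ge 0
\end{aligned}
```

The right-hand side (2 - |X|)/|X| is what makes the model "sparse": it only asks that a positive bag could contain a single positive instance, so the required margin shrinks as the bag grows.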

Sparse Aware MIL (sAwMIL)

Classifier used in trilemma-of-truth:

Savcisens, Germans, and Tina Eliassi-Rad. The Trilemma of Truth in Large Language Models. arXiv preprint arXiv:2506.23921 (2025).

Installation

pip install sawmil

Requirements

numpy>=1.22
scikit-learn>=1.7.0
gurobipy>=12.0.3
python>=3.11 # recommended: >=3.12.3

At this point, the sawmil package works only with the Gurobi optimizer, so you need an academic or commercial Gurobi license to use it. We plan to add implementations with other solvers.
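
Since the solver is a hard dependency, it can help to confirm that gurobipy imports and that a Gurobi environment starts before fitting any model. The helper below is a minimal sketch and is not part of sawmil: gurobi_ready is a hypothetical name, and it assumes that starting a default gurobipy.Env() is enough to surface a missing or invalid license.

```python
import importlib.util


def gurobi_ready() -> tuple[bool, str]:
    """Return (ok, message) describing whether gurobipy can start an environment."""
    if importlib.util.find_spec("gurobipy") is None:
        return False, "gurobipy is not installed (pip install gurobipy)"

    import gurobipy as gp

    try:
        env = gp.Env()  # starting an environment triggers the license check
        env.dispose()   # release the environment again
    except gp.GurobiError as exc:
        return False, f"Gurobi could not start: {exc}"
    return True, "gurobipy imported and a Gurobi environment started"


ok, message = gurobi_ready()
print(message)
```

If ok is False, the message usually points to either a missing package or a license problem.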

Quick start

1. Generate dummy data

from dataset import make_complex_bags
import numpy as np
rng = np.random.default_rng(0)

ds = make_complex_bags(
    n_pos=300, n_neg=100, inst_per_bag=(5, 15), d=2,
    pos_centers=((+2,+1), (+4,+3)),
    neg_centers=((-1.5,-1.0), (-3.0,+0.5)),
    pos_scales=((2.0, 0.6), (1.2, 0.8)),
    neg_scales=((1.5, 0.5), (2.5, 0.9)),
    pos_intra_rate=(0.25, 0.85),
    ensure_pos_in_every_pos_bag=True,
    neg_pos_noise_rate=(0.00, 0.05),
    pos_neg_noise_rate=(0.00, 0.20),
    outlier_rate=0.1,
    outlier_scale=8.0,
    random_state=42,
)
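
To make the bag structure concrete, here is a small NumPy-only sketch of MIL data: each bag is a matrix of instances with a single bag-level label. make_toy_bags is a hypothetical helper written for illustration. It is much simpler than make_complex_bags, and it assumes that inst_per_bag is an inclusive (min, max) range.

```python
import numpy as np


def make_toy_bags(n_pos=3, n_neg=2, inst_per_bag=(5, 15), d=2, seed=0):
    """Hypothetical helper: return a list of (X, y) pairs, one per bag."""
    rng = np.random.default_rng(seed)
    bags = []
    for y in [1] * n_pos + [-1] * n_neg:
        # Bag size drawn uniformly from the inclusive range inst_per_bag.
        n = int(rng.integers(inst_per_bag[0], inst_per_bag[1] + 1))
        # Every instance starts as a "negative" draw around (-2, ..., -2).
        X = rng.normal(loc=-2.0, scale=1.0, size=(n, d))
        if y == 1:
            # A positive bag gets at least one instance from the positive cluster.
            n_witness = int(rng.integers(1, n + 1))
            X[:n_witness] = rng.normal(loc=2.0, scale=1.0, size=(n_witness, d))
        bags.append((X, y))
    return bags


bags = make_toy_bags()
for X, y in bags:
    print(f"label={y:+d}  instances={X.shape[0]}  features={X.shape[1]}")
```

Only the bag label is used for training; which instances inside a positive bag are actually positive stays hidden from the learner.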

2. NSK with RBF Kernel

Load a kernel:

from sawmil.kernels import get_kernel
from sawmil.bag_kernels import make_bag_kernel
k = get_kernel("rbf", gamma=0.5)  # base (single-instance) kernel
bag_k = make_bag_kernel(k, use_intra_labels=False)  # lift the instance kernel to a bag-level kernel
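
One common way to turn an instance kernel into a bag kernel, in the spirit of the normalized set kernel, is to average the instance-kernel values over all cross-bag pairs. The NumPy sketch below illustrates that idea only. rbf_kernel and mean_bag_kernel are hypothetical names, and the normalization that make_bag_kernel applies by default is not confirmed here.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Instance-level RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mean_bag_kernel(bag_a, bag_b, gamma=0.5):
    """Average of all pairwise instance-kernel values between two bags."""
    return float(rbf_kernel(bag_a, bag_b, gamma).mean())


rng = np.random.default_rng(0)
bag_a = rng.normal(size=(6, 2))           # 6 instances, 2 features
bag_b = rng.normal(loc=3.0, size=(9, 2))  # 9 instances, shifted cluster
print(mean_bag_kernel(bag_a, bag_a), mean_bag_kernel(bag_a, bag_b))
```

Averaging keeps the kernel value comparable across bags of different sizes, which matters when bag sizes vary as in the dummy data above.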

Fit NSK Model:

from sawmil.nsk import NSK

clf = NSK(C=0.1, bag_kernel=bag_k, scale_C=True, tol=1e-8, verbose=False).fit(ds, None)
print("Train acc:", clf.score(ds, np.array([b.y for b in ds.bags])))
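
Conceptually, an NSK classifier is a standard SVM trained on a bag-by-bag Gram matrix. The sketch below shows that idea without Gurobi by handing a precomputed bag kernel to scikit-learn's SVC. This is a stand-in for illustration, not the solver or the objective that sawmil uses, and mean_rbf and the toy data are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC


def mean_rbf(bag_a, bag_b, gamma=0.5):
    """Mean pairwise RBF value between two bags (one possible bag kernel)."""
    diff = bag_a[:, None, :] - bag_b[None, :, :]
    return float(np.exp(-gamma * (diff**2).sum(axis=-1)).mean())


rng = np.random.default_rng(0)
bags, labels = [], []
for y in [1] * 20 + [-1] * 20:
    X = rng.normal(loc=-2.0, size=(8, 2))         # background instances
    if y == 1:
        X[:4] = rng.normal(loc=2.0, size=(4, 2))  # half the instances are witnesses
    bags.append(X)
    labels.append(y)
labels = np.array(labels)

# Bag-by-bag Gram matrix, then a standard SVM on the precomputed kernel.
gram = np.array([[mean_rbf(a, b) for b in bags] for a in bags])
svm = SVC(kernel="precomputed", C=10.0).fit(gram, labels)
train_acc = svm.score(gram, labels)
print("Train acc:", train_acc)
```

To score new bags with a precomputed kernel, SVC expects a matrix of kernel values between the new bags (rows) and the training bags (columns).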

3. Fit sMIL Model with Linear Kernel

from sawmil.kernels import get_kernel
from sawmil.bag_kernels import make_bag_kernel
from sawmil.smil import sMIL

k = get_kernel("linear", normalizer="none")  # base (single-instance) kernel
bag_k = make_bag_kernel(k, normalizer="none", use_intra_labels=False)  # bag-level kernel
clf = sMIL(C=0.1, bag_kernel=bag_k, scale_C=True, tol=1e-6, verbose=False).fit(ds, None)

# Map bag labels to {-1, +1} before scoring
print("Train acc:", clf.score(ds, np.array([1 if b.y > 0 else -1 for b in ds.bags])))

See more examples in the example.ipynb notebook.

Citation

If you use the sawmil package in academic work, please cite:

Savcisens, G. & Eliassi-Rad, T. sAwMIL: Python package for Sparse Multiple-Instance Learning (2025).

@software{savcisens2025sawmil,
  author = {Savcisens, Germans and Eliassi-Rad, Tina},
  title = {sAwMIL: Python package for Sparse Multiple-Instance Learning},
  year = {2025},
  doi = {10.5281/zenodo.16990499},
  url = {https://github.com/carlomarxdk/sawmil}
}

If you want to reference a specific version of the package, find the correct DOI here.


