Sparse Multiple-Instance Learning: SVM, NSK, sMIL and sAwMIL.

Project description

Sparse Multiple-Instance Learning in Python

Warning: This is an alpha version of the package.

Multiple-instance learning (MIL) models based on Support Vector Machines: NSK, sMIL, and sAwMIL. Inspired by the outdated misvm package.

Implemented Models

Normalized Set Kernels (NSK)

Gärtner, Thomas, Peter A. Flach, Adam Kowalczyk, and Alex J. Smola. Multi-instance kernels. Proceedings of the 19th International Conference on Machine Learning (2002).

Sparse MIL (sMIL)

Bunescu, Razvan C., and Raymond J. Mooney. Multiple instance learning for sparse positive bags. Proceedings of the 24th International Conference on Machine Learning (2007).

Sparse Aware MIL (sAwMIL)

Classifier used in trilemma-of-truth:

Savcisens, Germans, and Tina Eliassi-Rad. The Trilemma of Truth in Large Language Models. arXiv preprint arXiv:2506.23921 (2025).


Installation

sawmil supports two QP backends: Gurobi and OSQP. By default, the base package installs without any solver; pick one (or both) via extras.

Base package (no solver)

pip install sawmil

Option 1 — Gurobi backend

Gurobi is commercial software, so you will need a valid license (academic or commercial). Refer to the official Gurobi website for details.

pip install "sawmil[gurobi]"
# it installs numpy>=1.22 and scikit-learn>=1.7.0

Option 2 — OSQP backend

pip install "sawmil[osqp]"
# in addition to the base packages, it installs osqp>=1.0.4 and scipy

Option 3 — All supported solvers

pip install "sawmil[full]"

Picking the solver in code

from sawmil import SVM

# X and y are your training instances and labels (define them before running)
# solver can be "osqp" or "gurobi" (the default)
clf = SVM(C=1.0, kernel="rbf", gamma=0.5, solver="osqp").fit(X, y)
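
Since the base package ships without a solver, it can be handy to check which backend is importable before passing `solver=...`. The sketch below is not part of sawmil: `available_solvers` and `pick_solver` are hypothetical helpers, and it assumes the extras install the `gurobipy` and `osqp` Python modules. An importable `gurobipy` does not guarantee a valid Gurobi license.

```python
from importlib.util import find_spec


def available_solvers():
    """Return the solver names whose Python modules can be imported."""
    # Assumed mapping: solver name used by sawmil -> importable module name.
    candidates = {"gurobi": "gurobipy", "osqp": "osqp"}
    return [name for name, module in candidates.items()
            if find_spec(module) is not None]


def pick_solver(preferred=("gurobi", "osqp")):
    """Return the first preferred solver that is installed, or None."""
    installed = available_solvers()
    for name in preferred:
        if name in installed:
            return name
    return None


solver = pick_solver()
print("Solver to use:", solver)
```

The returned name could then be passed as the `solver` argument shown above. If it is `None`, install one of the extras first.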

Requirements

numpy>=1.22
scikit-learn>=1.7.0

Quick start

1. Generate dummy data

from dataset import make_complex_bags
import numpy as np
rng = np.random.default_rng(0)

ds = make_complex_bags(
    n_pos=300, n_neg=100, inst_per_bag=(5, 15), d=2,
    pos_centers=((+2,+1), (+4,+3)),
    neg_centers=((-1.5,-1.0), (-3.0,+0.5)),
    pos_scales=((2.0, 0.6), (1.2, 0.8)),
    neg_scales=((1.5, 0.5), (2.5, 0.9)),
    pos_intra_rate=(0.25, 0.85),
    ensure_pos_in_every_pos_bag=True,
    neg_pos_noise_rate=(0.00, 0.05),
    pos_neg_noise_rate=(0.00, 0.20),
    outlier_rate=0.1,
    outlier_scale=8.0,
    random_state=42,
)
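
To make the data layout concrete, here is a minimal standard-library sketch of what a sparse MIL dataset looks like: each bag holds several instances and one bag-level label, and in a positive bag only some instances are truly positive. `ToyBag` and `make_toy_bags` are hypothetical names for illustration, not the package's classes, and the 1/0 label encoding is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyBag:
    X: list      # instances, each a (x1, x2) tuple
    y: int       # bag label: 1 = positive, 0 = negative (assumed encoding)
    intra: list  # per-instance flags: True if the instance is truly positive


def make_toy_bags(n_pos, n_neg, inst_per_bag=(5, 15), pos_rate=0.3, seed=0):
    rng = random.Random(seed)
    bags = []
    for label in [1] * n_pos + [0] * n_neg:
        n = rng.randint(*inst_per_bag)  # bounds are inclusive
        # Only a fraction of the instances in a positive bag are positive.
        flags = [label == 1 and rng.random() < pos_rate for _ in range(n)]
        if label == 1 and not any(flags):
            flags[rng.randrange(n)] = True  # at least one positive instance
        X = []
        for is_pos in flags:
            cx, cy = (2.0, 1.0) if is_pos else (-1.5, -1.0)
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        bags.append(ToyBag(X=X, y=label, intra=flags))
    return bags


bags = make_toy_bags(n_pos=3, n_neg=2)
for b in bags:
    print(b.y, len(b.X), sum(b.intra))
```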

2. NSK with RBF Kernel

Load a kernel:

from sawmil.kernels import get_kernel
from sawmil.bag_kernels import make_bag_kernel
k = get_kernel("rbf", gamma=0.5) # base (single-instance kernel)
bag_k  = make_bag_kernel(k, use_intra_labels=False) # convert single-instance kernel to bagged kernel
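
For intuition, a bag kernel can be built by summing the base kernel over all instance pairs of two bags and then normalizing. The pure-Python sketch below assumes feature-space normalization (dividing by the square root of the two self-similarities). It illustrates the idea only: the function names are hypothetical, and the normalizer that `make_bag_kernel` applies may differ.

```python
import math


def rbf(x, y, gamma=0.5):
    """Single-instance RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def set_kernel(A, B, k=rbf):
    """Unnormalized set kernel: sum of k over all instance pairs."""
    return sum(k(a, b) for a in A for b in B)


def normalized_set_kernel(A, B, k=rbf):
    """Set kernel divided by the feature-space norms of both bags."""
    return set_kernel(A, B, k) / math.sqrt(set_kernel(A, A, k) * set_kernel(B, B, k))


bag_a = [(0.0, 0.0), (1.0, 0.0)]
bag_b = [(0.0, 1.0), (2.0, 2.0), (1.0, 1.0)]

k_aa = normalized_set_kernel(bag_a, bag_a)  # a bag compared with itself
k_ab = normalized_set_kernel(bag_a, bag_b)
print(k_aa, k_ab)
```

With this normalization a bag has similarity 1 with itself regardless of how many instances it contains, so bags of different sizes stay comparable.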

Fit NSK Model:

from sawmil.nsk import NSK

clf = NSK(C=0.1, bag_kernel=bag_k, scale_C=True, tol=1e-8, verbose=False).fit(ds, None)
print("Train acc:", clf.score(ds, np.array([b.y for b in ds.bags])))

3. Fit sMIL Model with Linear Kernel

from sawmil.kernels import get_kernel
from sawmil.bag_kernels import make_bag_kernel
from sawmil.smil import sMIL

k = get_kernel("linear", normalizer="none") # base (single-instance kernel)
bag_k  = make_bag_kernel(k, normalizer="none", use_intra_labels=False) # convert single-instance kernel to bagged kernel
clf = sMIL(C=0.1, bag_kernel=bag_k, scale_C=True, tol=1e-6, verbose=False).fit(ds, None)

# score against bag labels mapped to {-1, +1}
print("Train acc:", clf.score(ds, np.array([1 if b.y > 0 else -1 for b in ds.bags])))
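
What makes sMIL "sparse" is the relaxed margin it asks of positive bags. In the formulation of Bunescu and Mooney (2007), a positive bag X must satisfy w·φ(X)/|X| + b ≥ (2 − |X|)/|X| − ξ, which is the average score you get when one instance sits at +1 and the remaining |X| − 1 sit at −1. The sketch below only computes that threshold; `smil_positive_bag_threshold` is a hypothetical helper and does not reflect how the package implements the constraint internally.

```python
def smil_positive_bag_threshold(bag_size):
    """Right-hand side (2 - |X|) / |X| of the sMIL positive-bag constraint."""
    if bag_size < 1:
        raise ValueError("a bag must contain at least one instance")
    return (2 - bag_size) / bag_size


for n in (1, 2, 4, 10):
    print(n, smil_positive_bag_threshold(n))
```

A single-instance bag must reach the usual SVM margin of 1, while larger bags are allowed a lower average score because most of their instances may be negative.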

See more examples in the example.ipynb notebook.

4. Fit sAwMIL with Combined Kernels

from sawmil.kernels import Product, Polynomial, Linear, RBF, Sum, Scale
from sawmil.sawmil import sAwMIL

k = Sum(Linear(), 
        Scale(0.5, 
              Product(Polynomial(degree=2), RBF(gamma=1.0))))

clf = sAwMIL(C=0.1, base_kernel=k,
             solver="gurobi", eta=0.95) # here eta is high, since all items in the bag are relevant
clf.fit(ds)
print("Train acc:", clf.score(ds, np.array([b.y for b in ds.bags])))
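
To see what the combined kernel above evaluates to, the following pure-Python sketch mirrors the same composition with plain functions. The lowercase helpers are hypothetical stand-ins for `Sum`, `Scale`, `Product`, and the base kernels, and the polynomial form (x·y + 1)^degree is an assumption, since the package's default polynomial parameters are not stated here.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear():
    return lambda x, y: dot(x, y)


def polynomial(degree=2, coef0=1.0):
    # Assumed form: (x . y + coef0) ** degree
    return lambda x, y: (dot(x, y) + coef0) ** degree


def rbf(gamma=1.0):
    return lambda x, y: math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def k_sum(*kernels):
    return lambda x, y: sum(k(x, y) for k in kernels)


def k_product(*kernels):
    return lambda x, y: math.prod(k(x, y) for k in kernels)


def k_scale(c, kernel):
    return lambda x, y: c * kernel(x, y)


# Same structure as Sum(Linear(), Scale(0.5, Product(Polynomial(2), RBF(1.0))))
k = k_sum(linear(), k_scale(0.5, k_product(polynomial(degree=2), rbf(gamma=1.0))))

x = (1.0, 0.0)
y = (0.0, 1.0)
print(k(x, x))  # 1 + 0.5 * (4 * 1) = 3.0
print(k(x, y))  # 0 + 0.5 * (1 * exp(-2))
```

Sums, products, and positive scalings of valid kernels are again valid kernels, which is why such compositions can be passed to a kernel machine as a single base kernel.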

Citation

If you use the sawmil package in academic work, please cite:

Savcisens, G. & Eliassi-Rad, T. sAwMIL: Python package for Sparse Multiple-Instance Learning (2025).

@software{savcisens2025sawmil,
  author = {Savcisens, Germans and Eliassi-Rad, Tina},
  title = {sAwMIL: Python package for Sparse Multiple-Instance Learning},
  year = {2025},
  doi = {10.5281/zenodo.16990499},
  url = {https://github.com/carlomarxdk/sawmil}
}

If you want to reference a specific version of the package, find the correct DOI here.
