
Implementation of the MatrixRegression (MR) algorithm for online-learning multi-label text classification, by Popa, Zeitouni & Gardarin.


Matrix Regression

Multi-label text classification algorithm for online learning.


Description

Implementation of the MatrixRegression (MR) algorithm for multi-label text classification, usable in an online learning context. The algorithm is presented in the following paper:

Popa, I., Zeitouni, K., Gardarin, G., Nakache, D., & Métais, E. (2007). Text Categorization for Multi-label Documents and Many Categories. pp. 421-426. doi:10.1109/CBMS.2007.108.

Abstract:

In this paper, we propose a new classification method that addresses classification in multiple categories of textual documents. We call it Matrix Regression (MR) due to its resemblance to regression in a high dimensional space. Experiences on a medical corpus of hospital records to be classified by ICD (International Classification of Diseases) code demonstrate the validity of the MR approach. We compared MR with three frequently used algorithms in text categorization that are k-Nearest Neighbors, Centroide and Support Vector Machine. The experimental results show that our method outperforms them in both precision and time of classification.
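To make the idea in the abstract more concrete, here is a toy sketch of a term-by-category weight matrix that is updated incrementally and queried with a threshold. It is an illustration only, not the code of the matrixreg package or the exact method of the paper: the class name `ToyMatrixRegression`, the count-based weights, and the normalization by the best score are all assumptions made for this example.

```python
from collections import defaultdict


class ToyMatrixRegression:
    """Toy term-by-category weight matrix (hypothetical, not the matrixreg code)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        # weights[term][category] -> accumulated co-occurrence count (assumed weighting)
        self.weights = defaultdict(lambda: defaultdict(float))

    def partial_fit(self, docs, labels):
        # Each tokenized document adds one count per (term, category) pair,
        # so new batches can be absorbed without retraining from scratch.
        for tokens, cats in zip(docs, labels):
            for term in tokens:
                for cat in cats:
                    self.weights[term][cat] += 1.0
        return self

    def predict(self, docs):
        results = []
        for tokens in docs:
            # Sum the weight rows of the document's terms: one score per category.
            scores = defaultdict(float)
            for term in tokens:
                for cat, weight in self.weights.get(term, {}).items():
                    scores[cat] += weight
            top = max(scores.values(), default=0.0)
            # Keep categories whose score relative to the best one reaches the threshold.
            kept = [c for c, s in scores.items() if top > 0 and s / top >= self.threshold]
            results.append(sorted(kept))
        return results
```

With this sketch, a document may receive several categories at once, and a higher threshold yields fewer, more confident labels.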

Installation

Install from PyPI with pip:

pip install matrixreg

Usage

from matrixreg import MatrixRegression

mr = MatrixRegression()

# Fit
mr.fit(X_train, y_train)

# Predict
mr.predict(X_test)

# Partial fit
mr.partial_fit(new_X, new_y)
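In an online setting, data usually arrives in chunks. The sketch below shows one way to drive the `fit` / `partial_fit` calls above from a stream of batches. The helpers `iter_batches` and `fit_stream` are hypothetical names introduced here, not part of matrixreg, and the choice to call `fit` on the first batch before any `partial_fit` is an assumption about the intended workflow.

```python
def iter_batches(X, y, batch_size):
    """Yield consecutive (X, y) slices of at most batch_size items."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]


def fit_stream(model, batches):
    """Call fit on the first batch, then partial_fit on each later batch.

    `model` is any object exposing fit(X, y) and partial_fit(X, y),
    such as the MatrixRegression instance shown above.
    """
    first = True
    for X_batch, y_batch in batches:
        if first:
            model.fit(X_batch, y_batch)
            first = False
        else:
            model.partial_fit(X_batch, y_batch)
    return model
```

For example, `fit_stream(mr, iter_batches(X_train, y_train, 100))` would train on the first 100 samples and then update the model 100 samples at a time.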

Parameter optimization

This implementation is scikit-learn-friendly, so it can be tuned with GridSearchCV.

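from matrixreg import MatrixRegression
from sklearn.model_selection import GridSearchCV

# Parameter to optimize
param_grid = [{"threshold": [0.3, 0.6, 0.9]}]

# Initialization: 5-fold cross-validation, scored by micro-averaged F1
mr = MatrixRegression()
clf = GridSearchCV(mr, param_grid, cv=5, verbose=10, n_jobs=-1, scoring="f1_micro")

# Fit
clf.fit(X_train, y_train)

# Results: best threshold and its mean cross-validated score
print(clf.best_params_, clf.best_score_)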
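The grid search above ranks thresholds by `f1_micro`. As a reminder of what that score measures in a multi-label setting, here is a small dependency-free sketch: micro-averaging pools true positives, false positives, and false negatives over all documents and labels before computing F1. The function name `micro_f1` and the use of Python sets for label sets are choices made for this illustration. scikit-learn computes the same quantity on label indicator matrices.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over per-document label sets."""
    tp = fp = fn = 0
    for true, pred in zip(y_true, y_pred):
        true, pred = set(true), set(pred)
        tp += len(true & pred)   # labels correctly predicted
        fp += len(pred - true)   # labels predicted but not present
        fn += len(true - pred)   # labels present but missed
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no labels at all (a convention chosen here).
    return 2 * tp / denom if denom else 0.0
```

Because the counts are pooled, frequent categories weigh more than rare ones, which matters when there are many categories of very different sizes.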

Author

Nicolò Verardo

