
Implementation of the MatrixRegression (MR) algorithm for online-learning multi-label text classification, by Popa, Zeitouni & Gardarin.


Matrix Regression


Table of contents:

  1. Description
  2. Installation
  3. Usage

Description

Implementation of the MatrixRegression (MR) algorithm for multi-label text classification that can be used in an online learning context. It is presented in the following paper:

Popa, I., Zeitouni, K., Gardarin, G., Nakache, D., & Métais, E. (2007). Text Categorization for Multi-label Documents and Many Categories, pp. 421-426. doi:10.1109/CBMS.2007.108.

Abstract:

In this paper, we propose a new classification method that addresses classification in multiple categories of textual documents. We call it Matrix Regression (MR) due to its resemblance to regression in a high dimensional space. Experiences on a medical corpus of hospital records to be classified by ICD (International Classification of Diseases) code demonstrate the validity of the MR approach. We compared MR with three frequently used algorithms in text categorization that are k-Nearest Neighbors, Centroide and Support Vector Machine. The experimental results show that our method outperforms them in both precision and time of classification.
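To make the online, multi-label setting more concrete, here is a toy sketch in plain Python. It is not the code of this package and does not claim to reproduce the paper's exact formulas: it assumes a term-by-category weight matrix updated by simple co-occurrence counts, and a `threshold` interpreted as a fraction of the best category score. The class name `ToyMatrixRegression` and the whitespace tokenization are hypothetical choices made for illustration only.

```python
from collections import defaultdict


class ToyMatrixRegression:
    """Toy term-by-category weight matrix; NOT the matrixreg implementation."""

    def __init__(self, threshold=0.5):
        # Assumption: a label is kept if its score reaches threshold * best score.
        self.threshold = threshold
        # weights[term][label] -> accumulated co-occurrence count
        self.weights = defaultdict(lambda: defaultdict(float))

    def partial_fit(self, docs, label_sets):
        # Online update: every (term, label) co-occurrence adds to one matrix cell,
        # so new documents and new labels can arrive at any time.
        for doc, labels in zip(docs, label_sets):
            for term in doc.lower().split():
                for label in labels:
                    self.weights[term][label] += 1.0
        return self

    def predict(self, docs):
        predictions = []
        for doc in docs:
            scores = defaultdict(float)
            for term in doc.lower().split():
                # .get avoids creating matrix rows for unseen terms
                for label, weight in self.weights.get(term, {}).items():
                    scores[label] += weight
            best = max(scores.values(), default=0.0)
            predictions.append(
                {label for label, s in scores.items() if s >= self.threshold * best}
            )
        return predictions
```

Because training only adds to matrix cells, updating the model with a new batch costs no more than processing that batch, which is what makes this kind of approach suitable for online learning.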

Installation

Install from PyPI with pip:

pip install matrixreg

Usage

from matrixregr.matrixregression import MatrixRegression

mr = MatrixRegression()

# Fit
mr.fit(X_train, y_train)

# Predict
mr.predict(X_test)

# Partial fit: update the model with new data (online learning)
mr.partial_fit(new_X, new_y)
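In an online setting the data usually arrives in batches. The helper below is a minimal sketch of that loop, using only the `fit` and `partial_fit` methods shown above. The function name `train_incrementally` and the shape of `batches` (an iterable of `(X, y)` pairs) are assumptions for illustration, and calling `fit` on the first batch is a conservative choice, since whether `partial_fit` may be called on an unfitted model is not stated here.

```python
def train_incrementally(model, batches):
    """Fit on the first batch, then update with partial_fit on the rest.

    `model` is any object exposing fit(X, y) and partial_fit(X, y);
    `batches` is an iterable of (X_batch, y_batch) pairs.
    """
    fitted = False
    for X_batch, y_batch in batches:
        if not fitted:
            # First batch: regular fit
            model.fit(X_batch, y_batch)
            fitted = True
        else:
            # Later batches: incremental update
            model.partial_fit(X_batch, y_batch)
    return model
```

With the estimator from this package, this would be called as `train_incrementally(MatrixRegression(), batches)`.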

Parameter optimization

This implementation follows scikit-learn conventions, so it can be tuned with GridSearchCV:

from sklearn.model_selection import GridSearchCV

# Candidate values for the parameter to optimize
param_grid = [{"threshold": [0.3, 0.6, 0.9]}]

# Initialization: 5-fold cross-validation scored by micro-averaged F1
mr = MatrixRegression()
clf = GridSearchCV(mr, param_grid, cv=5, verbose=10, n_jobs=-1, scoring="f1_micro")

# Fit
clf.fit(X_train, y_train)

# Results
print(clf.best_params_, clf.best_score_)
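The `f1_micro` score used above pools true positives, false positives and false negatives over all labels before computing F1, so frequent labels weigh more than rare ones. The following plain-Python sketch shows that calculation on label sets. The function name `micro_f1` and the set-based input format are illustrative assumptions, not part of this package or of scikit-learn.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over label sets, one set per document."""
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        true_labels, pred_labels = set(true_labels), set(pred_labels)
        tp += len(true_labels & pred_labels)   # labels correctly predicted
        fp += len(pred_labels - true_labels)   # labels predicted but wrong
        fn += len(true_labels - pred_labels)   # labels that were missed
    denom = 2 * tp + fp + fn
    # Return 0.0 when there is nothing to score
    return 2 * tp / denom if denom else 0.0
```

A higher `threshold` typically trades false positives for false negatives, which is why a pooled measure such as this one is a reasonable target for tuning it.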
