
A package for predicting questionnaire scores from reduced item sets


FACtor Score IteM reductIon with Lasso Estimator (FACSIMILE)


This package implements the FACSIMILE method for approximating sum scores, subscale scores, or factor scores based on reduced item sets. Given a scenario where a large number of items are available to measure a latent trait, FACSIMILE selects a subset of items that can be used to approximate the variable that would be obtained if all items were used.

The method uses Lasso-regularised regression to select the items that are most predictive of the scores, and to estimate coefficients for the selected items that can then be used to approximate the scores.
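To make the idea concrete, here is a minimal sketch of Lasso-based item selection using scikit-learn directly. It is not the package's own implementation; the synthetic data, the variable names, and the choice of `alpha=0.1` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Hypothetical questionnaire data: 200 respondents answering 30 items
n_respondents, n_items = 200, 30
X = rng.normal(size=(n_respondents, n_items))

# Assumed ground truth: the score depends on the first 5 items only
true_weights = np.zeros(n_items)
true_weights[:5] = 1.0
y = X @ true_weights + rng.normal(scale=0.1, size=n_respondents)

# The L1 penalty shrinks coefficients of uninformative items to exactly zero
model = Lasso(alpha=0.1)
model.fit(X, y)

# Items with non-zero coefficients form the reduced item set
selected_items = np.flatnonzero(model.coef_)

# Scores approximated from the fitted coefficients
y_approx = model.predict(X)

print("Selected items:", selected_items)
```

Items whose coefficients are shrunk to zero drop out of the reduced set, and the remaining coefficients provide the weights used to approximate the score.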

Installation

First, clone or download this repository. The package can then be installed using pip from the root directory:

git clone https://github.com/the-wise-lab/FACSIMILE.git
cd FACSIMILE
pip install .

Documentation

Documentation and examples are available at https://facsimile.thewiselab.org.

Basic usage

The package can be used to select items and approximate scores for a given dataset. In general, the simplest way to do this is to use the provided optimisation methods, which will evaluate the performance of different levels of regularisation (resulting in different numbers of items being included).

from facsimile.eval import FACSIMILEOptimiser

# Initialise the optimiser
optimiser = FACSIMILEOptimiser(n_iter=100, n_jobs=10)

# Fit candidate models on the training data and evaluate them on the validation data
optimiser.fit(X_train, y_train, X_val, y_val)
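As a rough illustration of what evaluating different levels of regularisation involves, the sketch below sweeps a grid of Lasso penalties with scikit-learn and records how many items each model keeps and how well it predicts held-out scores. This is an assumption-laden stand-in, not how `FACSIMILEOptimiser` works internally; the data, the `alpha` grid, and the `results` structure are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)

# Hypothetical data: 300 respondents, 40 items, score driven by 8 of them
X = rng.normal(size=(300, 40))
weights = np.zeros(40)
weights[:8] = rng.uniform(0.5, 1.5, size=8)
y = X @ weights + rng.normal(scale=0.5, size=300)

# Simple train/validation split
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:], y[200:]

results = []
for alpha in np.logspace(-3, 0, 10):
    # Stronger regularisation (larger alpha) generally keeps fewer items
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X_train, y_train)
    n_items = int(np.count_nonzero(model.coef_))
    r2 = r2_score(y_val, model.predict(X_val))
    results.append({"alpha": float(alpha), "n_items": n_items, "r2": r2})

for row in results:
    print(f"alpha={row['alpha']:.3f}  items={row['n_items']:2d}  R2={row['r2']:.3f}")
```

Each row pairs a number of retained items with a validation score, which is the kind of trade-off the choice of a final model rests on.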

The best performing model can then be selected and used to approximate scores for a new dataset:

# Get the best classifier
best_clf = optimiser.get_best_classifier()

# Fit
best_clf.fit(X_train, y_train)

# Get predictions
y_pred = best_clf.predict(X_test)

Similarly, it is possible to select the best performing model subject to a constraint on the number of items, for example the best model that uses at most 70 items:

# Get the best classifier that uses at most 70 items
best_clf_70 = optimiser.get_best_classifier_max_items(70, metric='min_r2')

# Fit
best_clf_70.fit(X_train, y_train)

# Get predictions
y_pred = best_clf_70.predict(X_test)
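The logic of choosing a model under an item limit can be sketched in plain Python. The helper `best_under_item_limit`, the `candidates` list, and its numbers are hypothetical and purely illustrative; the sketch assumes each candidate has a single scalar validation score and does not reproduce the package's `min_r2` metric.

```python
from typing import Optional


def best_under_item_limit(results: list[dict], max_items: int) -> Optional[dict]:
    """Return the highest-scoring candidate that uses at most max_items items."""
    # Keep only candidates that retain at least one item and respect the limit
    eligible = [r for r in results if 0 < r["n_items"] <= max_items]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r["score"])


# Made-up candidates for illustration only (not real results)
candidates = [
    {"alpha": 0.01, "n_items": 95, "score": 0.97},
    {"alpha": 0.05, "n_items": 64, "score": 0.93},
    {"alpha": 0.10, "n_items": 41, "score": 0.88},
    {"alpha": 0.50, "n_items": 12, "score": 0.71},
]

best = best_under_item_limit(candidates, max_items=70)
print(best)
```

The highest-scoring candidate overall is excluded here because it exceeds the limit, so the function falls back to the best candidate that fits within it.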
