Toolkit for ML-based survey quality control

The ml4qc Python package provides a toolkit for applying machine learning to survey data quality control.

Installation

Install the latest version with pip:

pip install ml4qc
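
To confirm that the install succeeded, one option is to query the package metadata from Python's standard library. This is a minimal sketch; the helper name installed_version is hypothetical and is not part of ml4qc.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "ml4qc") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of the ml4qc API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "ml4qc is not installed in this environment")
```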

Overview

The ml4qc package builds on the scikit-learn toolset. It includes the following utility classes for working with survey data:

  • SurveyML provides core functionality, including preprocessing and outlier detection

  • SurveyMLClassifier builds on SurveyML, adding support for running classification models and reporting results
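
To give a sense of the kind of workflow these classes build on, here is a minimal sketch of preprocessing plus outlier detection written directly against scikit-learn. It does not use or reproduce the ml4qc API (see the reference documentation for that); the column names, the synthetic data, and the choice of IsolationForest are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic survey submissions (hypothetical columns, not from ml4qc).
rng = np.random.default_rng(0)
n = 200
submissions = pd.DataFrame({
    "duration_minutes": rng.normal(45, 5, n),
    "num_skipped": rng.poisson(2, n),
    "enumerator": rng.choice(["A", "B", "C"], n),
})

# Make the first submission implausibly short with many skipped questions.
submissions.loc[0, "duration_minutes"] = 3.0
submissions.loc[0, "num_skipped"] = 30

# Preprocessing: scale numeric columns, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("numeric", StandardScaler(), ["duration_minutes", "num_skipped"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["enumerator"]),
])

# Outlier detection: IsolationForest labels outliers -1 and inliers 1.
model = Pipeline([
    ("preprocess", preprocess),
    ("detector", IsolationForest(contamination=0.05, random_state=0)),
])
labels = model.fit_predict(submissions)

# Submissions flagged for manual review.
flagged = submissions[labels == -1]
print(flagged)
```

The contamination value sets the share of submissions flagged as outliers, so in practice it would be tuned to how many submissions a team can realistically review.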

Examples

This package is best illustrated by way of example. The following example analyses are available:

Documentation

See the full reference documentation here:

https://ml4qc.readthedocs.io/

Development

To develop locally:

  1. git clone https://github.com/orangechairlabs/ml4qc.git

  2. cd ml4qc

  3. python -m venv venv

  4. source venv/bin/activate

  5. pip install -r requirements.txt

For convenience, the repo includes .idea project files for PyCharm.

To rebuild the documentation:

  1. Update version number in /docs/source/conf.py

  2. Update layout or options as needed in /docs/source/index.rst

  3. In a terminal window, from the project directory:
    1. cd docs

    2. SPHINX_APIDOC_OPTIONS=members,show-inheritance sphinx-apidoc -o source ../src/ml4qc --separate --force

    3. make clean html

To rebuild the distribution packages:

  1. For the PyPI package:
    1. Update version number (and any build options) in /setup.py

    2. Confirm credentials and settings in ~/.pypirc

    3. Run /setup.py for bdist_wheel build type (Tools… Run setup.py task… in PyCharm)

    4. Delete old builds from /dist so that only the new build is uploaded

    5. In a terminal window:
      1. twine upload dist/* --verbose

  2. For GitHub:
    1. Commit everything to GitHub and merge to main branch

    2. Add new release, linking to new tag like v#.#.# in main branch

  3. For readthedocs.io:
    1. Go to https://readthedocs.org/projects/ml4qc/, log in, and click to rebuild from GitHub (only if it doesn’t automatically trigger)
