Toolkit for ML-based survey quality control
Project description
The ml4qc Python package offers a toolkit for employing machine learning technologies in survey data quality control. Among other things, it helps to extend the surveydata package and advance SurveyCTO’s machine learning roadmap.
Installation
Install the latest version with pip:
pip install ml4qc
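To confirm that the install succeeded, one option is to ask Python's standard packaging metadata which version is present. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of ml4qc.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; it is not part of ml4qc.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if ml4qc is installed in this environment, else None.
print(installed_version("ml4qc"))
```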
Overview
The ml4qc package builds on the scikit-learn toolset. It includes the following utility classes for working with survey data:
SurveyML provides core functionality, including preprocessing and outlier detection
SurveyMLClassifier builds on SurveyML, adding support for running classification models and reporting out results
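To give a feel for the kind of preprocessing and outlier detection described above, here is a minimal sketch written directly against scikit-learn. It does not use the SurveyML or SurveyMLClassifier classes, whose actual interfaces are covered in the reference documentation. The two features, the synthetic data, and the choice of IsolationForest are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, hypothetical submission-level features (not real survey data).
rng = np.random.default_rng(0)
n_submissions = 200
features = np.column_stack([
    rng.normal(45.0, 5.0, n_submissions),  # interview duration in minutes
    rng.uniform(0.0, 0.1, n_submissions),  # share of "don't know" answers
])

# Plant three implausibly short interviews with mostly "don't know" answers.
features[:3] = [[3.0, 0.90], [4.0, 0.85], [2.5, 0.95]]

# IsolationForest does not itself need scaled inputs; the scaler is included
# only to show where a preprocessing step slots into the pipeline.
detector = make_pipeline(
    StandardScaler(),
    IsolationForest(contamination=0.02, random_state=0),
)

# fit_predict returns -1 for rows judged to be outliers and 1 otherwise.
labels = detector.fit_predict(features)
flagged = np.flatnonzero(labels == -1).tolist()
print("Submissions flagged for review:", flagged)
```

The `contamination` value sets roughly what share of submissions gets flagged, so in practice it would be chosen to match how many submissions a team can realistically review.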
Examples
This package is best illustrated by way of example. The following example analyses are available:
Documentation
See the full reference documentation here:
Project support
Dobility has generously provided financial and other support for v1 of the ml4qc package, including support for early testing and piloting.
Development
To develop locally:
git clone https://github.com/orangechairlabs/ml4qc.git
cd ml4qc
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
For convenience, the repo includes .idea project files for PyCharm.
To rebuild the documentation:
- Update the version number in /docs/source/conf.py
- Update layout or options as needed in /docs/source/index.rst
- In a terminal window, from the project directory:
cd docs
SPHINX_APIDOC_OPTIONS=members,show-inheritance sphinx-apidoc -o source ../src/ml4qc --separate --force
make clean html
To rebuild the distribution packages:
- For the PyPI package:
  - Update the version number (and any build options) in /setup.py
  - Confirm credentials and settings in ~/.pypirc
  - Run /setup.py for the bdist_wheel build type (Tools… Run setup.py task… in PyCharm)
  - Delete old builds from /dist
  - In a terminal window:
    twine upload dist/* --verbose
- For GitHub:
  - Commit everything to GitHub and merge to the main branch
  - Add a new release, linking to a new tag like v#.#.# in the main branch
- For readthedocs.io:
  - Go to https://readthedocs.org/projects/ml4qc/, log in, and click to rebuild from GitHub (only needed if the build doesn’t trigger automatically)
File details
Details for the file ml4qc-0.1.3.tar.gz.
File metadata
- Download URL: ml4qc-0.1.3.tar.gz
- Upload date:
- Size: 13.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6d1ff43c4cb5b18b37778c91cd77f625e41de2f81ea05544da8c2b7e24323206
MD5 | da323d0a898d18dcd277e49ace84bd43
BLAKE2b-256 | 8ee5ffebfa9c4883641fd5f4ea06b0ea59d3bfad01f8865ae24c52701a1beae2
File details
Details for the file ml4qc-0.1.3-py3-none-any.whl.
File metadata
- Download URL: ml4qc-0.1.3-py3-none-any.whl
- Upload date:
- Size: 14.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 18011ec7fb42bdb426d3301937726bd16a64d13c509ce533370df47a02c87169
MD5 | 0f11a17feedda6f59f8d7803ad0e83ad
BLAKE2b-256 | 5329805cb264a1a4122a2209e0af00091e7172247f97388d2691e7e7e84e0e0b
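The SHA256 digests listed above can be used to check that a downloaded file is intact. The sketch below uses Python's standard hashlib module. The helper names `sha256_of_file` and `matches_published_digest` are made up for illustration, and the script assumes the wheel has already been downloaded into the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumption: the wheel was downloaded into the current directory.
    wheel = Path("ml4qc-0.1.3-py3-none-any.whl")
    expected = "18011ec7fb42bdb426d3301937726bd16a64d13c509ce533370df47a02c87169"
    if wheel.exists():
        print("Digest matches:", matches_published_digest(wheel, expected))
    else:
        print(f"{wheel} not found; download it first")
```

pip can perform the same check itself when a requirements file pins hashes with `--hash=sha256:...`, which may be more convenient for automated installs.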