iQual is a package that leverages natural language processing to scale up interpretative qualitative analysis. It also provides methods to assess the bias, interpretability and efficiency of the machine-enhanced codes.
iQual
This repository contains the code and resources necessary to implement the techniques described in the paper A Method to Scale-Up Interpretative Qualitative Analysis, with an Application to Aspirations in Cox's Bazaar, Bangladesh. The iQual package is designed for qualitative analysis of open-ended interviews. It uses natural language processing to extend a small set of interpretative human codes to a much larger set of documents, and it provides methods for assessing the robustness and reliability of this approach. The package has been applied to 2,200 open-ended interviews on parents' aspirations for their children, conducted with Rohingya refugees and their Bangladeshi hosts in Cox's Bazaar, Bangladesh.
With iQual, researchers can efficiently analyze large amounts of qualitative data while maintaining the nuance and accuracy of human interpretation.
Installation
- To install `iQual` using pip, use the following command:
pip install -U iQual
Getting Started
For a quick introduction to using iQual, check out our Getting Started notebook. This tutorial provides:
- A complete overview of the basic workflow
- Step-by-step examples using real-world data
- Clear explanations of key concepts
- Code you can run immediately
This notebook is perfect for new users who want to understand iQual's core functionality without diving into the technical details.
Model Training
The Model Training notebook demonstrates advanced training techniques including cross-validation and hyperparameter optimization using the same politeness dataset from Getting Started.
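As a rough illustration of what cross-validation and hyperparameter optimization involve, the sketch below uses plain scikit-learn rather than the iQual API. The toy sentences, labels, and parameter grid are invented for this example and are not taken from the notebook or its politeness dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Invented toy corpus: 1 = polite, 0 = not polite
texts = [
    "thank you so much for your help",
    "could you please take a look",
    "i really appreciate your time",
    "please let me know, thanks",
    "many thanks for the quick reply",
    "would you kindly explain this",
    "this is useless, fix it now",
    "why did you break everything",
    "nobody asked for your opinion",
    "stop wasting my time",
    "that answer is plain wrong",
    "do it yourself next time",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Vectorizer and classifier chained into a single estimator
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical search space: n-gram range and regularization strength
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

# Stratified folds keep the class balance similar in every split
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=cv)
search.fit(texts, labels)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Every parameter combination is scored on held-out folds only, so the selected setting is not simply the one that best memorizes the training data.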
Features
iQual provides the following building blocks for analyzing open-ended interviews:
- Customizable pipelines using scikit-learn pipelines.
- Text vectorization using:
  - Any scikit-learn text feature extraction method.
  - Any sentence-transformers compatible model.
  - Any spaCy model with a `doc.vector` attribute.
- Classification using any scikit-learn classification method.
- Feature transformation:
  - Dimensionality reduction using any scikit-learn `decomposition` method, or UMAP using umap-learn.
  - Feature scaling using any scikit-learn `preprocessing` method.
- Model selection and performance evaluation using scikit-learn methods.
- Tests for bias and interpretability, with statsmodels.
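To make the listed building blocks concrete, here is a minimal sketch that chains text vectorization, dimensionality reduction, feature scaling, and classification with scikit-learn directly. It does not use the iQual API. The documents, labels, step names, and component choices are assumptions made for illustration only.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Invented interview snippets: 1 = mentions an education aspiration
docs = [
    "I want my daughter to study and become a doctor",
    "my son should finish school and go to college",
    "we hope she gets a good education",
    "he must study hard to become a teacher",
    "I want him to work in the family shop",
    "she will get married and look after the home",
    "we want him to earn money as a fisherman",
    "he should help his father with the farm",
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = Pipeline([
    ("vectorize", TfidfVectorizer()),                         # text -> sparse TF-IDF matrix
    ("reduce", TruncatedSVD(n_components=3, random_state=0)),  # sparse -> 3 dense components
    ("scale", StandardScaler()),                              # zero mean, unit variance
    ("classify", LogisticRegression(max_iter=1000)),           # binary classifier
])
model.fit(docs, y)

preds = model.predict(["we hope she finishes school"])
print(preds)
```

Because each step is a named pipeline stage, any one of them can be swapped for another scikit-learn component without touching the rest.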
Basic Usage
The following code demonstrates the basic usage of the iQual package:
from iqual import iqualnlp # Import `iqualnlp` from the `iqual` package
iqual_model = iqualnlp.Model() # Initiate the model class
# Add text features (using TF-IDF vectorization by default)
iqual_model.add_text_features('question', 'answer')
# Add a classifier (Logistic Regression by default)
iqual_model.add_classifier()
# Add a threshold layer for improved performance on imbalanced data
iqual_model.add_threshold()
# Compile the model
iqual_model.compile()
# Fit the model to your data
iqual_model.fit(X_train, y_train)
# Make predictions
y_pred = iqual_model.predict(X_test)
For a more detailed introduction, check out our Getting Started notebook.
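The threshold step in the snippet above is described only as helping with imbalanced data. One common way such a step can work is to search for the probability cut-off that maximizes F1 instead of using the default 0.5. The pure-Python sketch below illustrates that idea. It is an assumption about the general technique, not a description of iQual's internals, and the function names and data are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, probabilities, candidates):
    """Return the (threshold, F1) pair with the highest F1 among candidates."""
    best_t, best_f1 = None, -1.0
    for t in candidates:
        y_pred = [1 if p >= t else 0 for p in probabilities]
        score = f1_score(y_true, y_pred)
        if score > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, score
    return best_t, best_f1


# Invented imbalanced example: 2 positives out of 11, all scored below 0.5
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
probabilities = [0.03, 0.08, 0.12, 0.18, 0.22, 0.28, 0.33, 0.38, 0.14, 0.47, 0.41]
candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

threshold, score = best_threshold(y_true, probabilities, candidates)
default_f1 = f1_score(y_true, [1 if p >= 0.5 else 0 for p in probabilities])

print(threshold, round(score, 3), default_f1)
```

In this toy data the default cut-off of 0.5 predicts no positives at all, while the searched threshold recovers both rare positives at the cost of one false positive.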
Notebooks
The notebooks folder contains detailed examples on using iQual:
- Getting Started: A complete introduction to iQual with a self-contained example for new users.
- Basic Modelling: These notebooks demonstrate the basic usage of the package, the pipeline construction, and the vectorization and classification options.
- Advanced Modelling: These notebooks demonstrate advanced pipeline construction, mixing and matching of feature extraction and classification methods, and model selection.
- Interpretability: These notebooks demonstrate the interpretability tests used to measure and compare interpretability across human and enhanced (machine + human) codes.
- Bias and Efficiency: These notebooks demonstrate the bias and efficiency tests for determining the value and validity of enhanced codes.
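To give a feel for what a bias check asks, the sketch below tests whether machine-coding errors differ between two groups of respondents. It uses a simple permutation test from the standard library, swapped in here for brevity. It is not the statsmodels-based test used in the notebooks, and the group names and error values are invented.

```python
import random
from statistics import mean


def permutation_p_value(errors_a, errors_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in mean error between groups."""
    rng = random.Random(seed)
    observed = abs(mean(errors_a) - mean(errors_b))
    pooled = list(errors_a) + list(errors_b)
    n_a = len(errors_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign errors to groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly positive
    return (extreme + 1) / (n_permutations + 1)


# Error = machine code minus human code, per document (hypothetical values)
errors_refugee = [0, 0, 1, 0, -1, 0, 0, 1, 0, 0]
errors_host = [0, 1, 0, 0, 1, -1, 0, 0, 1, 0]

p_value = permutation_p_value(errors_refugee, errors_host)
print(p_value)
```

A small p-value would suggest that the errors are systematically related to group membership, which is the kind of pattern a bias test is meant to detect.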
Citation & Authors
If you use this package, please cite the following paper:
Ashwin, Julian; Rao, Vijayendra; Biradavolu, Monica Rao; Chhabra, Aditya; Haque, Arshia; Khan, Afsana Iffat; Krishnan, Nandini.
A Method to Scale-Up Interpretative Qualitative Analysis, with an Application to Aspirations in Cox’s Bazaar, Bangladesh (English). Policy Research Working Paper No. WPS 10046. Paper is funded by the Knowledge for Change Program (KCP). Washington, D.C.: World Bank Group.
http://documents.worldbank.org/curated/en/099759305162210822/IDU0a357362e00b6004c580966006b1c2f2e3996
Maintainers
Please contact the following people for any queries regarding the package:
File details
Details for the file iqual-0.1.3.tar.gz.
File metadata
- Download URL: iqual-0.1.3.tar.gz
- Upload date:
- Size: 20.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 753304e9ea0377a2f9484cefa51ef502282dbf67a08fd37f197c5f83b8193cb8 |
| MD5 | 354c5ee519bfa70a03e974a073f18791 |
| BLAKE2b-256 | c331bda0f4613a5f60a76ae0efd403a479a4c06987db0d450278cac6cc3da386 |
File details
Details for the file iqual-0.1.3-py3-none-any.whl.
File metadata
- Download URL: iqual-0.1.3-py3-none-any.whl
- Upload date:
- Size: 22.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 776eb6c4e14c73f30dd8fdfdd485784a7b26bf7c92c5a0aec239bc34aa5cbc9b |
| MD5 | 0b5792ded447e28b830c4d4effe453ed |
| BLAKE2b-256 | f1823a151cc66a36b519a654731e3b7c6e92c79d200e34f60b60063f1d496e68 |