Qualitative Research support tools in Python!

🔍 QRMine

/ˈkärmīn/

Qualitative research involves the collection and analysis of textual data, such as interview transcripts, open-ended survey responses, and field notes. It is often used in social sciences, humanities, and health research to explore complex phenomena and understand human experiences. In addition to textual data, qualitative researchers may also collect quantitative data, such as survey responses or demographic information, to complement their qualitative findings.

Qualitative research is often characterized by its inductive approach, where researchers aim to generate theories or concepts from the data rather than test pre-existing hypotheses. Grounded Theory is a widely used methodology built on this approach; it emphasizes data-driven analysis and theory development.

QRMine is a Python package for qualitative research and for triangulating textual and numeric data in Grounded Theory. It provides Natural Language Processing (NLP) and Machine Learning (ML) tools for analyzing qualitative data, such as interview transcripts, alongside quantitative data, such as survey responses, to support theorizing.

Version 4.0 is a major update with new features and bug fixes. It moves some of the ML dependencies to an optional install. Version 4.0 is a prelude to version 5.0 that will introduce large language models (LLMs) for qualitative research.

✨ Features

🔧 NLP

List common categories for open coding.
Create a coding dictionary with categories, properties and dimensions.
Topic modelling.
Arrange docs according to topics.
Compare two documents/interviews.
Select documents/interviews by sentiment, category or title for further analysis.
Sentiment analysis.
Cluster documents and create visualizations.
Generate a (non-LLM) summary of documents/interviews.

🧠 ML

Accuracy of a neural network model trained using the data
Confusion matrix from a support vector machine classifier
K nearest neighbours of a given record
K-Means clustering
Principal Component Analysis (PCA)
Association rules

🛠️ How to install

Requires Python 3.11

pip install qrmine
python -m spacy download en_core_web_sm

For ML functions (neural networks & SVM), install the optional packages. The quotes keep shells such as zsh from treating the square brackets as a glob pattern:

pip install "qrmine[ml]"

Mac users

Mac users, please install libomp for XGBoost

brew install libomp
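
After running the install commands above, a quick check can confirm that the pieces are in place. This is a minimal sketch, not part of QRMine: the `preflight` helper is hypothetical, and it assumes that "Requires Python 3.11" means 3.11 or newer.

```python
import importlib.util
import sys

# Assumption: "Requires Python 3.11" is read here as "3.11 or newer".
MIN_PYTHON = (3, 11)


def preflight() -> dict[str, bool]:
    """Report which parts of the install described above are present."""
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "qrmine": importlib.util.find_spec("qrmine") is not None,
        # spaCy models install as importable packages, so find_spec sees them.
        "en_core_web_sm": importlib.util.find_spec("en_core_web_sm") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A "missing" entry for en_core_web_sm usually means the `python -m spacy download en_core_web_sm` step has not been run in the active environment.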

🚀 How to Use

Input files are transcripts as txt/pdf files and, optionally, a single csv file with numeric data. The output txt file can be specified. All transcripts can be in a single file, separated by break tags as described below.
The coding dictionary, topics and topic assignments can be created from the entire corpus (all documents) using the respective command-line options.
Categories (concepts), summary and sentiment can be viewed for the entire corpus or for specific titles (documents) specified using the --titles switch. Sentence-level sentiment output is possible with the --sentence flag.
You can filter documents based on sentiment, titles or categories and do further analysis, using --filters or -f.
Many of the ML functions take a second argument, --num or -n, whose meaning depends on the function: the number of epochs for nnet, clusters for kmeans, factors for pca, and neighbours for knn. KNN also takes the --rec or -r argument to specify the record.
Variables from the csv can be selected using --titles (defaults to all). The first variable will be ignored (index) and the last will be the DV (dependent variable).
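
The options described above can be combined into argument lists from a script. The sketch below only assembles commands; the helper names (`build_command`, `run`) and the file names are hypothetical, while the flags themselves are the ones listed in this README.

```python
import shutil
import subprocess

# What -n / --num means for each ML flag, as described in this README.
N_MEANING = {
    "--nnet": "epochs",
    "--kmeans": "clusters",
    "--pca": "factors",
    "--knn": "neighbours",
}


def build_command(inp=None, csv=None, action="--nlp", n=None, rec=None, out=None):
    """Assemble a qrmine argument list without running it."""
    cmd = ["qrmine"]
    if inp:
        cmd += ["--inp", inp]
    if csv:
        cmd += ["--csv", csv]
    cmd.append(action)
    if n is not None:
        cmd += ["--num", str(n)]
    if rec is not None:
        cmd += ["--rec", str(rec)]
    if out:
        cmd += ["--out", out]
    return cmd


def run(cmd):
    """Run the command only if the qrmine executable is on PATH."""
    if shutil.which(cmd[0]) is None:
        print("qrmine not found on PATH; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=False)


# Hypothetical file names, for illustration only.
print(build_command(inp="interviews.txt", action="--codedict"))
print(build_command(csv="survey.csv", action="--knn", n=3, rec=2))
```

Building the list first and passing it to `subprocess.run` avoids shell quoting problems with file names that contain spaces.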

Command-line options

qrmine --help

Command	Alternate	Description
--inp	-i	Input file in text format with <break>Topic</break> tags
--out	-o	Output file name
--csv		csv file name
--num	-n	N (clusters/epochs etc depending on context)
--rec	-r	Record (based on context)
--titles	-t	Document(s) title(s) to analyze/compare
--codedict		Generate coding dictionary
--topics		Generate topic model
--assign		Assign documents to topics
--cat		List categories of entire corpus or individual docs
--summary		Generate summary for entire corpus or individual docs
--sentiment		Generate sentiment score for entire corpus or individual docs
--nlp		Generate all NLP reports
--sentence		Generate sentence level scores when applicable
--nnet		Display accuracy of a neural network model -n epochs(3)
--svm		Display confusion matrix from an svm classifier
--knn		Display nearest neighbours -n neighbours (3)
--kmeans		Display KMeans clusters -n clusters (3)
--cart		Display Association Rules
--pca		Display PCA -n factors (3)

Use it in your code

from qrmine import Content
from qrmine import Network
from qrmine import Qrmine
from qrmine import ReadData
from qrmine import Sentiment
from qrmine import MLQRMine

More instructions and a Jupyter notebook are available here.

Input file format

NLP

Individual documents or interview transcripts go in a single text file, each followed by a <break>Topic</break> tag. Example below:

Transcript of the first interview with John.
Any number of lines
<break>First_Interview_John</break>

Text of the second interview with Jane.
More text.
<break>Second_Interview_Jane</break>

....

Multiple files are supported, each having only one break tag at the bottom with the topic. (The tag may be renamed in the future.)
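
To check that a transcript file follows this layout before passing it to QRMine, the break tags can be parsed with a short script. This is an illustrative sketch and not QRMine's own reader; the `split_transcripts` helper is hypothetical.

```python
import re

# Matches a block of text followed by its closing <break>title</break> tag.
DOC_PATTERN = re.compile(r"(.*?)<break>(.*?)</break>", re.DOTALL)


def split_transcripts(raw: str) -> dict[str, str]:
    """Map each title to the text that precedes its break tag."""
    return {
        title.strip(): body.strip()
        for body, title in DOC_PATTERN.findall(raw)
    }


sample = """Transcript of the first interview with John.
Any number of lines
<break>First_Interview_John</break>

Text of the second interview with Jane.
More text.
<break>Second_Interview_Jane</break>
"""

docs = split_transcripts(sample)
for title, text in docs.items():
    print(title, "->", len(text.split()), "words")
```

Text that appears after the last break tag is silently dropped by this pattern, which makes a missing closing tag easy to spot: the final document simply does not show up.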

ML

A single csv file with the following generic structure.

Column 1 with identifier. If it is related to a text document as above, include the title.
Last column has the dependent variable (DV). (NLP algorithms like the topic assignments may provide the DV.)
All independent variables (numerical) in between.

index, obesity, bmi, exercise, income, bp, fbs, has_diabetes
1, 0, 29, 1, 12, 120, 89, 1
2, 1, 32, 0, 9, 140, 92, 0
......
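
The column roles described above (identifier first, DV last, independent variables in between) can be illustrated with the standard library. This is a sketch of the layout only, under the assumption that all values are numeric; `split_columns` is a hypothetical helper, not a QRMine function.

```python
import csv
import io

sample = """index, obesity, bmi, exercise, income, bp, fbs, has_diabetes
1, 0, 29, 1, 12, 120, 89, 1
2, 1, 32, 0, 9, 140, 92, 0
"""


def split_columns(handle):
    """Return (feature_names, dv_name, ids, X, y) from a csv in this layout."""
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.reader(handle, skipinitialspace=True)
    header = next(reader)
    ids, X, y = [], [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        ids.append(row[0])                       # first column: identifier
        X.append([float(v) for v in row[1:-1]])  # middle columns: independent variables
        y.append(float(row[-1]))                 # last column: dependent variable
    return header[1:-1], header[-1], ids, X, y


features, dv, ids, X, y = split_columns(io.StringIO(sample))
print("features:", features)
print("DV:", dv)
print("first row:", ids[0], X[0], y[0])
```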

Author

Bell Eapen (UIS)

Citation

Please cite QRMine in your publications if it helped your research. Citation information will be available soon.

Give us a star ⭐️

If you find this project useful, give us a star. It helps others discover the project.

Release history

Version	Release date
4.0.0 (this version)	May 7, 2025
3.9.0	Nov 10, 2024
3.8.4	Feb 23, 2023
3.8.3	Apr 3, 2022
3.8.1	Aug 31, 2021
3.7.6	Jul 12, 2021
3.7.5	Jul 12, 2021
3.6.2	Oct 28, 2020
3.6.0	Oct 28, 2020
3.5.0	Jun 7, 2020
3.4.0	Mar 31, 2020
3.3.0	Dec 13, 2019
3.2.0	Dec 13, 2019
3.1.0	Nov 6, 2019
3.0.0	Nov 6, 2019
2.3.0	Nov 6, 2019
2.2.0	Nov 6, 2019
2.1.2	Jul 23, 2019
2.0.0	Jul 14, 2019
0.0.0	Aug 31, 2021

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

qrmine-4.0.0-py3-none-any.whl (44.3 kB)

Uploaded May 7, 2025 Python 3

File details

Details for the file qrmine-4.0.0-py3-none-any.whl.

File metadata

Download URL: qrmine-4.0.0-py3-none-any.whl
Upload date: May 7, 2025
Size: 44.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for qrmine-4.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`750ef3c9f9e32746a6fe050280c486e29e0de150a14e32f09b097ce6a6a3b19b`
MD5	`e5b75a284243f3ebbe284a18b359b7cf`
BLAKE2b-256	`086f592aef9b6390fd1b450706eeb911a85c82a114d2f3b7a3cc633a2ee04be0`
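
A downloaded wheel can be checked against the SHA256 digest listed above. The sketch below is one way to do that with the standard library; the `sha256_of` helper and the assumption that the wheel sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the table above.
EXPECTED_SHA256 = "750ef3c9f9e32746a6fe050280c486e29e0de150a14e32f09b097ce6a6a3b19b"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


wheel = Path("qrmine-4.0.0-py3-none-any.whl")  # assumed download location
if wheel.exists():
    print("match" if sha256_of(wheel) == EXPECTED_SHA256 else "MISMATCH")
else:
    print(f"{wheel} not found; download it first")
```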
