Therapeutic Conversation Quality Assessment Framework

A multi-dimensional Natural Language Processing (NLP) framework for analyzing and assessing the quality of therapeutic conversations.

🌟 Overview

This project presents a novel approach to evaluating therapeutic conversation quality using NLP techniques and machine learning. Our framework analyzes key conversational dynamics (turn-taking, topic coherence, sentiment, and questioning patterns) to distinguish between high- and low-quality therapeutic interactions, providing insights for mental health professionals.

🔑 Key Features

  • Multi-dimensional Analysis: Evaluates conversations across four key dimensions:

    • Conversation Analytics (turn-taking, word usage patterns)
    • Semantic Analysis (topic coherence and flow)
    • Sentiment Analysis (emotional context)
    • Question Detection (engagement patterns)
  • Advanced ML Classification: Implements multiple classifiers, including Random Forest, CatBoost, and SVM; the best-performing model (SVM) reaches 97.17% accuracy with optimized parameters

  • Robust Data Processing:

    • Handles imbalanced datasets using SMOTE-Tomek
    • Comprehensive outlier detection
    • Feature normalization and preprocessing
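
The outlier-detection and normalization steps listed above can be illustrated with a minimal sketch. The README does not say which methods the framework uses, so the IQR rule and z-score scaling below are assumptions chosen for illustration, and the function names `iqr_outlier_mask` and `zscore_normalize` are hypothetical. SMOTE-Tomek resampling is not shown here.

```python
import statistics


def zscore_normalize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def iqr_outlier_mask(values, k=1.5):
    """Return True for each value outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical feature column: mean words per turn for six conversations.
words_per_turn = [12.0, 15.0, 14.0, 13.0, 16.0, 95.0]

mask = iqr_outlier_mask(words_per_turn)
kept = [v for v, is_outlier in zip(words_per_turn, mask) if not is_outlier]
scaled = zscore_normalize(kept)
```

In this sketch outliers are dropped before scaling so that a single extreme value does not distort the mean and standard deviation used for normalization; whether the framework follows the same order is not stated.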

System Architecture

[Figure: System Architecture for NLP Framework]

📊 Performance Highlights

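| Classifier    | Accuracy | Precision | Recall | F1 Score | AUC Score |
|---------------|----------|-----------|--------|----------|-----------|
| SVM           | 0.9717   | 0.9775    | 0.9667 | 0.9715   | 0.9874    |
| CatBoost      | 0.9600   | 0.9606    | 0.9600 | 0.9600   | 0.9912    |
| Random Forest | 0.9533   | 0.9487    | 0.9600 | 0.9539   | 0.9893    |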
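
For readers less familiar with the reported metrics, here is a minimal sketch of how accuracy, precision, recall, and F1 follow from a set of predictions. It assumes a binary task with label 1 for high-quality conversations; the README does not state the label encoding or averaging scheme, and the function name `classification_metrics` and the toy labels are hypothetical. AUC is omitted because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only; not data from the project.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```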

🛠 Technical Implementation

Feature Extraction Pipeline

  1. Conversation Analytics:

    • Words per turn analysis
    • Turn-taking patterns
    • Statistical measures (std dev, skewness, kurtosis)
  2. Semantic Analysis:

    • Utilizes multiple embedding models:
      • PromCSE
      • Sentence-BERT
      • SAKIL sentence similarity
    • Both overall and turn-order-aware analysis
  3. Sentiment Analysis:

    • Twitter-RoBERTa-base model
    • Sentiment transition tracking
    • Weighted certainty scores
  4. Question Detection:

    • Syntactic pattern recognition
    • Bi-gram analysis
    • Speaker-specific question tracking
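
Steps 1, 3, and 4 of the pipeline above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's implementation: the function names, the cue lists `WH_WORDS` and `QUESTION_BIGRAMS`, and the `(speaker, text)` turn format are all hypothetical, and sentiment labels are assumed to have been produced already by a model. The embedding-based semantic step is left out because it depends on downloading pretrained models.

```python
import statistics
from collections import Counter

# Hypothetical cue lists for illustration; not taken from the project.
WH_WORDS = {"what", "why", "how", "when", "where", "who", "which"}
QUESTION_BIGRAMS = {"can you", "could you", "do you", "did you",
                    "are you", "have you", "would you", "is it"}


def words_per_turn_stats(turns):
    """Mean, std dev, skewness and excess kurtosis of words per turn."""
    counts = [len(text.split()) for _, text in turns]
    mean = statistics.fmean(counts)
    std = statistics.pstdev(counts)
    if std == 0:
        return {"mean": mean, "std": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    z = [(c - mean) / std for c in counts]
    return {
        "mean": mean,
        "std": std,
        "skewness": statistics.fmean([v ** 3 for v in z]),
        "kurtosis": statistics.fmean([v ** 4 for v in z]) - 3.0,
    }


def is_question(text):
    """Flag a turn as a question via '?' or an opening word / bi-gram cue."""
    stripped = text.strip()
    if stripped.endswith("?"):
        return True
    words = stripped.lower().split()
    if not words:
        return False
    return words[0] in WH_WORDS or " ".join(words[:2]) in QUESTION_BIGRAMS


def questions_per_speaker(turns):
    """Count detected questions separately for each speaker."""
    counts = {}
    for speaker, text in turns:
        counts[speaker] = counts.get(speaker, 0) + int(is_question(text))
    return counts


def sentiment_transitions(labels):
    """Count transitions between consecutive per-turn sentiment labels."""
    return Counter(zip(labels, labels[1:]))


# Toy conversation and labels for illustration only.
turns = [
    ("therapist", "How have you been feeling this week?"),
    ("client", "Mostly tired and a bit anxious"),
    ("therapist", "Can you tell me more about that"),
    ("client", "I keep worrying about work"),
]
stats = words_per_turn_stats(turns)
questions = questions_per_speaker(turns)
transitions = sentiment_transitions(
    ["negative", "negative", "neutral", "positive"])
```

The opening-word cues are deliberately crude (a statement beginning with "When" would be misclassified), which is why a real system would combine them with syntactic patterns as the list above describes.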

📦 Requirements

  • Python 3.8+
  • scikit-learn (provides the SVM and Random Forest classifiers)
  • transformers
  • torch
  • pandas
  • numpy
  • catboost

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

This research was supported by:

  • Natural Sciences and Engineering Research Council of Canada (NSERC)
  • New Frontiers in Research Fund
  • LeaCros

📬 Contact

For questions and feedback, please contact Niloy Roy.
