
A Python package for running psychometric questionnaires on LLMs.

Project description

Indicators of Resilience (IoR)

In natural language processing (NLP), semantic relationships between words can be captured using
a variety of approaches, such as semantic word embeddings, transformer-based language models (à la BERT), encoder-decoder models (à la T5 and BART), and others. While most embedding techniques consider the contexts of words, some consider sub-word components or even phonetics.[^1] Learning contextual language representations using transformers[^2] drove rapid progress in NLP and led to the development of tools that are readily accessible to researchers in a variety of disciplines.
In this project, we refer to the various tools used to represent natural language collectively as NLP models.

Most NLP models can represent words, phrases, sentences, and documents as multidimensional coordinates, so-called embeddings. A vector in this coordinate system represents some concept, and the similarity of two concepts can be measured by, for example, the cosine similarity of their vectors.[^3][^4]
Coordinates of words may change depending on the language style, mood, and associations prevalent in the corpus on which the NLP models were trained.
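As a toy illustration of measuring similarity between embedding vectors (the 3-dimensional vectors below are invented for the example, not real model outputs), cosine similarity can be computed as:

```python
import numpy as np

def cosine_similarity(u, v):
    # cos(theta) = (u . v) / (|u| * |v|)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 3-dimensional "embeddings" (illustrative values only)
happy = np.array([0.9, 0.1, 0.3])
joyful = np.array([0.8, 0.2, 0.35])
anxious = np.array([0.1, 0.9, 0.4])

sim_close = cosine_similarity(happy, joyful)   # near-synonyms -> high similarity
sim_far = cosine_similarity(happy, anxious)    # distant concepts -> lower similarity
print(sim_close > sim_far)  # True
```

In a real setting the vectors would come from an NLP model, and the similarities would reflect the associations learned from its training corpus.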

Consider, for example, two chatbots: one trained on free text from the SuicideWatch peer-support subreddit and the other on free text from the partymusic subreddit. Intuitively, the two chatbots would answer the question "How do you feel today?" differently. Now consider the kinds of answers these two chatbots would give to anxiety and depression questionnaires.

The above example is overly simplistic: NLP models cannot be trained from scratch on the small amount of data in a single subreddit, and a model's behavior depends on many other factors. We use it only to illustrate the idea of querying an NLP model fitted to a corpus of messages produced by a specific population or after a specific event. Intuitively, the outputs of NLP models are biased toward the associations prevalent in their training corpus.

The main working hypothesis driving this library is that NLP models can capture, to a measurable extent, the emotional states reflected in their training corpus. Under emotional state we include depression, anxiety, stress, and burnout, as well as positive aspects of well-being such as sense of coherence,[^5] professional fulfillment,[^6] and various coping strategies,[^7] all collectively referred to as Indicators of Resilience (IoRs).

Traditionally, IoRs are measured using questionnaires such as the GAD, PHQ, SPF, and others. This library provides the toolset and guidelines for translating validated psychological questionnaires into queries for trained NLP models.
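To make the idea concrete, here is a minimal sketch of one way a questionnaire item could become a model query. This is not qlatent's actual API: the template, the filler words, and the probability numbers are all invented for illustration, and `mask_probabilities` stands in for a real masked-language-model call (e.g. a fill-mask pipeline).

```python
# A GAD-style item turned into a fill-in-the-blank query; the model's
# preference among frequency adverbs stands in for a self-rating.
ITEM_TEMPLATE = "I {} feel nervous, anxious, or on edge."
SCALE = {"never": 0, "sometimes": 1, "often": 2, "always": 3}

def mask_probabilities(template):
    # Placeholder for a real masked language model's distribution
    # over the blank. Illustrative numbers only.
    return {"never": 0.10, "sometimes": 0.30, "often": 0.40, "always": 0.20}

def item_score(template, scale):
    # Expected Likert score under the model's distribution over fillers.
    probs = mask_probabilities(template)
    total = sum(probs[w] for w in scale)
    return sum(scale[w] * probs[w] for w in scale) / total

score = item_score(ITEM_TEMPLATE, SCALE)
print(round(score, 2))  # 1.7 with the placeholder numbers above
```

Scoring many such items and aggregating them, as a human-administered questionnaire would, is the kind of workflow the library is built around.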

[^1]: Ling, S., Salazar, J., Liu, Y., Kirchhoff, K., & Amazon, A. (2020). BertPhone: Phonetically-aware encoder representations for utterance-level speaker and language recognition. In Proc. Odyssey 2020: The Speaker and Language Recognition Workshop (pp. 9-16).
[^2]: Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[^3]: Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111-3119).
[^4]: Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
[^5]: Antonovsky, A. (1987). Unraveling the mystery of health: How people manage stress and stay well. Jossey-Bass.
[^6]: Trockel, M., Bohman, B., Lesure, E., Hamidi, M. S., Welle, D., Roberts, L., & Shanafelt, T. (2018). A brief instrument to assess both burnout and professional fulfillment in physicians: Reliability and validity, including correlation with self-reported medical errors, in a sample of resident and practicing physicians. Academic Psychiatry, 42(1), 11-24.
[^7]: Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal, and coping. Springer Publishing Company.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

qlatent-1.0.14.tar.gz (900.6 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

qlatent-1.0.14-py3-none-any.whl (911.5 kB)

Uploaded Python 3

File details

Details for the file qlatent-1.0.14.tar.gz.

File metadata

  • Download URL: qlatent-1.0.14.tar.gz
  • Upload date:
  • Size: 900.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for qlatent-1.0.14.tar.gz
  • SHA256: 3e073ce3551e6689a1e2eabff0aff8e6f31a356cb32929beeb972c01d84b4f7a
  • MD5: 4866486e8a39828b1574f37f5c727a61
  • BLAKE2b-256: d4775720565c871cec0f852ae62e1ea94341b3ee7fd1d6632c16a4386cf10891

See more details on using hashes here.
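A downloaded archive can be checked against the SHA256 digest above using only the Python standard library (this assumes the file has been saved to the current directory; the expected digest is copied from the listing above):

```python
import hashlib

def sha256_of_file(path, chunk_size=8192):
    # Stream the file in chunks so large archives need not fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "3e073ce3551e6689a1e2eabff0aff8e6f31a356cb32929beeb972c01d84b4f7a"
# Uncomment after downloading the archive:
# print(sha256_of_file("qlatent-1.0.14.tar.gz") == expected)
```

`pip install` can perform the same check automatically when a `--hash` option is supplied in a requirements file.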

File details

Details for the file qlatent-1.0.14-py3-none-any.whl.

File metadata

  • Download URL: qlatent-1.0.14-py3-none-any.whl
  • Upload date:
  • Size: 911.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for qlatent-1.0.14-py3-none-any.whl
  • SHA256: 344e26a6508aa99c2e5e5a4e5a41d90291d43f24051acb1436ac9ce8c6dd7b95
  • MD5: 20f11f7f46b5c7c7e66b785e0f7d200e
  • BLAKE2b-256: 39f1ea39c470ae669256b0b50510441927a73e3ca0fac1e32840772f306bca05

See more details on using hashes here.
