
A Python package for running psychometric questionnaires on LLMs.

Project description

Indicators of Resilience (IoR)

In natural language processing (NLP), semantic relationships between words can be captured using
a variety of approaches, such as semantic word embeddings, transformer-based language models (à la BERT), encoder-decoder models (à la T5 and BART), and others. While most embedding techniques consider the contexts of words, some consider sub-word components or even phonetics.[^1] Learning contextual language representations using transformers[^2] drove rapid progress in NLP and led to the development of tools readily accessible to researchers in a variety of disciplines.
In this project, we refer to the various tools used to represent natural language collectively as NLP models.

Most NLP models allow representing words, phrases, sentences, and documents as multidimensional coordinates, so-called embeddings. A vector in this coordinate system represents some concept, and the similarity of two concepts can be measured by, for example, the cosine similarity of their vectors.[^3][^4]
The coordinates of words may change depending on the language style, mood, and associations prevalent in the corpus on which the NLP model was trained.
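As a minimal sketch of the similarity measure mentioned above (the three-dimensional "embeddings" below are invented for illustration; real models use hundreds of dimensions):

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" of three concepts.
calm    = [0.9, 0.1, 0.2]
relaxed = [0.8, 0.2, 0.3]
anxious = [0.1, 0.9, 0.7]

print(cosine_similarity(calm, relaxed))  # close to 1: similar concepts
print(cosine_similarity(calm, anxious))  # noticeably lower
```

Vectors pointing in nearly the same direction score close to 1, regardless of their lengths, which is why cosine similarity is a common choice for comparing embeddings.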

Consider, for example, two chatbots: one trained on free text from the SuicideWatch peer-support subreddit and the other on free text from the partymusic subreddit. Intuitively, the two chatbots would answer the question "How do you feel today?" differently. Now consider the kinds of answers these two chatbots would give to anxiety and depression questionnaires.

The above example is overly simplistic in the sense that NLP models cannot be trained on the small amount of data available in a single subreddit, and the models' behavior depends on a variety of factors. We use this example only to illustrate the idea of querying an NLP model fitted to a corpus of messages produced by a specific population or after a specific event. Intuitively, the outputs of NLP models are biased toward the associations prevalent in the training corpus.

The main working hypothesis driving this library is that NLP models can capture – to a measurable extent – the emotional states reflected in their training corpus. Under emotional state we include depression, anxiety, stress, and burnout. We also include positive aspects of well-being such as sense of coherence,[^5] professional fulfillment,[^6] and various coping strategies,[^7] all collectively referred to as Indicators of Resilience (IoRs).

Traditionally, IoRs are measured using questionnaires such as the GAD, PHQ, SPF, and others. This library provides a toolset and guidelines for translating validated psychological questionnaires into queries for trained NLP models.
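To make the idea concrete, here is a minimal sketch of turning a questionnaire item into an embedding-space query. This is not the qlatent API; the tiny word vectors, the item wording, and the helper names are all invented for illustration. The item is embedded and compared against two answer anchors, and the sign of the score says which anchor the model's vector space places closer to the item.

```python
from math import sqrt

# Invented toy word vectors standing in for a trained NLP model's embeddings.
WORD_VECTORS = {
    "i":       [0.1, 0.1],
    "feel":    [0.2, 0.3],
    "nervous": [0.1, 0.9],
    "worried": [0.2, 0.8],
    "calm":    [0.9, 0.1],
}

def embed(text):
    """Average the word vectors of a phrase (a common bag-of-words baseline)."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split()]
    return [sum(dims) / len(vectors) for dims in zip(*vectors)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def score_item(item, negative_anchor, positive_anchor):
    """Return a score in [-1, 1]: positive when the item sits closer to the
    positive anchor than to the negative anchor in the embedding space."""
    e = embed(item)
    return cosine(e, embed(positive_anchor)) - cosine(e, embed(negative_anchor))

# A GAD-style item: does this vector space place "feeling nervous" nearer
# to "worried" than to "calm"?
print(score_item("i feel nervous", "calm", "worried"))  # positive here
```

A model fitted to a different corpus would assign different vectors to the same words, and the same item would score differently, which is exactly the effect the library aims to measure.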

[^1]: Ling, S., Salazar, J., Liu, Y., Kirchhoff, K., & Amazon, A. (2020). BERTphone: Phonetically-aware encoder representations for utterance-level speaker and language recognition. In Proc. Odyssey 2020: The Speaker and Language Recognition Workshop (pp. 9-16).
[^2]: Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[^3]: Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111-3119).
[^4]: Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
[^5]: Antonovsky, A. (1987). Unraveling the mystery of health: How people manage stress and stay well. Jossey-Bass.
[^6]: Trockel, M., Bohman, B., Lesure, E., Hamidi, M. S., Welle, D., Roberts, L., & Shanafelt, T. (2018). A brief instrument to assess both burnout and professional fulfillment in physicians: reliability and validity, including correlation with self-reported medical errors, in a sample of resident and practicing physicians. Academic Psychiatry, 42(1), 11-24.
[^7]: Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal, and coping. Springer Publishing Company.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

qlatent-1.0.19.tar.gz (905.0 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

qlatent-1.0.19-py3-none-any.whl (913.8 kB)

Uploaded Python 3

File details

Details for the file qlatent-1.0.19.tar.gz.

File metadata

  • Download URL: qlatent-1.0.19.tar.gz
  • Upload date:
  • Size: 905.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

Hashes for qlatent-1.0.19.tar.gz

  • SHA256: d753f58ddfb33ad034e4b73ab60ac57b95cff90d6c24ba84eb7063e6b88e8054
  • MD5: 1b00b395e5a75f461dc04e29833cea87
  • BLAKE2b-256: babc4ba0d5520f38297299cdc4f3089bbb6c3133a78da37c421f6c7f649c9ed6

See more details on using hashes here.

File details

Details for the file qlatent-1.0.19-py3-none-any.whl.

File metadata

  • Download URL: qlatent-1.0.19-py3-none-any.whl
  • Upload date:
  • Size: 913.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

Hashes for qlatent-1.0.19-py3-none-any.whl

  • SHA256: 8bcf61cc163cb993088273365885e87c8511863f17ccffbc3e237393887fba89
  • MD5: 1834ff4695c68ec0d81530427292679e
  • BLAKE2b-256: 7faf0e45514ab11ef498002a4f0f104cf10d209c2f73a9013606b20d1850f1c3

See more details on using hashes here.
