
A Python package for running psychometric questionnaires on LLMs.

Project description

Indicators of Resilience (IoR)

In natural language processing (NLP), semantic relationships between words can be captured using
a variety of approaches, such as semantic word embeddings, transformer-based language models (e.g., BERT), encoder-decoder models (e.g., T5 and BART), and others. While most embedding techniques consider the contexts of words, some consider sub-word components or even phonetics.[^1] Learning contextual language representations using transformers[^2] drove rapid progress in NLP and led to the development of tools readily accessible to researchers in a variety of disciplines.
In this project, we refer to the various tools used to represent natural language collectively as NLP models.

Most NLP models allow representing words, phrases, sentences, and documents as multidimensional coordinates, so-called embeddings. A vector in this coordinate system represents some concept. The similarity of concepts can be measured by, for example, cosine similarity.[^3][^4]
Coordinates of words may change depending on the language style, mood, and associations prevalent in the corpus on which the NLP models were trained.
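As a minimal illustration of measuring concept similarity by cosine similarity, the sketch below compares toy vectors. The numbers are made up for illustration and do not come from any real NLP model:

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" (illustrative values, not from a real model):
calm    = [0.9, 0.1, 0.0]
relaxed = [0.8, 0.2, 0.1]
anxious = [0.1, 0.9, 0.3]

print(cosine_similarity(calm, relaxed))  # close to 1: similar concepts
print(cosine_similarity(calm, anxious))  # much lower: dissimilar concepts
```

In a real setting the vectors would come from a trained model, and their coordinates, and hence these similarities, would depend on the training corpus, which is exactly the effect described next.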

Consider, for example, two chatbots: one trained on free text from the SuicideWatch peer-support group on Reddit and the other on free text from partymusic on the same platform. Intuitively, the two chatbots would answer the question "How do you feel today?" differently. Now consider the kind of answers these two chatbots would give to anxiety and depression questionnaires.

The above example is overly simplistic in the sense that NLP models cannot be trained on the small amount of data in a single subreddit, and the models' behavior depends on a variety of factors. We use this example only to illustrate the idea of querying an NLP model fitted to a corpus of messages produced by a specific population or after a specific event. Intuitively, the outputs of NLP models are biased toward associations prevalent in the training corpus.

The main working hypothesis driving this library is that NLP models can capture – to a measurable extent – the emotional states reflected in the training corpus. Under emotional state we include depression, anxiety, stress, and burnout. We also include positive aspects of wellbeing such as sense of coherence,[^5] professional fulfillment,[^6] and various coping strategies,[^7] all collectively referred to as Indicators of Resilience (IoRs).

Traditionally, IoRs are measured using questionnaires such as the GAD, PHQ, SPF, and others. This library provides the toolset and guidelines for translating validated psychological questionnaires into queries for trained NLP models.
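To make the idea of translating a questionnaire item into model queries concrete, here is a hypothetical sketch (not qlatent's actual API) of rephrasing a Likert-style item as a set of cloze statements. A masked language model would then assign a probability to each candidate answer word; the sketch only builds the queries and shows how answers map onto a Likert scale. The item wording and the answer-to-score mapping are assumptions made for illustration:

```python
# An example first-person questionnaire item (illustrative, not from a real scale).
ITEM = "Over the last two weeks, I have felt nervous or anxious."

# Candidate answer words mapped to numeric Likert scores (assumed anchors).
LIKERT_ANSWERS = {"never": 0, "rarely": 1, "sometimes": 2, "often": 3, "always": 4}

def build_queries(item, mask_token="[MASK]"):
    """Rephrase the item as cloze statements, one per candidate frequency word."""
    statement = item.rstrip(".").replace("I have felt", f"I have {mask_token} felt")
    return {answer: statement.replace(mask_token, answer) for answer in LIKERT_ANSWERS}

for answer, query in build_queries(ITEM).items():
    print(f"score={LIKERT_ANSWERS[answer]}: {query}")
```

In practice, one would feed each query (or the masked statement itself) to a fill-mask model and weight the Likert scores by the probabilities the model assigns to the answer words.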

[^1]: Ling, S., Salazar, J., Liu, Y., Kirchhoff, K., & Amazon, A. (2020). BERTphone: Phonetically-aware encoder representations for utterance-level speaker and language recognition. In Proc. Odyssey 2020: The Speaker and Language Recognition Workshop (pp. 9-16).
[^2]: Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[^3]: Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111-3119).
[^4]: Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
[^5]: Antonovsky, A. (1987). Unraveling the mystery of health: How people manage stress and stay well. Jossey-Bass.
[^6]: Trockel, M., Bohman, B., Lesure, E., Hamidi, M. S., Welle, D., Roberts, L., & Shanafelt, T. (2018). A brief instrument to assess both burnout and professional fulfillment in physicians: reliability and validity, including correlation with self-reported medical errors, in a sample of resident and practicing physicians. Academic Psychiatry, 42(1), 11-24.
[^7]: Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal, and coping. Springer Publishing Company.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

qlatent-1.0.17.tar.gz (900.9 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

qlatent-1.0.17-py3-none-any.whl (911.7 kB)

Uploaded Python 3

File details

Details for the file qlatent-1.0.17.tar.gz.

File metadata

  • Download URL: qlatent-1.0.17.tar.gz
  • Upload date:
  • Size: 900.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for qlatent-1.0.17.tar.gz:

  • SHA256: 7a534bcbf78eac3a1f8a89f9fd1b66dfd3455ce651f6928e36f2ea07cd441121
  • MD5: 3f5857809f71676865e05991af1eb561
  • BLAKE2b-256: 6044b63ac4cd54c49461245074b6500ce5cb1201afceab13f78f2a19d8983316

See more details on using hashes here.

File details

Details for the file qlatent-1.0.17-py3-none-any.whl.

File metadata

  • Download URL: qlatent-1.0.17-py3-none-any.whl
  • Upload date:
  • Size: 911.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for qlatent-1.0.17-py3-none-any.whl:

  • SHA256: 433e4088d4fd6da678f298cdb1c84c551b443345769dcb5f205e9229a91b7765
  • MD5: a99a751b37292466601ef1110f3c6331
  • BLAKE2b-256: 239be2f0d92155ac9590467f27b17c1e08522492a65f58bf78969f84d23de19b

