Create and quantify 'archetypes' of your constructs. A dictionary-like method run amok!

Archetypes!

This is a library developed to run what might be called a "souped-up dictionary method" for psychological text analysis. Or any kind of text analysis, really.

The core idea behind Archetypes is that you pre-define a set of prototypical sentences that reflect the construct you want to measure in a body of text. The library then uses modern contextual embeddings to aggregate your prototypes into an archetypal representation of your construct. With that representation in hand, you can score each text in your corpus by its semantic similarity to your construct(s) of interest.
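
To make the idea concrete, here is a minimal sketch of the arithmetic, not the library's actual API. It assumes the aggregation is a simple element-wise mean and that similarity is cosine similarity; both are assumptions made for illustration. The tiny 3-dimensional vectors are made-up stand-ins for real sentence embeddings, and the function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings for three prototypical sentences of one construct.
# (Assumption: real embeddings would come from a sentence embedding model.)
prototype_embeddings = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [1.0, 0.0, 0.1],
]

# Aggregate the prototypes into a single archetype vector (assumed: the mean).
archetype = mean_vector(prototype_embeddings)

# Toy embeddings for two texts from a corpus.
text_embeddings = {
    "on-construct text": [0.85, 0.15, 0.05],
    "off-construct text": [0.0, 0.1, 0.95],
}

# Score each text by its similarity to the archetype.
scores = {
    label: cosine_similarity(vector, archetype)
    for label, vector in text_embeddings.items()
}

for label, score in scores.items():
    print(f"{label}: {score:.3f}")
```

In this sketch, a text whose embedding points in roughly the same direction as the averaged prototypes gets a higher score than one that points elsewhere.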

Note: For the curious: no, this approach is not inspired by anything Jungian in nature. In the past, I've said a few things about Jungian archetypes that have inspired scholars to write more than a few frustrated e-mails to me. Apologies to the Jungians.

Installation

This package can be installed with pip using the following command:

pip install archetyper

Requirements

If you want to run the library without pip installing as shown above, you will need to first install the following packages:

  • numpy
  • tqdm
  • torch
  • sentence_transformers
  • nltk

You can try to install these all in one go by running the following command from your terminal/cmd:

pip install numpy tqdm torch sentence_transformers nltk
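
If you want to confirm that the dependencies are in place before running anything, a small check like the following may help. This is a convenience sketch using only the Python standard library; the helper name missing_packages is hypothetical and not part of this package.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages listed above.
required = ["numpy", "tqdm", "torch", "sentence_transformers", "nltk"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

The check only looks for each package; it does not import it, so it stays fast even for heavy dependencies such as torch.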

Examples

I have provided an example notebook in this repo that walks through the basic process of using this library, along with demonstrations of a few important "helper" functions to help you evaluate the statistical/psychometric qualities of your archetypes.

Citation

This method is originally described in the following forthcoming paper:

@inproceedings{varadarajan_archetypes_2024,
	address = {St. Julians, Malta},
	title = {Archetypes and {Entropy}: {Theory}-{Driven} {Extraction} of {Evidence} for {Suicide} {Risk}},
	booktitle = {Proceedings of the {Tenth} {Workshop} on {Computational} {Linguistics} and {Clinical} {Psychology}},
	publisher = {Association for Computational Linguistics},
	author = {Varadarajan, Vasudha and Lahnala, Allison and Ganesan, Adithya V. and Dey, Gourab and Mangalik, Siddharth and Bucur, Ana-Maria and Soni, Nikita and Rao, Rajath and Lanning, Kevin and Vallejo, Isabella and Flek, Lucie and Schwartz, H. Andrew and Welch, Charles and Boyd, Ryan L.},
	year = {2024},
}

The citation above will be updated once the paper is actually published 😊
