Hebrew Psychological Lexicons
This is the official code accompanying the paper "Hebrew Psychological Lexicons", which was presented at CLPsych 2021.
Summary
- A large collection of Hebrew psychological lexicons and word lists
- Easy-to-use Python interface for Hebrew clinical psychology text analysis
- Useful for a range of psychology applications, such as detecting emotional state, well-being, and relationship quality in conversation, and identifying topics (e.g., family, work)
- Lexicons were developed through data-driven means and verified by domain experts (clinical psychologists and psychology students) in a reconciliation process with three judges
- Development and verification relied on a dataset of 872 psychotherapy session transcripts
- Initial results of research studies employing this resource confirm its value
Usage
First, install the package using pip:
pip install hepsylex
or install from source:
git clone https://github.com/natalieShapira/HebrewPsychologicalLexicons
cd HebrewPsychologicalLexicons
python setup.py install
Then, in Python, to load the lexicons:
from hepsylex import Lexicons
lexicons = Lexicons()
and a usable wrapper:
from hepsylex import LexiconsAPI
print(LexiconsAPI.number_of_words("היא אמרה הוא אמר והיא אמרה היא גם אני אני אני"))
# out: 11
print(LexiconsAPI.number_of_words_in_lexicon("היא אמרה הוא אמר והיא אמרה היא גם אני אני אני", lexicons.DataDrivenSupervised_WellBeing_NonClinical))
# out: 5
print(LexiconsAPI.number_of_words_in_lexicon("זה בכלל לא מעניין אותי אני אדיש לזה", lexicons.EmotionalVariety_Calm))
# out: 1
print(LexiconsAPI.number_of_words_in_lexicon("זה בכלל לא מעניין אותי אני אדיש לזה", lexicons.EmotionalVariety_NotInterested))
# out: 2
lex1 = lexicons.EmotionalVariety_Calm
lex2 = lexicons.EmotionalVariety_NotInterested
lex3 = LexiconsAPI.lexicons_union([lex1, lex2])
print(LexiconsAPI.number_of_words_in_lexicon("זה בכלל לא מעניין אותי אני אדיש לזה", lex3))
# out: 2
print(LexiconsAPI.frequency_of_lexicon("זה בכלל לא מעניין אותי אני אדיש לזה", lex3))
# out: 2.5
The wrapper can also be applied to a pandas DataFrame:
import pandas as pd
df_in = pd.read_csv('Resources/For documentation purposes/df_example.csv')
df_out = pd.DataFrame() #or df_in
LexiconsAPI.df_to_lexicons(df_in, df_out, lexicons, 'story','story')
df_out.to_csv('./Resources/For documentation purposes/df_example_out.csv', index=False)
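To make the counts in the examples above easier to interpret, here is a minimal, self-contained sketch of lexicon counting, lexicon union, and lexicon frequency using plain Python sets. It is not the hepsylex implementation: the helper names (`count_words`, `count_in_lexicon`, `lexicon_frequency`), the whitespace tokenization, the exact-match lookup, and the toy English word lists are all assumptions made for illustration only.

```python
# Illustrative sketch only; NOT the hepsylex implementation.
# Assumptions: tokens are split on whitespace and matched exactly,
# and each lexicon is modeled as a plain Python set of words.

def count_words(text):
    """Number of whitespace-separated tokens in the text."""
    return len(text.split())

def count_in_lexicon(text, lexicon):
    """Number of tokens (repeats included) that appear in the lexicon."""
    return sum(1 for token in text.split() if token in lexicon)

def lexicon_frequency(text, lexicon):
    """Share of tokens that appear in the lexicon (0.0 for empty text)."""
    total = count_words(text)
    return count_in_lexicon(text, lexicon) / total if total else 0.0

# Hypothetical toy lexicons; the real lexicons ship with the package.
calm = {"calm", "relaxed"}
not_interested = {"indifferent", "bored"}

# A union of lexicons is modeled here as a set union.
combined = calm | not_interested

text = "i feel calm but also bored and indifferent today"
print(count_words(text))
print(count_in_lexicon(text, calm))
print(count_in_lexicon(text, combined))
print(lexicon_frequency(text, combined))
```

Under these assumptions, a word that occurs in several merged lexicons is counted once per occurrence in the text, which is why a union can match no more tokens than the sum of its parts.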
Publications
hepsylex was also used for Hebrew psychological information extraction in the following academic studies:
- "Using Computerized Text Analysis to Examine Associations Between Linguistic Features and Clients’ Distress during Psychotherapy" (JCP, 2020)
- "Using topic models to identify clients' functioning levels and alliance ruptures in psychotherapy" (Psychotherapy, 2021)
If you use hepsylex for an academic publication, we'd appreciate a note.
Reference:
Title: Hebrew Psychological Lexicons
Authors: Natalie Shapira, Dana Atzil-Slonim, Daniel Juravski, Moran Baruch, Adar Paz, Dana Stolowicz-Melman, Tal Alfi-Yogev, Roy Azoulay, Adi Singer, Maayan Revivo, Chen Dahbash, Limor Dayan, Tamar Naim, Lidar Gez, Boaz Yanai, Adva Maman, Adam Nadaf, Elinor Sarfati, Amna Baloum, Tal Naor, Ephraim Mosenkis, Matan Kenigsbuch, Badreya Sarsour, Yarden Elias, Liat Braun, Moria Rubin, Jany Gelfand Morgenshteyn, Noa Bergwerk, Noam Yosef, Sivan Peled, Coral Avigdor, Rahav Obercyger, Rachel Mann, Tomer Alper, Inbal Beka, Ori Shapira, Yoav Goldberg
Affiliation: Bar-Ilan University, Israel
Published: Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology, June 2021, Association for Computational Linguistics.
Citation
If you make use of this software for research, we would appreciate the following citation:
@inproceedings{shapira2021hebrew,
title={Hebrew Psychological Lexicons},
author={Shapira, Natalie and Atzil-Slonim, Dana and Juravski, Daniel and Baruch, Moran and Stolowicz-Melman, Dana and Paz, Adar and Alfi-Yogev, Tal and Azoulay, Roy and Singer, Adi and Revivo, Maayan and others},
booktitle={Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access},
pages={55--69},
year={2021}
}
Licensing Highlights
- The code is provided under the Apache 2.0 license, as is, and without warranties.
- The data (word lists and lexicons) are provided under the Creative Commons CC-BY-SA license, as is, and without warranties.