Computing schematicity of autobiographical narratives
Measuring narrative schematicity
Methods from the paper "Computational Tools for Quantifying Schemas in Autobiographical Narratives".
Installation
pip install narsche
narsche depends on networkx (for network models), spaCy (for tokenization), and wordfreq (for automated topic identification). Additionally, one of spaCy's models must be downloaded for spaCy-based tokenization:
python -m spacy download en_core_web_sm
Usage
Loading and saving models
A text file of word vectors can be read using the read_vectors() function:
vec_mod = narsche.read_vectors('/path/to/vectors.txt')
This produces a vector model. The text file must be formatted such that the first token (space-delimited) on a line is the word for which the remaining tokens are the vector components. This is how, for example, the GloVe embeddings are formatted.
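To make the expected layout concrete, the following minimal sketch writes a tiny file in that format and parses it with plain Python. The helper name parse_vectors and the toy three-dimensional vectors are illustrative assumptions; this is not how read_vectors() is implemented, only a picture of the file layout described above.

```python
import os
import tempfile

def parse_vectors(path):
    """Parse a space-delimited vector file into {word: [floats]} (hypothetical helper)."""
    vectors = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            # First token is the word, the rest are the vector components
            word, components = parts[0], parts[1:]
            vectors[word] = [float(x) for x in components]
    return vectors

# Toy file with made-up 3-dimensional vectors (real embeddings are much longer)
toy_lines = ['sofa 0.1 0.2 0.3', 'lamp 0.4 0.5 0.6']
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('\n'.join(toy_lines) + '\n')
    toy_path = tmp.name

toy_vectors = parse_vectors(toy_path)
os.remove(toy_path)  # clean up the temporary file
print(toy_vectors['lamp'])
```

A file that parses cleanly this way should be in the shape read_vectors() expects, one word and its components per line.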
Initializing a network model requires first loading a networkx.Graph object:
import networkx as nx
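# networkx has no load() function; use the reader that matches the file format
# (read_edgelist is shown here as an assumption)
graph = nx.read_edgelist('/path/to/graph')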
net_mod = narsche.NetworkModel(graph)
A script for setting up a network model can be found here.
Models can be saved using the save() method and loaded using the load() class method:
net_mod.save('network.mod')
net_mod = narsche.NetworkModel.load('network.mod')
vec_mod.save('vector.mod')
vec_mod = narsche.VectorModel.load('vector.mod')
These are thin wrappers around pickle.load() and pickle.dump(), so any file extension can be used.
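As a rough picture of what such a wrapper pair can look like, here is a minimal sketch built on a hypothetical ToyModel class. The class, its data attribute, and the choice to pickle only that attribute are assumptions made for illustration; narsche's own classes may store and serialize their state differently.

```python
import os
import pickle
import tempfile

class ToyModel:
    """Hypothetical stand-in for a model class; not part of narsche."""

    def __init__(self, data):
        self.data = data

    def save(self, path):
        # Serialize the model's data to disk with pickle.dump
        with open(path, 'wb') as f:
            pickle.dump(self.data, f)

    @classmethod
    def load(cls, path):
        # Rebuild a model from data previously written by save()
        with open(path, 'rb') as f:
            return cls(pickle.load(f))

toy_path = os.path.join(tempfile.mkdtemp(), 'toy.mod')
ToyModel({'sofa': [0.1, 0.2]}).save(toy_path)
restored = ToyModel.load(toy_path)
print(restored.data)
```

Because unpickling can execute arbitrary code, pickled model files should only be loaded when they come from a trusted source.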
Tokenizing narratives
Before schematicity can be computed, narratives must be tokenized, i.e., converted to a list of tokens. The Tokenizer class, which relies on spaCy, handles this:
txt = 'I sat on the sofa in my living room with a lamp' # Example text
tokenizer = narsche.Tokenizer('en_core_web_sm') # Initialize tokenizer
words = tokenizer.tokenize(txt) # Tokenize words
words = vec_mod.keep_known(words) # Use only those words that are in the model
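Conceptually, the last step drops any token the model has no representation for. A rough sketch of that filtering follows, using a hypothetical vocabulary set and a made-up token list; it is not narsche's implementation or real tokenizer output, and whether keep_known() preserves order and repeats is an assumption here.

```python
def keep_known(tokens, vocabulary):
    """Return the tokens found in vocabulary, preserving order and repeats."""
    return [tok for tok in tokens if tok in vocabulary]

# Hypothetical model vocabulary and token list, for illustration only
vocabulary = {'sofa', 'room', 'lamp'}
tokens = ['sit', 'sofa', 'living', 'room', 'lamp']

known = keep_known(tokens, vocabulary)
print(known)
```

Filtering up front keeps later computations from failing on words the model cannot look up.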
Computing schematicity
Given a model and a set of tokens (and possibly a topic word), schematicity can be computed using the schematicity() function:
topic = narsche.identify_topic(words) # Identify the topic
# Compute schematicity
narsche.schematicity(
    words=words,
    model=vec_mod,  # the vector model loaded above
    method='on-topic-ppn', # or topic-relatedness, pairwise-relatedness, or component-size
    topic=topic)
See the documentation of the schematicity() function for the keywords required by other methods.
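This page does not spell out how each method is computed. Purely as an illustration of the kind of quantity a relatedness measure over a vector model can capture, the sketch below averages cosine similarity over all word pairs. The function names and toy two-dimensional vectors are assumptions, and this should not be read as a reproduction of narsche's pairwise-relatedness method.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_pairwise_similarity(words, vectors):
    """Average cosine similarity over every unordered pair of words."""
    pairs = list(combinations(words, 2))
    if not pairs:
        raise ValueError('need at least two words')
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

# Made-up vectors: 'sofa' and 'lamp' point the same way, 'tax' is orthogonal
toy_vectors = {
    'sofa': [1.0, 0.0],
    'lamp': [1.0, 0.0],
    'tax': [0.0, 1.0],
}
score = mean_pairwise_similarity(['sofa', 'lamp', 'tax'], toy_vectors)
print(round(score, 3))
```

In this toy case only one of the three pairs is related, so the score is one third; a narrative whose words all cluster together would score closer to one.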
File details
Details for the file narsche-0.2.0.tar.gz.
File metadata
- Download URL: narsche-0.2.0.tar.gz
- Upload date:
- Size: 11.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d79856e4cf089538759416d017c59116f470aa2f508fe50b091b6ceb77090a16 |
| MD5 | 6c3a624851243d3a228f6d03e9bcc6a5 |
| BLAKE2b-256 | 3652a98e32360d92b3549ebd1875440cf167bf8f0affc5e556880d0193049338 |
File details
Details for the file narsche-0.2.0-py3-none-any.whl.
File metadata
- Download URL: narsche-0.2.0-py3-none-any.whl
- Upload date:
- Size: 11.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e2489dc67672f11ae4151f39d5678b15b98f6fc224bdb644d1678eed74178c8d |
| MD5 | 4adf12b631a4d1512d08fba8ae45fe34 |
| BLAKE2b-256 | 9b979a4bec58003a450f854b4dcb21c76c4d9fc81f37bf015c56079680e3251c |