A tool for stylistic device detection.

FreeStylo - an easy-to-use stylistic device detection tool for stylometry

An easy-to-use package for detecting stylistic devices in text. This package is designed to be used in stylometry, the study of linguistic style.

For those proficient in Python, this package provides a collection of approaches to detect stylistic devices in text. For those less proficient in Python, it provides a simple interface with straightforward commands and user-friendly configuration.

Installation

This package requires Python 3.12. It is recommended to create a virtual environment for the package. The package is available on PyPI and can be installed using pip.

pip install freestylo

Configuration

The package can be configured using the configuration file at ~/.config/freestylo/config.json. This file is created when the tool is first run or when the library first needs information from it. Currently, only the model download location can be configured. The model path can also be overridden by setting the environment variable FREESTYLO_MODEL_PATH.

The default configuration is:

{
    "model_path": "~/.freestylo/models/"
}
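
The lookup order described above can be sketched as follows. This is a minimal illustration and not the package's actual code: the helper name `resolve_model_path` is hypothetical, and only the config location, the `model_path` key, the default value, and the `FREESTYLO_MODEL_PATH` variable are taken from the text.

```python
import json
import os
from pathlib import Path

DEFAULT_MODEL_PATH = "~/.freestylo/models/"
CONFIG_FILE = Path("~/.config/freestylo/config.json")


def resolve_model_path(config_file=CONFIG_FILE, environ=None):
    """Hypothetical helper: return the directory where models would be stored."""
    environ = os.environ if environ is None else environ
    # 1) The environment variable overrides everything else.
    override = environ.get("FREESTYLO_MODEL_PATH")
    if override:
        return Path(override).expanduser()
    # 2) Otherwise use "model_path" from the config file, if the file exists.
    config_file = Path(config_file).expanduser()
    if config_file.is_file():
        with config_file.open(encoding="utf-8") as handle:
            config = json.load(handle)
        return Path(config.get("model_path", DEFAULT_MODEL_PATH)).expanduser()
    # 3) Fall back to the documented default.
    return Path(DEFAULT_MODEL_PATH).expanduser()
```

Calling `resolve_model_path()` without arguments reads the real environment and the real config location; passing `environ` and `config_file` explicitly makes the behaviour easy to check in isolation.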

Usage Examples

Standalone Tool

After installation, you can run the following command to test that the tool is installed correctly.

If you've cloned the git repository, run this command in the root of the repository. If you've used pip to install the package, first you'll need to download the input and configuration files.

Download test input and example configuration

wget https://raw.githubusercontent.com/cvjena/freestylo/refs/heads/main/example_config.json
wget https://raw.githubusercontent.com/cvjena/freestylo/refs/heads/main/test/documents/chiasmustext.txt
mkdir -p test/documents && mv chiasmustext.txt test/documents/

freestylo --input test/documents/chiasmustext.txt \
    --output ./output.json \
    --config example_config.json

The example_config.json and chiasmustext.txt are contained in this repository. Either download them manually or just clone the repository and run the command from the project root folder.

This creates the file output.json in the current directory (the root of the repository, if you cloned it). It contains the stylistic devices detected in the text file test/documents/chiasmustext.txt. Afterwards, run the following command to get an overview of the results:

freestylo --mode report --data output.json
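
If the tool is driven from a script, the two calls above can be wrapped as argument lists. The following is a minimal sketch using only the standard library; the helper names `annotate_command`, `report_command`, and `run` are hypothetical, and only the flags shown in the commands above are used.

```python
import subprocess


def annotate_command(input_path, output_path, config_path):
    """Build the annotate call shown above as an argument list."""
    return [
        "freestylo",
        "--input", str(input_path),
        "--output", str(output_path),
        "--config", str(config_path),
    ]


def report_command(data_path):
    """Build the report call shown above as an argument list."""
    return ["freestylo", "--mode", "report", "--data", str(data_path)]


def run(command):
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(command, check=True)
```

With the package installed, `run(annotate_command("test/documents/chiasmustext.txt", "output.json", "example_config.json"))` followed by `run(report_command("output.json"))` would reproduce the two steps.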

The package can be used both as a library and as a stand-alone command-line tool. In both cases, the results can be saved in a JSON file, which contains the complete tokenized text. When using the functions from the library, the result is a Python container with a structure similar to the JSON file.

To only download the models without running any annotation, use the following command:

freestylo --mode download_models

The standalone version can be configured using a simple JSON configuration file. The file should specify the language of the text and the stylistic devices to detect. The following is an example configuration file:

{
    "language": "de",
    "annotations": {
        "chiasmus": {
            "window_size": 30,
            "allowlist": ["NOUN", "VERB", "ADJ", "ADV"],
            "denylist": [],
            "model": "/chiasmus_de.pkl"
        }
    }
}
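
When configurations are generated by a script, for example to try several window sizes, the same structure can be written from Python. This is a minimal sketch using only the standard library; `make_chiasmus_config` and `write_config` are hypothetical helpers, and the default values mirror the example above.

```python
import json


def make_chiasmus_config(language, model, window_size=30,
                         allowlist=("NOUN", "VERB", "ADJ", "ADV"), denylist=()):
    """Hypothetical helper: build a config dict with a single chiasmus block."""
    return {
        "language": language,
        "annotations": {
            "chiasmus": {
                "window_size": window_size,
                "allowlist": list(allowlist),
                "denylist": list(denylist),
                "model": model,
            }
        },
    }


def write_config(config, path):
    """Write the config as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4)
```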

The output for the standalone tool is structured as follows:

'text': self.text,
'tokens': self.tokens,
'pos': self.pos,
'lemmas': self.lemmas,
'dep': self.dep,
'token_offsets': self.token_offsets,
'annotations': annotations

First, it contains a complete copy of the text in the text field. It then contains the preprocessed lists of tokens, POS tags, lemmas, dependency labels and token offsets. Finally, it contains an annotations field with the detected stylistic devices. This field is a dictionary with one entry for each stylistic device detector that was applied to the text. The exact information in the annotations depends on the detector, but generally they contain the ids of the tokens that are part of the detected devices, meaning their index in the tokens list (and correspondingly in the pos, lemmas, etc. lists), as well as a score that indicates the likelihood of the detected device being a true positive.
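
Because token ids are indices into the per-token lists, they can be resolved with plain list lookups. The sketch below assumes only the top-level keys listed above; the layout of each annotation entry is detector-specific and not assumed here, so the ids are passed in explicitly. The function names are hypothetical.

```python
import json

PER_TOKEN_FIELDS = ("tokens", "pos", "lemmas", "dep", "token_offsets")


def load_output(path):
    """Load a JSON file written by the tool or by text.serialize(...)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def check_alignment(document):
    """Return True if all per-token lists have the same length."""
    lengths = {len(document[field]) for field in PER_TOKEN_FIELDS}
    return len(lengths) == 1


def describe_tokens(document, token_ids):
    """Look up token, POS tag and lemma for a list of token indices."""
    return [
        (document["tokens"][i], document["pos"][i], document["lemmas"][i])
        for i in token_ids
    ]
```

For a real result, `document = load_output("output.json")` gives the dictionary to pass to the other two functions.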

Library

The library comprises a collection of functions to detect the stylistic devices, as well as preprocessing based on spaCy. Should you want to use different preprocessing or use the package with a different language than the supported ones, a TextObject can be created and filled with the needed manually computed contents. The stylistic device detectors can then be applied to the TextObject.

The tests folder contains a test for every stylistic device detector. These tests show how to use the different detectors and how to create a TextObject. All classes and functions are documented by docstrings.

A typical example looks like this:

from freestylo import TextObject as to
from freestylo import TextPreprocessor as tp
from freestylo import ChiasmusAnnotation as ca
from freestylo import MetaphorAnnotation as ma

# first, create a TextObject from the raw text
# the language should match the models loaded below (German models here)
text = to.TextObject(
    # put the path to your text file here
    textfile="example_textfile.txt",
    language="de")

# create a TextPreprocessor object and process the text
# this does the tokenizing, lemmatizing, POS-tagging, etc.
preprocessor = tp.TextPreprocessor(language="de")
preprocessor.process_text(text)

# you can also use a different preprocessing of your choice
# without the TextPreprocessor object
# just fill the TextObject with the needed contents
# those could be provided e.g. by spaCy, nltk, cltk,
# or any other method of your choice

# many digital corpora are already tokenized and POS-tagged
# they may come in various formats, such as TEI XML, CoNLL, etc.
# if you have a text in those formats, you can fill the TextObject
# with the needed contents
# you can then fill the missing values in the TextObject
# with e.g. word vectors or other features created with a method of your choice.

# you can now add various annotations to the text object
# here, we add a chiasmus annotation
chiasmus = ca.ChiasmusAnnotation(text=text)
chiasmus.allowlist = ["NOUN", "VERB", "ADJ", "ADV"]
chiasmus.find_candidates()
chiasmus.load_classification_model("chiasmus_de.pkl")
chiasmus.score_candidates()

# here, we add a metaphor annotation
metaphor = ma.MetaphorAnnotation(text=text)
metaphor.find_candidates()
metaphor.load_model("metaphor_de.torch")
metaphor.score_candidates()

# finally, save the annotated text to a json file
text.serialize("annotated_text.json")

The file test/test_external_source.py shows an example of using the library without the text preprocessor. Instead, the TextObject is filled by hand with the needed contents.

Currently supported stylistic devices are:

Alliteration
Chiasmus
Epiphora
Metaphor
Polysyndeton

Please find an overview of the detectors and their methods in the documentation.

Create your own detectors!

The package is designed to be easily extendable with your own stylistic device detectors. The src folder contains example scripts that show how you can retrain the models for the existing chiasmus and metaphor detectors. You can also create your own stylistic device detectors by referring to the existing ones. The Alliteration Detector in particular provides a very simple example that can be used as a template for your own detectors. If you create your own detectors and want to contribute them, pull requests are very welcome!

Participation

The package is free and open-source software and contributions are very welcome. It is designed to be a living project that is constantly improved and extended. If you have implemented your own stylistic device detector, please consider contributing it to the package. For details please refer to the contribution guidelines. Also, if you have any suggestions for improvements or if you find any bugs, please open an issue on the GitHub page.

FreeStylo Configuration (`.json`) — Parameters & Example

This file controls annotate mode (freestylo --mode annotate). Top-level keys configure language and NLP settings; the annotations object enables individual detectors and passes their parameters.

Top-Level Keys

| Key | Type | Required | Values / Notes |
| --- | --- | --- | --- |
| `language` | string | yes | `"en"`, `"de"`, or `"mgh"`. Selects the preprocessor: `en` → spaCy `en_core_web_lg`; `de` → spaCy `de_core_news_lg`; `mgh` → custom Middle High German pipeline. |
| `nlp_max_length` | integer | no | Overrides spaCy `nlp.max_length` for long texts (ignored for the MHG pipeline). |
| `annotations` | object (dict) | yes | Keys are annotation names. Only listed annotations are run. See per-annotation blocks below. |

`annotations` Block

Add one object per enabled annotation. Supported keys:

1) `chiasmus`

| Key | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `window_size` | integer | yes | 30 | Search window (in tokens) used to find candidates. |
| `allowlist` | array | yes | `[]` | POS tags allowed (e.g. `["NOUN","VERB","ADJ","ADV"]`). If non-empty, only these POS can anchor candidates. |
| `denylist` | array | yes | `[]` | POS tags to exclude. Ignored if empty. |
| `model` | string | yes | (none) | Model filename or path (e.g. `"chiasmus_de.pkl"`). Resolved via `Configs.get_model_path` (downloads to `~/.freestylo/models/` if missing). |
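
The interplay of `allowlist` and `denylist` can be illustrated with a small filter. This is a sketch of the rules as stated in the table, not the detector's actual code; the function names are hypothetical.

```python
def pos_is_eligible(pos_tag, allowlist, denylist):
    """Apply the allowlist/denylist rules from the table to one POS tag."""
    # A non-empty allowlist restricts candidates to the listed tags.
    if allowlist and pos_tag not in allowlist:
        return False
    # A non-empty denylist excludes the listed tags; an empty one is ignored.
    if denylist and pos_tag in denylist:
        return False
    return True


def eligible_indices(pos_tags, allowlist, denylist):
    """Return the indices of tokens whose POS tag may anchor a candidate."""
    return [
        index for index, tag in enumerate(pos_tags)
        if pos_is_eligible(tag, allowlist, denylist)
    ]
```

For example, with the tags `["DET", "NOUN", "VERB", "PUNCT"]` and the allowlist `["NOUN", "VERB"]`, only indices 1 and 2 remain.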

2) `metaphor`

| Key | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `model` | string | yes | (none) | Torch checkpoint filename/path (e.g. `"metaphor_de.torch"`, `"metaphor_en.torch"`, `"metaphor_mgh.torch"`). Resolved and downloaded if needed. |

3) `epiphora`

| Key | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `min_length` | integer | yes | 2 | Minimum number of repeated phrase endings. |
| `conj` | array | yes | `["and","or","but","nor"]` | Conjunctions used to segment phrases. |
| `punct_pos` | string | yes | `"PUNCT"` | POS tag string treated as punctuation. |

4) `polysyndeton`

| Key | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `min_length` | integer | yes | 2 | Minimum number of consecutive phrases starting with the same conjunction. |
| `conj` | array | yes | `["and","or","but","nor"]` | Conjunction lexicon. |
| `sentence_end_tokens` | array | yes | `[".", "?", "!", ":", ";", "..."]` | Tokens that terminate a sentence during phrase splitting. |
| `punct_pos` | string | yes | `"PUNCT"` | POS tag string treated as punctuation. |

5) `alliteration`

| Key | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| `max_skip` | integer | yes | 2 | Max token gap allowed between consecutive hits (punctuation can extend the effective gap internally). |
| `min_length` | integer | yes | 3 | Minimum tokens participating in the run. |
| `ignore_tokens` | array | yes | `[]` | Exact tokens to ignore during matching (e.g. stopwords or domain-specific noise). |
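
To make the three alliteration parameters concrete, here is a toy detector. It is an illustration only and not the package's implementation: it assumes that a "hit" is a token sharing its first letter with the start of the run, it measures the gap in raw token positions, and it does not model the punctuation handling mentioned in the table. The name `find_alliterations` is hypothetical.

```python
def find_alliterations(tokens, max_skip=2, min_length=3, ignore_tokens=()):
    """Toy detector: return lists of token indices that share an initial letter."""
    ignored = set(ignore_tokens)
    # Keep only the indices of tokens that start with a letter and are not ignored.
    candidates = [
        i for i, tok in enumerate(tokens)
        if tok and tok not in ignored and tok[0].isalpha()
    ]
    runs = []
    used = set()
    for position, start in enumerate(candidates):
        if start in used:
            continue
        letter = tokens[start][0].lower()
        run = [start]
        for idx in candidates[position + 1:]:
            # Number of tokens lying between the last hit and this candidate.
            gap = idx - run[-1] - 1
            if gap > max_skip:
                break
            if tokens[idx][0].lower() == letter:
                run.append(idx)
        # Only runs with enough participating tokens are reported.
        if len(run) >= min_length:
            runs.append(run)
            used.update(run)
    return runs
```

With `max_skip=0`, only directly adjacent tokens can extend a run; larger values let a run bridge short function words such as "a" or "of".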

Outputs: All annotations append themselves to the TextObject and are written into the final JSON produced by --output via text.serialize(...).

Example

{
  "language": "de",
  "nlp_max_length": 3000000,
  "annotations": {
    "chiasmus": {
      "window_size": 30,
      "allowlist": ["NOUN", "VERB", "ADJ", "ADV"],
      "denylist": [],
      "model": "chiasmus_de.pkl"
    },
    "metaphor": {
      "model": "metaphor_de.torch"
    },
    "epiphora": {
      "min_length": 2,
      "conj": ["und", "oder", "aber", "noch"],
      "punct_pos": "PUNCT"
    },
    "polysyndeton": {
      "min_length": 2,
      "conj": ["und", "oder", "aber", "noch"],
      "sentence_end_tokens": [".", "?", "!", ":", ";", "..."],
      "punct_pos": "PUNCT"
    },
    "alliteration": {
      "max_skip": 2,
      "min_length": 3,
      "ignore_tokens": ["–", "—", "„", "“"]
    }
  }
}
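
Before running the tool, a configuration can be checked against the tables above. The following is a minimal sketch: the required keys and language codes are copied from this section, while `validate_config` itself is a hypothetical helper and not part of the package.

```python
REQUIRED_KEYS = {
    "chiasmus": {"window_size", "allowlist", "denylist", "model"},
    "metaphor": {"model"},
    "epiphora": {"min_length", "conj", "punct_pos"},
    "polysyndeton": {"min_length", "conj", "sentence_end_tokens", "punct_pos"},
    "alliteration": {"max_skip", "min_length", "ignore_tokens"},
}
SUPPORTED_LANGUAGES = {"en", "de", "mgh"}


def validate_config(config):
    """Return a list of problems; an empty list means the config looks complete."""
    problems = []
    if config.get("language") not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported or missing language: {config.get('language')!r}")
    annotations = config.get("annotations")
    if not isinstance(annotations, dict):
        problems.append("missing 'annotations' object")
        return problems
    for name, params in annotations.items():
        if name not in REQUIRED_KEYS:
            problems.append(f"unknown annotation: {name}")
            continue
        # Report every required key that the block does not provide.
        for key in sorted(REQUIRED_KEYS[name] - set(params)):
            problems.append(f"{name}: missing required key '{key}'")
    return problems
```

Loading the file with `json.load` and passing the result to `validate_config` gives a quick completeness check; it does not verify value types or that model files exist.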

Project details

Maintainer: sart0r

Release history

0.9.0 (this version): Jan 11, 2026
0.8.8: Dec 13, 2025
0.8.4: Sep 1, 2025
0.8.3: May 8, 2025
0.8.2: May 8, 2025
0.8.0: May 8, 2025
0.7.1: May 7, 2025
0.6.0: May 5, 2025
0.5.0: Oct 27, 2024
0.4.0: Oct 26, 2024
0.3.0: Oct 26, 2024
0.2.0: Oct 26, 2024
0.1.0: Oct 24, 2024
0.0.1: Oct 23, 2024
