Comprehensive Python package for stylometric analysis

pystylometry

Requires Python 3.11+. License: MIT.

Stylometric analysis and authorship attribution for Python. 50+ metrics across 11 modules, from vocabulary diversity to AI-generation detection.

Install

pip install pystylometry                  # Core (lexical metrics)
pip install "pystylometry[all]"           # Everything (quotes keep shells such as zsh from expanding the brackets)
# Individual extras
pip install "pystylometry[readability]"   # Readability formulas (pronouncing, spaCy)
pip install "pystylometry[syntactic]"     # POS/parse analysis (spaCy)
pip install "pystylometry[authorship]"    # Attribution methods
pip install "pystylometry[ngrams]"        # N-gram entropy
pip install "pystylometry[viz]"           # Matplotlib visualizations

Usage

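from pathlib import Path

from pystylometry.lexical import compute_mtld, compute_yule
from pystylometry.readability import compute_flesch

# Any plain-text document works; the file name here is only an example.
text = Path("manuscript.txt").read_text(encoding="utf-8")

result = compute_mtld(text)
print(result.mtld_average)       # e.g. 72.4

result = compute_flesch(text)
print(result.reading_ease)       # e.g. 65.2
print(result.grade_level)        # e.g. 8.1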

Every function returns a typed dataclass with the score, components, and metadata -- never a bare float.
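
The result-object pattern described above can be sketched as follows. This is a minimal illustration of the idea, not pystylometry's implementation: `TTRResult`, `compute_ttr_sketch`, and the field names are hypothetical, and the whitespace tokenizer is an assumption made to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TTRResult:
    """Hypothetical result type: score, components, and metadata together."""
    ttr: float                  # the headline score (types / tokens)
    unique_tokens: int          # component: number of distinct tokens
    total_tokens: int           # component: number of tokens overall
    metadata: dict = field(default_factory=dict)


def compute_ttr_sketch(text: str) -> TTRResult:
    """Type-token ratio over lowercase, whitespace-split tokens (assumed tokenizer)."""
    tokens = text.lower().split()
    total = len(tokens)
    unique = len(set(tokens))
    ttr = unique / total if total else 0.0
    return TTRResult(
        ttr=ttr,
        unique_tokens=unique,
        total_tokens=total,
        metadata={"tokenizer": "whitespace"},
    )


result = compute_ttr_sketch("the cat and the dog")
print(result.ttr, result.unique_tokens, result.total_tokens)
```

Returning a dataclass rather than a float lets callers inspect how a score was produced and keeps type checkers aware of each field.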

Unified API

from pystylometry import analyze

results = analyze(text, lexical=True, readability=True, syntactic=True)

Style Drift Detection

Detect authorship changes, spliced content, and AI-generated text within a single document.

from pystylometry.consistency import compute_kilgarriff_drift

result = compute_kilgarriff_drift(document)
print(result.pattern)             # "sudden_spike"
print(result.pattern_confidence)  # 0.71
print(result.max_location)        # Window 23 -- the splice point
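
For intuition, sliding-window chi-squared drift can be sketched in a few lines of plain Python. This is a rough illustration of the general technique (Kilgarriff's chi-squared applied to consecutive windows), not the library's code: the function names, the regex tokenizer, and the `top_n` cutoff are assumptions, and pystylometry's own windowing and pattern classification may differ.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; a deliberately simple, assumed tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def chi_squared(a: list[str], b: list[str], top_n: int = 500) -> float:
    """Chi-squared over the top_n most frequent words of the two samples combined."""
    if not a or not b:
        return 0.0
    count_a, count_b = Counter(a), Counter(b)
    joint = count_a + count_b
    total = len(a) + len(b)
    score = 0.0
    for word, count in joint.most_common(top_n):
        # Expected counts if both samples were drawn from one shared distribution.
        expected_a = count * len(a) / total
        expected_b = count * len(b) / total
        score += (count_a[word] - expected_a) ** 2 / expected_a
        score += (count_b[word] - expected_b) ** 2 / expected_b
    return score


def drift_scores(text: str, window_size: int = 500, stride: int = 250) -> list[float]:
    """Score each pair of consecutive full-size windows; higher means more change."""
    tokens = tokenize(text)
    starts = range(0, max(len(tokens) - window_size + 1, 0), stride)
    windows = [tokens[s:s + window_size] for s in starts]
    return [chi_squared(w1, w2) for w1, w2 in zip(windows, windows[1:])]


# Toy document: the vocabulary changes completely halfway through.
spliced = "the cat sat " * 100 + "a dog ran " * 100
scores = drift_scores(spliced, window_size=60, stride=60)
print(scores.index(max(scores)))  # → 4, the window pair that straddles the splice
```

A single sharp peak in the scores corresponds to the kind of sudden change a splice produces; a slow rise would instead suggest gradual drift.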

CLI

pystylometry-drift manuscript.txt --window-size=500 --stride=250
pystylometry-viewer report.html

Modules

| Module | Metrics | Description |
|---|---|---|
| lexical | TTR, MTLD, Yule's K/I, Hapax, MATTR, VocD-D, HD-D, MSTTR, function words, word frequency | Vocabulary diversity and richness |
| readability | Flesch, Flesch-Kincaid, SMOG, Gunning Fog, Coleman-Liau, ARI, Dale-Chall, Fry, FORCAST, Linsear Write, Powers-Sumner-Kearl | Grade-level and difficulty scoring |
| syntactic | POS ratios, sentence types, parse tree depth, clausal density, passive voice, T-units, dependency distance | Sentence and parse structure (requires spaCy) |
| authorship | Burrows' Delta, Cosine Delta, Zeta, Kilgarriff chi-squared, MinMax, John's Delta, NCD | Author attribution and text comparison |
| stylistic | Contractions, hedges, intensifiers, modals, punctuation, vocabulary overlap (Jaccard/Dice/Cosine/KL), cohesion, genre/register | Style markers and text similarity |
| character | Letter frequencies, digit/uppercase ratios, special characters, whitespace | Character-level fingerprinting |
| ngrams | Word/character/POS n-grams, Shannon entropy, skipgrams | N-gram profiles and entropy |
| dialect | British/American classification, spelling/grammar/vocabulary markers, markedness | Regional dialect detection |
| consistency | Sliding-window chi-squared drift, pattern classification | Intra-document style analysis |
| prosody | Syllable stress, rhythm regularity | Prose rhythm (requires spaCy) |
| viz | Timeline, scatter, report (PNG + interactive HTML) | Drift detection visualization |
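
As one concrete instance of the readability metrics listed above, the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas depend on only three counts. The sketch below takes those counts as inputs, so it sidesteps syllable counting entirely; the function names are hypothetical and this is not how pystylometry exposes these metrics.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# A passage with 100 words, 5 sentences, and 150 syllables.
print(round(flesch_reading_ease(100, 5, 150), 3))   # → 59.635
print(round(flesch_kincaid_grade(100, 5, 150), 2))  # → 9.91
```

In practice most of the difficulty lies in counting syllables and splitting sentences reliably, which is presumably why the readability extra pulls in additional dependencies.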

Development

git clone https://github.com/craigtrim/pystylometry && cd pystylometry
pip install -e ".[dev,all]"
make test       # 1022 tests
make lint       # ruff + mypy
make all        # lint + test + build

License

MIT

Author

Craig Trim -- craigtrim@gmail.com

Download files

Source Distribution

pystylometry-1.3.0.tar.gz (207.6 kB)

Uploaded Source

Built Distribution

pystylometry-1.3.0-py3-none-any.whl (251.5 kB)

Uploaded Python 3

File details

Details for the file pystylometry-1.3.0.tar.gz.

File metadata

  • Download URL: pystylometry-1.3.0.tar.gz
  • Size: 207.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for pystylometry-1.3.0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | f86d2502fb4794688b4be89d492184c750fd263f41699053347c894c28e7d1e6 |
| MD5 | 93e811d60019ccba55f0a67353daa412 |
| BLAKE2b-256 | 22e5bebc0a137f8e468d9d8c0f15d417624e3aa08d7230e93e06b32c23479c77 |


File details

Details for the file pystylometry-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: pystylometry-1.3.0-py3-none-any.whl
  • Size: 251.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for pystylometry-1.3.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c0f01605047e8bd85d55fac782f808d00ee2f36870d69189f8965e3ea9416bc |
| MD5 | 928d3b81925786f1415c931a5e2a4d07 |
| BLAKE2b-256 | fab0c62cfbdb471e4300e40c15a925c13b083bbde4e9018b4a8eb321c206cda6 |

