Pure-Python text analysis: readability, vocabulary richness, sentiment, n-grams

Project description

textstat

Text analysis for Python: readability scores, vocabulary statistics, sentiment, and n-grams, with no dependencies.

pip install textstat-py

Usage

from textstat import analyze, flesch_reading_ease, grade_level_consensus

# Read the file with an explicit encoding and close it when done
with open("essay.txt", encoding="utf-8") as f:
    text = f.read()

print(flesch_reading_ease(text))    # 68.4
print(grade_level_consensus(text))  # 9.2

stats = analyze(text)
# stats is a flat dict with everything:
# reading_time_min, sentiment_label, vocabulary_richness, sentence_stats, ...

CLI

textstat document.txt
cat file.txt | textstat
textstat --json report.txt

Functions

Readability

  • flesch_reading_ease(text) — 0–100
  • flesch_kincaid_grade(text) — US grade level
  • gunning_fog(text) — years of education
  • coleman_liau_index(text)
  • automated_readability_index(text)
  • smog_index(text)
  • grade_level_consensus(text) — average across all grade metrics
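
For readers unfamiliar with these scores, here is a minimal sketch of the published Flesch Reading Ease and Flesch-Kincaid grade formulas. It is not textstat's implementation: the names `count_syllables` and `flesch_sketch` are hypothetical, and the vowel-group syllable counter is a rough assumption, so results will likely differ from the package's output.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_sketch(text: str) -> Tuple[float, float]:
    """Return (reading ease, Flesch-Kincaid grade) using the standard formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0, 0.0
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return ease, grade


ease, grade = flesch_sketch("The cat sat. The dog ran.")
print(round(ease, 2), round(grade, 2))
```

Higher ease means easier text; very short, monosyllabic sentences like the one above can push the raw formula past 100.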

Vocabulary

  • lexical_diversity(text) — type-token ratio
  • mattr(text, window=100) — moving-average TTR
  • herdan_c(text), yule_k(text)
  • hapax_legomena_ratio(text) — fraction of words appearing once
  • vocabulary_richness(text) — all of the above as a dict
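
The vocabulary measures above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: the helper names are hypothetical, tokens are lowercased letter runs, and the hapax ratio is assumed to be divided by the total token count (the package may divide by distinct words instead).

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Assumed tokenizer: lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def ttr(tokens: List[str]) -> float:
    """Type-token ratio: distinct words divided by total words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mattr_sketch(tokens: List[str], window: int = 100) -> float:
    """Moving-average TTR: mean TTR over every full window of `window` tokens."""
    if len(tokens) <= window:
        return ttr(tokens)  # fall back to plain TTR for short texts
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def hapax_ratio_sketch(tokens: List[str]) -> float:
    """Words that occur exactly once, as a fraction of all tokens (assumption)."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(1 for c in counts.values() if c == 1) / len(tokens)


tokens = tokenize("a b a c")
print(ttr(tokens), hapax_ratio_sketch(tokens))  # 0.75 0.5
```

Plain TTR falls as texts get longer, which is why a windowed average is often preferred when comparing documents of different lengths.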

Counts & structure

  • count_words(text), count_sentences(text), count_paragraphs(text)
  • reading_time(text, wpm=200)
  • sentence_stats(text), paragraph_stats(text)
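
As a rough idea of what the counting helpers compute, the sketch below estimates reading time as word count divided by words per minute and counts paragraphs as blocks separated by blank lines. Both function names and both definitions are assumptions for illustration; the package may tokenize or split differently.

```python
import re


def reading_time_sketch(text: str, wpm: int = 200) -> float:
    """Estimated reading time in minutes: whitespace-separated words / wpm."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return len(text.split()) / wpm


def count_paragraphs_sketch(text: str) -> int:
    """Count non-empty blocks separated by one or more blank lines."""
    return len([p for p in re.split(r"\n\s*\n", text) if p.strip()])


print(reading_time_sketch("word " * 400))      # 2.0
print(count_paragraphs_sketch("a\n\nb\n\n\nc"))  # 3
```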

Sentiment

  • sentiment_polarity(text) — −1 to +1
  • sentiment_label(text) — "positive" / "neutral" / "negative"
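
This page does not say how the sentiment score is computed. One common dependency-free approach is a word-list lookup, sketched below. Everything here is hypothetical: the tiny `POSITIVE` and `NEGATIVE` sets, the function names, and the `threshold` cutoff are illustrative assumptions, not the package's lexicon or logic.

```python
import re

# Hypothetical toy lexicon, for illustration only
POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "terrible", "awful", "sad", "hate"}


def polarity_sketch(text: str) -> float:
    """Score in [-1, 1]: (positive hits - negative hits) / total hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def label_sketch(text: str, threshold: float = 0.05) -> str:
    """Map the score to a label using an assumed symmetric threshold."""
    score = polarity_sketch(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


print(polarity_sketch("I love this great movie"), label_sketch("good but bad"))
# 1.0 neutral
```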

N-grams

  • top_ngrams(text, n=2, k=10)
  • ngram_diversity(text, n=2)
  • ngram_stats(text)
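
A minimal sketch of n-gram counting with `collections.Counter` may help clarify what these functions measure. The names ending in `_sketch` are hypothetical, and defining diversity as distinct n-grams divided by total n-grams is an assumption about `ngram_diversity`, not something this page confirms.

```python
import re
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int = 2) -> List[Tuple[str, ...]]:
    """Slide a window of size n over the tokens."""
    return list(zip(*(tokens[i:] for i in range(n))))


def top_ngrams_sketch(text: str, n: int = 2, k: int = 10):
    """Return the k most frequent n-grams with their counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(ngrams(tokens, n)).most_common(k)


def ngram_diversity_sketch(text: str, n: int = 2) -> float:
    """Distinct n-grams divided by total n-grams (assumed definition)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = ngrams(tokens, n)
    return len(set(grams)) / len(grams) if grams else 0.0


print(top_ngrams_sketch("to be or not to be", k=1))  # [(('to', 'be'), 2)]
print(ngram_diversity_sketch("to be or not to be"))  # 0.8
```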

Misc

  • top_words(text, n=10)
  • word_frequency_distribution(text)
  • text_density(text)

Requirements

Python 3.8+

License

MIT

Download files

Source Distribution

textstat_py-0.1.1.tar.gz (15.6 kB view details)

Uploaded Source

Built Distribution

textstat_py-0.1.1-py3-none-any.whl (8.9 kB view details)

Uploaded Python 3

File details

Details for the file textstat_py-0.1.1.tar.gz.

File metadata

  • Download URL: textstat_py-0.1.1.tar.gz
  • Size: 15.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for textstat_py-0.1.1.tar.gz
Algorithm Hash digest
SHA256 9d9011d1cd1cf6f016c5baa14ac582a1829bd0c88a870e7daaff1262c5a7fa79
MD5 bae923f98c2a79e2fb3a6e2732564939
BLAKE2b-256 d598540bada82a706e8a3042ea19383ec53e1aac0da6523c3655cc64c79e63c2

File details

Details for the file textstat_py-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: textstat_py-0.1.1-py3-none-any.whl
  • Size: 8.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for textstat_py-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 37761844616480e944c9e6f0a1581815d32e0fc701857d1509ae0f5c6d8c9512
MD5 1daac5bb6543979070d8443546d1b9b5
BLAKE2b-256 4f5888697d1978848b2b6553092bb2af3e0a8db97401c53213b75e737a4f2c30
