Pure-Python text analysis: readability, vocabulary richness, sentiment, n-grams

textstat

Text analysis for Python: readability scores, vocabulary statistics, sentiment, and n-grams, with no dependencies.

pip install textstat-py

Usage

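from pathlib import Path

from textstat import analyze, flesch_reading_ease, grade_level_consensus

# Read the whole file as UTF-8; read_text() also closes the file handle.
text = Path("essay.txt").read_text(encoding="utf-8")

print(flesch_reading_ease(text))    # e.g. 68.4
print(grade_level_consensus(text))  # e.g. 9.2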

stats = analyze(text)
# stats is a flat dict with everything:
# reading_time_min, sentiment_label, vocabulary_richness, sentence_stats, ...

CLI

textstat document.txt
cat file.txt | textstat
textstat --json report.txt

Functions

Readability

  • flesch_reading_ease(text) — 0–100
  • flesch_kincaid_grade(text) — US grade level
  • gunning_fog(text) — years of education
  • coleman_liau_index(text)
  • automated_readability_index(text)
  • smog_index(text)
  • grade_level_consensus(text) — average across all grade metrics
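
For readers unfamiliar with these scores, here is a minimal standalone sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade formulas. It is not textstat's implementation: the function names, the tokenizer, and the vowel-group syllable heuristic are all assumptions made for illustration, so results can differ from what the library returns.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, at least one."""
    lower = word.lower()
    count = len(re.findall(r"[aeiouy]+", lower))
    # Treat a trailing "e" as silent ("make"), but keep "-le" endings ("table").
    if lower.endswith("e") and not lower.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def _ratios(text: str):
    """Return (words per sentence, syllables per word), or None for empty text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return None
    syllables = sum(count_syllables(w) for w in words)
    return len(words) / len(sentences), syllables / len(words)


def flesch_reading_ease_sketch(text: str) -> float:
    """Standard Flesch formula: higher scores mean easier text."""
    ratios = _ratios(text)
    if ratios is None:
        return 0.0
    words_per_sentence, syllables_per_word = ratios
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_kincaid_grade_sketch(text: str) -> float:
    """Standard Flesch-Kincaid formula: approximate US grade level."""
    ratios = _ratios(text)
    if ratios is None:
        return 0.0
    words_per_sentence, syllables_per_word = ratios
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


print(flesch_reading_ease_sketch("The cat sat on the mat."))
```

Very short, simple sentences can score above 100 with the raw formula; whether the library clamps to the 0–100 range is not something this sketch assumes.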

Vocabulary

  • lexical_diversity(text) — type-token ratio
  • mattr(text, window=100) — moving-average TTR
  • herdan_c(text), yule_k(text)
  • hapax_legomena_ratio(text) — fraction of words appearing once
  • vocabulary_richness(text) — all of the above as a dict
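
To make the vocabulary measures above concrete, the following sketch computes a type-token ratio, a moving-average TTR, and a hapax ratio from scratch. The function names and the regex tokenizer are hypothetical, and dividing the hapax count by the total number of tokens (rather than by distinct words) is an assumption; the library may define these differently.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase word tokenizer (an assumption, not the library's tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens: list) -> float:
    """Distinct words divided by total words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mattr_sketch(tokens: list, window: int = 100) -> float:
    """Average the TTR over every sliding window of `window` tokens."""
    if not tokens:
        return 0.0
    if len(tokens) <= window:
        # Text shorter than one window: fall back to the plain TTR.
        return type_token_ratio(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def hapax_ratio_sketch(tokens: list) -> float:
    """Words that occur exactly once, as a fraction of all tokens."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(1 for c in counts.values() if c == 1) / len(tokens)


tokens = tokenize("the cat and the dog and the bird")
print(type_token_ratio(tokens))        # 5 distinct words out of 8
print(mattr_sketch(tokens, window=4))
print(hapax_ratio_sketch(tokens))      # "cat", "dog", "bird" occur once
```

A moving-average TTR is less sensitive to text length than the plain ratio, which tends to fall as a document grows.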

Counts & structure

  • count_words(text), count_sentences(text), count_paragraphs(text)
  • reading_time(text, wpm=200)
  • sentence_stats(text), paragraph_stats(text)
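
As an illustration of what the counting helpers involve, here is a small sketch of word counting, paragraph counting, and a reading-time estimate. The names are hypothetical, and both the splitting rules and the choice of minutes as the unit are assumptions rather than the library's documented behavior.

```python
import re


def count_words_sketch(text: str) -> int:
    """Count runs of word characters."""
    return len(re.findall(r"\b\w+\b", text))


def count_paragraphs_sketch(text: str) -> int:
    """Treat one or more blank lines as a paragraph separator."""
    return len([p for p in re.split(r"\n\s*\n", text) if p.strip()])


def reading_time_minutes(text: str, wpm: int = 200) -> float:
    """Estimated reading time in minutes at `wpm` words per minute."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return count_words_sketch(text) / wpm


sample = "word " * 400
print(reading_time_minutes(sample))           # 400 words at 200 wpm
print(reading_time_minutes(sample, wpm=100))
```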

Sentiment

  • sentiment_polarity(text) — −1 to +1
  • sentiment_label(text) — "positive" / "neutral" / "negative"
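
This README does not say how sentiment is computed. One common dependency-free approach is a word-list lookup, sketched below purely for illustration. The tiny lexicons, the scoring rule, the 0.1 threshold, and the function names are all assumptions and should not be read as the library's method.

```python
import re

# Toy lexicons (hypothetical); a real word list would be far larger.
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "sad"}


def polarity_sketch(text: str) -> float:
    """Score in [-1, 1]: (positive hits - negative hits) / total hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos + neg == 0:
        return 0.0  # no sentiment-bearing words found
    return (pos - neg) / (pos + neg)


def label_sketch(text: str, threshold: float = 0.1) -> str:
    """Map the polarity score to one of three labels."""
    score = polarity_sketch(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


print(label_sketch("I love this great movie"))
print(label_sketch("The table is brown"))
```

A lookup like this ignores negation ("not good") and context, so it is only a rough baseline.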

N-grams

  • top_ngrams(text, n=2, k=10)
  • ngram_diversity(text, n=2)
  • ngram_stats(text)
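
The n-gram helpers can be pictured with a short sketch built on collections.Counter. The function names, the tokenizer, and the definition of diversity as distinct n-grams divided by total n-grams are assumptions for illustration; the library's own return types and definitions may differ.

```python
import re
from collections import Counter


def ngrams(tokens: list, n: int = 2) -> list:
    """All contiguous runs of n tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_sketch(text: str, n: int = 2, k: int = 10) -> list:
    """The k most frequent n-grams with their counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(ngrams(tokens, n)).most_common(k)


def ngram_diversity_sketch(text: str, n: int = 2) -> float:
    """Distinct n-grams divided by total n-grams (0.0 for very short text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = ngrams(tokens, n)
    return len(set(grams)) / len(grams) if grams else 0.0


text = "the cat sat on the cat mat"
print(top_ngrams_sketch(text, n=2, k=1))   # [(('the', 'cat'), 2)]
print(ngram_diversity_sketch(text, n=2))   # 5 distinct bigrams out of 6
```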

Misc

  • top_words(text, n=10)
  • word_frequency_distribution(text)
  • text_density(text)

Requirements

Python 3.8+

License

MIT
