A tool for quantitatively measuring discursive similarity between bodies of text.

Quantitative Discursive Analysis (QDA)

(C) 2019 Mark M. Bailey, PhD

About

Quantitative Discursive Analysis (QDA) converts bodies of text into graph objects built from noun phrases. Each noun or modifier becomes a vertex, and edges are determined by how nouns and modifiers are linked within phrases. A noun that is more central to the overall content of the text receives a higher centrality score. Because the graph captures how terms relate to one another rather than only how often they occur, it is more robust than simple keyword frequencies.
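As a rough illustration of the idea, the sketch below builds an undirected graph from noun phrases that have already been extracted. It is not QDA's own construction code: the function name phrases_to_graph is hypothetical, and the rule that consecutive words within a phrase are linked is an assumption made for this example.

```python
def phrases_to_graph(phrases):
    """Build an undirected graph as a dict mapping each word to its neighbours.

    Assumption for this sketch: words that sit next to each other inside a
    noun phrase are linked by an edge.
    """
    adjacency = {}
    for phrase in phrases:
        words = phrase.lower().split()
        # Every noun or modifier becomes a vertex, even if it stands alone.
        for word in words:
            adjacency.setdefault(word, set())
        # Link each word to the next word in the same phrase.
        for left, right in zip(words, words[1:]):
            if left != right:
                adjacency[left].add(right)
                adjacency[right].add(left)
    return adjacency


graph = phrases_to_graph(["foreign policy", "economic policy", "policy"])
print(sorted(graph["policy"]))
```

In this toy input, "policy" is linked to both modifiers, so it sits between them in the graph, which is the kind of position a centrality measure rewards.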

QDA compares the discursive content of two texts by calculating their resonance: the cosine similarity of the betweenness-centrality vectors, taken over the vertices that appear in both graphs. Values fall in [0, 1], where 0 indicates no overlap and 1 indicates perfect overlap.
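The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation behind QDA.resonate: the function name resonance is hypothetical, it expects centrality scores as dicts keyed by vertex, and returning 0.0 when there are no shared vertices or a zero-length vector is an assumption of this sketch.

```python
import math


def resonance(centrality_a, centrality_b):
    """Cosine similarity of two centrality dicts over their shared vertices."""
    shared = sorted(set(centrality_a) & set(centrality_b))
    if not shared:
        return 0.0  # assumption: no shared vertices means no resonance
    vec_a = [centrality_a[v] for v in shared]
    vec_b = [centrality_b[v] for v in shared]
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm = math.sqrt(sum(a * a for a in vec_a)) * math.sqrt(sum(b * b for b in vec_b))
    # Assumption: a zero vector (all shared vertices have centrality 0) gives 0.0.
    return dot / norm if norm else 0.0


text_a = {"policy": 0.5, "economy": 0.5, "trade": 0.1}
text_b = {"policy": 0.5, "economy": 0.5, "music": 0.3}
print(resonance(text_a, text_b))
```

Because betweenness centrality is never negative, the cosine of two such vectors cannot drop below 0. Note that only shared vertices enter the calculation, so under this definition "trade" and "music" do not affect the score in the example.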

Installation

Install from the root of a source checkout with:

pip install .

Dependencies

  • Python 3.10+
  • networkx
  • numpy
  • textblob

Important: TextBlob corpora required for default extractor

The default noun phrase extraction method (textblob) requires the TextBlob/NLTK corpora. Download them with:

python -m textblob.download_corpora

If the corpora are unavailable, use the fallback extractor (noun_extractor="simple") shown in the Quickstart.

Quickstart

Default extractor (textblob)

import QDA

text_a = "This is a string of text about politics and economics."
text_b = "This is a different string of text about music and art."

g1 = QDA.discursive_object(text_a)  # noun_extractor='textblob' by default
g2 = QDA.discursive_object(text_b)

print(QDA.resonate(g1, g2))

Fallback extractor (simple, no corpora required)

import QDA

text_a = "This is a string of text about politics and economics."
text_b = "This is a different string of text about music and art."

g1 = QDA.discursive_object(text_a, noun_extractor="simple")
g2 = QDA.discursive_object(text_b, noun_extractor="simple")

print(QDA.resonate(g1, g2))
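If the same script has to run on machines with and without the corpora, the two extractors can be tried in order. The helper below is a hypothetical sketch, not part of QDA: build_with_fallback is an invented name, and because the exception raised for missing corpora is not documented here, it catches Exception broadly, which is an assumption you may want to narrow.

```python
def build_with_fallback(build, text):
    """Try the default extractor first, then fall back to the simple one.

    `build` is any callable with the signature build(text, noun_extractor=...),
    for example QDA.discursive_object.
    """
    try:
        return build(text, noun_extractor="textblob")
    except Exception:
        # Assumption: any failure here is treated as "corpora unavailable".
        return build(text, noun_extractor="simple")
```

With QDA installed, a call would look like build_with_fallback(QDA.discursive_object, text_a). Keep in mind that graphs built with different extractors may not be comparable, so it is safer to build every text in a comparison with the same extractor.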

API summary

  • QDA.discursive_object(text, noun_extractor="textblob")
  • QDA.resonate(g1, g2)
  • QDA.resonate_as_series(G_list)
  • QDA.resonate_as_matrix(G_list)
  • QDA.discursive_community(G_list)

Development

Run tests with:

pytest

Notes and limitations

  • Large texts may be slow because betweenness centrality is computationally expensive.
  • Results depend on the noun phrase extraction method (textblob vs. simple).
  • The simple extractor is a compatibility fallback and may produce noun phrases of different quality than TextBlob.
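To show where the cost of betweenness centrality comes from, here is a compact sketch of Brandes' algorithm for an undirected, unweighted graph stored as a dict of neighbour sets. It is an illustration only and is not claimed to be the routine QDA calls; the function name betweenness and the graph representation are choices made for this example.

```python
from collections import deque


def betweenness(adjacency):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # One breadth-first search per vertex: this outer loop is the cost.
        order = []
        preds = {v: [] for v in adjacency}
        paths = {v: 0 for v in adjacency}   # number of shortest paths from source
        dist = {v: -1 for v in adjacency}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in score.items()}


chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(betweenness(chain))
```

Every vertex triggers a full traversal of the graph, so the running time grows with the number of vertices times the number of edges. That is why long texts, which produce many vertices, can be slow.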

Changelog

  • 0.1.0
    • Added Python 3.10+ packaging support and pytest test suite.
    • Added explicit TextBlob missing-corpora error messaging and optional simple extractor.
    • Added NetworkX compatibility shim for NumPy graph conversion.
    • Improved performance of matrix resonance and graph construction helpers.
    • Added GitHub Actions CI for Python 3.10/3.11/3.12.
