Dendrogram Prototypical Discourse generator

Project description

Dendrogram Prototypical Discourse Analysis

According to [Harris, 1954] and [Rubenstein and Goodenough, 1965], words in natural languages are structured within linguistic environments (e.g., sentences, paragraphs), and within these environments, words with similar meanings tend to share similar contexts. This assumption, known as the Distributional Hypothesis, suggests that a corpus is often made up of several discursive contexts, each one being a set of extended linguistic environments conveying similar or related concepts and topics. Although this theory emerged in linguistics in 1954, it has recently received increasing attention in many other fields, such as cognitive science (e.g., [McDonald and Ramscar, 2001]) and natural language processing (e.g., [Mikolov et al., 2013a]).

This hypothesis is the founding principle of our approach. Our method aims at modeling a large corpus as a set of so-called DP-discourses, and then studying them as prototypical speeches. The core step consists in building clusters of words that share similar discursive contexts. This was achieved using word embedding and subspace clustering, but other data-mining techniques could be used. The words within each cluster were then represented as Dendrogram Prototypical Discourses (DP-discourses), using a hierarchical clustering algorithm. Finally, DP-discourses proved comprehensible enough to be studied using Charaudeau's methodology, and they could possibly be analyzed using other discourse analysis approaches.
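The hierarchical step described above can be illustrated with a minimal sketch. It is not the DPD package's API: the toy two-dimensional vectors, the function names, and the choice of cosine distance with average linkage are all assumptions made for illustration, and the source does not state which linkage or distance the method uses.

```python
import math
from itertools import combinations

# Hypothetical toy "embeddings"; a real run would use vectors from a trained model.
TOY_VECTORS = {
    "tax": (0.9, 0.1),
    "budget": (0.8, 0.2),
    "deficit": (0.85, 0.05),
    "school": (0.1, 0.9),
    "teacher": (0.2, 0.8),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norms


def leaves(tree):
    """List the words under a dendrogram node (a word or a pair of nodes)."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def average_linkage(a, b, vectors):
    """Mean pairwise cosine distance between the words of two nodes."""
    pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
    return sum(cosine_distance(vectors[x], vectors[y]) for x, y in pairs) / len(pairs)


def build_dendrogram(vectors):
    """Agglomerative clustering: repeatedly merge the two closest nodes.

    Returns the dendrogram as nested 2-tuples whose leaves are words.
    """
    if not vectors:
        raise ValueError("at least one word vector is required")
    clusters = sorted(vectors)
    while len(clusters) > 1:
        # Pick the pair of current nodes with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], vectors),
        )
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


if __name__ == "__main__":
    print(build_dendrogram(TOY_VECTORS))
```

On these toy vectors, the top-level split of the resulting tree separates the education words ("school", "teacher") from the finance words ("tax", "budget", "deficit"). Reading such a tree from the leaves upward is the kind of structure that a DP-discourse exposes for interpretation.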

Installation

The easiest way to install the generator is with pip, the package installer for Python. Run the command:

pip install DPD
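To confirm that the installation succeeded, the installed version can be queried from Python. This is a small sketch using the standard-library importlib.metadata module (Python 3.8 or later); the helper name installed_version is hypothetical, and only the distribution name DPD comes from the command above.

```python
from importlib import metadata


def installed_version(dist_name="DPD"):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("DPD is not installed in this environment")
    else:
        print("DPD version:", version)
```

The check looks up the distribution name rather than importing a module, so it makes no assumption about the package's import name.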

Tutorial

See the Jupyter notebook tutorials/tutorial1.ipynb for a basic usage example.

License

This project is licensed under the GNU General Public License (Version 3, 29 June 2007).
