Dendrogram Prototypical Discourse generator
Project description
Dendrogram Prototypical Discourse Analysis
According to [Harris, 1954] and [Rubenstein and Goodenough, 1965], words in natural languages are structured within linguistic environments (e.g., sentences, paragraphs), and words with similar meanings tend to share similar contexts. This assumption, known as the Distributional Hypothesis, suggests that a corpus often consists of several discursive contexts, each one being a set of extended linguistic environments conveying similar or related concepts and topics. Although this theory emerged in linguistics in 1954, it has recently received increasing attention in other fields, such as cognitive science (e.g., [McDonald and Ramscar, 2001]) and natural language processing (e.g., [Mikolov et al., 2013a]). This hypothesis is the founding principle of our approach. Our method aims to model a large corpus as a set of so-called DP-discourses and then to study them as prototypical speeches. The core step consists in building clusters of words that share similar discursive contexts. This was achieved using word embedding and subspace clustering, but other data-mining techniques could be used. Then, the words within each cluster were represented as Dendrogram Prototypical Discourses (DP-discourses) using a hierarchical clustering algorithm. Finally, the DP-discourses proved comprehensible enough to be studied using Charaudeau’s methodology, and they could possibly be analyzed using other discourse analysis approaches.
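The hierarchical step described above can be illustrated with a minimal, standard-library-only sketch. It is not the DPD package's API or the authors' implementation: the hand-written 3-dimensional vectors in `TOY_VECTORS` stand in for learned word embeddings, the subspace clustering stage is skipped entirely, and the function names (`cosine_distance`, `leaves`, `average_linkage`, `build_dendrogram`) are hypothetical. Average-linkage agglomerative clustering over cosine distance is assumed here as one possible hierarchical clustering algorithm.

```python
import math
from itertools import combinations

# Hypothetical toy "embeddings": hand-written vectors, not learned from a corpus.
TOY_VECTORS = {
    "tax": (0.9, 0.1, 0.0),
    "budget": (0.8, 0.2, 0.0),
    "deficit": (0.7, 0.3, 0.1),
    "school": (0.1, 0.9, 0.1),
    "teacher": (0.0, 0.8, 0.2),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def leaves(tree):
    """List the words under a node (a node is a word or a pair of subtrees)."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def average_linkage(a, b, vectors):
    """Mean pairwise cosine distance between the words of two nodes."""
    pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
    total = sum(cosine_distance(vectors[x], vectors[y]) for x, y in pairs)
    return total / len(pairs)


def build_dendrogram(vectors):
    """Repeatedly merge the two closest nodes until a single tree remains."""
    nodes = sorted(vectors)  # start with one leaf per word
    while len(nodes) > 1:
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda ij: average_linkage(nodes[ij[0]], nodes[ij[1]], vectors),
        )
        merged = (nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


if __name__ == "__main__":
    # Nested tuples: words merged early sit deeper in the tree.
    print(build_dendrogram(TOY_VECTORS))
```

The result is a nested tuple whose innermost pairs are the words that were merged first, which gives a rough reading order for a cluster of words. A real pipeline would replace `TOY_VECTORS` with embeddings trained on the corpus and would run this step separately on each word cluster.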
Installation
The easiest way to install the generator is with pip, the package installer for Python. Run the command:
pip install DPD
Tutorial
See the Jupyter notebook tutorials/tutorial1.ipynb for a basic usage illustration.
License
This project is licensed under the GNU General Public License (Version 3, 29 June 2007).
File details
Details for the file DPD-0.0.3.tar.gz.
File metadata
- Download URL: DPD-0.0.3.tar.gz
- Upload date:
- Size: 4.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.44.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8b4b7a8924dff04b8866615c2518ff28edc88024a33d5c54ec755c8da04c0a39
MD5 | 8fe2019a5b46bdd3c0df1455673e1f6a
BLAKE2b-256 | 942a002db3483f6ca189e3f178d64c09df9c1df663901157ff932b2d47420d24
File details
Details for the file DPD-0.0.3-py3-none-any.whl.
File metadata
- Download URL: DPD-0.0.3-py3-none-any.whl
- Upload date:
- Size: 16.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.44.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 91a9c746d9e02a3cdd5ec058c26ec3f71748403755eaecf43df382235ba2ecd3
MD5 | 17cb0a472c097e6805c6f274d24ab866
BLAKE2b-256 | afcf34bf43bd5fa3092ffc879e80b9944b0a2a13984b10681a30feda4c94d6d0