Automated text analysis with networks
Reason this release was yanked: declares wrong dependency
Project description
This is a Python implementation of Chris Bail’s textnets package for R. It is free software under the terms of the GNU General Public License v3.
The underlying idea behind textnets is presented in this paper:
Christopher A. Bail, “Combining natural language processing and network analysis to examine how advocacy organizations stimulate conversation on social media,” Proceedings of the National Academy of Sciences of the United States of America 113, no. 42 (2016), 11823–11828, doi:10.1073/pnas.1607151113.
Features
The library builds on spaCy for natural-language processing and igraph for network analysis. It uses the Leiden algorithm for community detection, which can be applied directly to the bipartite (word–group) network.
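from textnets import Corpus, Textnet
# Build a corpus from the text files matching the pattern
c = Corpus.from_files('~/nltk_data/corpora/state_union/*.txt')
# Construct the bipartite network from the corpus's noun phrases
tn = Textnet(c.noun_phrases())
g_bipartite = tn.graph
# Store each node's community assignment as a vertex attribute
g_bipartite.vs['cluster'] = tn.clusters.membership
# One-mode projections onto documents and onto terms
g_groups = tn.project(node_type='doc')
g_words = tn.project(node_type='term')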
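To make the projection step more concrete, here is a minimal, dependency-free sketch of a two-mode (document–term) network and its projection onto documents. It is an illustration only, not the package's implementation: the toy corpus and the helper names `bipartite_edges` and `project_onto_docs` are hypothetical, and edges are weighted by raw counts of shared terms, which is a simplification that may differ from how textnets weights them.

```python
from collections import defaultdict
from itertools import combinations

# Toy corpus: document label -> terms (hypothetical data, for illustration only).
docs = {
    "speech_a": ["economy", "jobs", "taxes"],
    "speech_b": ["economy", "jobs", "health care"],
    "speech_c": ["health care", "schools"],
}


def bipartite_edges(docs):
    """Return the (document, term) pairs that form the two-mode network."""
    return sorted({(doc, term) for doc, terms in docs.items() for term in terms})


def project_onto_docs(docs):
    """Link two documents by the number of distinct terms they share."""
    term_to_docs = defaultdict(set)
    for doc, terms in docs.items():
        for term in terms:
            term_to_docs[term].add(doc)

    weights = defaultdict(int)
    for sharing in term_to_docs.values():
        # Every pair of documents that uses this term gains one unit of weight.
        for a, b in combinations(sorted(sharing), 2):
            weights[(a, b)] += 1
    return dict(weights)


edges = bipartite_edges(docs)
doc_network = project_onto_docs(docs)
print(doc_network)
# {('speech_a', 'speech_b'): 2, ('speech_b', 'speech_c'): 1}
```

Projecting onto terms instead works the same way with the roles of documents and terms swapped.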
In addition to providing a Python library, textnets can also be used as a command-line tool to generate network graphs from text corpora.
$ textnets --lex noun_phrases --node-type groups ~/nltk_data/corpora/state_union | gzip > sotu_groups.graphmlz
Run textnets --help for usage instructions.
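The file written by the command above can be inspected from Python. The following sketch assumes the tool emits standard GraphML, so that the piped output is a gzip-compressed GraphML file; the helper name `summarize_graphmlz` is hypothetical and uses only the standard library.

```python
import gzip
import xml.etree.ElementTree as ET

# XML namespace used by standard GraphML documents.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def summarize_graphmlz(path):
    """Return (node count, edge count) for a gzip-compressed GraphML file."""
    with gzip.open(path, "rb") as handle:
        root = ET.parse(handle).getroot()
    nodes = root.findall(f".//{GRAPHML_NS}node")
    edges = root.findall(f".//{GRAPHML_NS}edge")
    return len(nodes), len(edges)
```

For example, `summarize_graphmlz("sotu_groups.graphmlz")` would report the size of the network generated above. To work with the graph itself rather than just count its elements, igraph's `Graph.Read_GraphMLz` should be able to load the same file.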
Installing
In a virtual environment, run python setup.py install followed by python -m spacy download en_core_web_sm.
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
Download files
Hashes for textnets-0.3.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 397f7ed1c94e3f729fff7df8bbc842ae8570f408febc6e9ca8482d2de51a5b3a
MD5 | 5421d054add9229994d0937928f7dea0
BLAKE2b-256 | 1e3cf270433bd98d471fc200e2f89766ab5241d05fc7ae9496d76965b50afca7