Basic computational linguistics and natural language processing in Python
text_analytics
pip install textanalytics
pip install git+https://github.com/jonathandunn/text_analytics.git
This package provides code to support introductory courses in computational linguistics or natural language processing. These courses are available for free on edX:
Introduction to Text Analytics and Natural Language Processing with Python
Visualizing Text Analytics and Natural Language Processing with Python
Usage
from text_analytics import TextAnalytics
ai = TextAnalytics()
Getting features
style, vocab_size = ai.get_features(df, features="style")
style = Function word n-grams
sentiment = Positive and negative words
content = Top content words with TF-IDF weighting, PMI for finding phrases, stop words removed
constructions = A bag-of-constructions syntactic representation
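To make the TF-IDF weighting behind the content features more concrete, here is a minimal sketch in plain Python. It is an illustration only, not the package's implementation: the helper name `tfidf`, the toy documents, whitespace tokenization, and the specific formula (raw term frequency times log(N / document frequency)) are all assumptions.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {word: weight} dict per document (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each word
    df = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return weights

# Toy corpus, made up for illustration
docs = ["the cat sat on the mat", "the dog chased the cat", "the bird sang"]
weights = tfidf(docs)
```

A word that occurs in every document (here "the") gets a weight of zero, while words limited to a single document receive the largest weights.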
Using a classifier
ai.shallow_classification(df, label, features="style", cv=False, classifier='svm')
ai.mlp(df, label, features="style", validation_set=False, test_size=0.10)
Unsupervised methods
Topic Models
ai.train_lda(df, n_topics, min_count)
topic_df = ai.use_lda(df, labels="Author")
Vector Semantics
ai.train_word2vec(file, min_count, workers)
Document and Word Clusters
cluster_df = ai.cluster(x, y=None, k=k)
Nearest document searches
y_sample, y_closest = ai.linguistic_distance(x, y, sample=1, n=3)
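The idea of a nearest document search can be sketched without the package. The example below assumes documents are already represented as feature vectors and ranks neighbours by cosine distance. The distance measure actually used by `linguistic_distance` is not stated here, so cosine distance, the helper names `cosine_distance` and `nearest`, and the toy vectors are assumptions.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(i * j for i, j in zip(a, b))
    norm_a = math.sqrt(sum(i * i for i in a))
    norm_b = math.sqrt(sum(j * j for j in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(x, query_index, n=3):
    """Return the indices of the n rows of x closest to x[query_index]."""
    distances = [
        (cosine_distance(x[query_index], row), i)
        for i, row in enumerate(x)
        if i != query_index  # skip the query document itself
    ]
    return [i for _, i in sorted(distances)[:n]]

# Toy feature vectors, one row per document (made up for illustration)
x = [[1.0, 0.0, 2.0], [1.0, 0.1, 2.1], [0.0, 3.0, 0.0], [2.0, 0.0, 4.0]]
closest = nearest(x, query_index=0, n=2)
```

Cosine distance ignores vector length, so a document with the same feature proportions as the query counts as very close even if it is much longer.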
Corpus Descriptions
PMI-based Phrases
ai.fit_phrases(df)
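For readers new to PMI (pointwise mutual information), the following sketch scores adjacent word pairs in plain Python. It is not the code behind `fit_phrases`: the helper name `bigram_pmi`, whitespace tokenization, the toy sentences, and the use of log base 2 are assumptions made for illustration.

```python
import math
from collections import Counter

def bigram_pmi(sentences, min_count=1):
    """Return {(w1, w2): PMI} for adjacent word pairs (hypothetical helper)."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # drop pairs that are too rare to score reliably
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores

# Toy corpus, made up for illustration
sentences = [
    "new york is a big city",
    "i love new york",
    "a big dog ran",
    "the city is new",
]
scores = bigram_pmi(sentences)
```

PMI tends to reward rare pairs, so a frequency threshold such as `min_count` is commonly applied before ranking candidate phrases.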
Delta P-based Phrases
association_df = ai.get_association(df, min_count=1, save_phraser=True)
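Delta P is a directional association measure: for a pair (w1, w2), the left-to-right value is P(w2 | w1) - P(w2 | not w1), and the right-to-left value swaps the roles of the two words. The sketch below computes both directions from bigram counts. It is an illustration under assumptions, not the code behind `get_association`; the helper name `delta_p`, the tokenization, and the toy sentences are hypothetical.

```python
from collections import Counter

def delta_p(sentences):
    """Return {(w1, w2): (left_to_right, right_to_left)} (hypothetical helper)."""
    bigrams = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        bigrams.update(zip(tokens, tokens[1:]))
    total = sum(bigrams.values())
    first = Counter()   # how often each word appears in first position
    second = Counter()  # how often each word appears in second position
    for (w1, w2), count in bigrams.items():
        first[w1] += count
        second[w2] += count
    scores = {}
    for (w1, w2), a in bigrams.items():
        b = first[w1] - a      # w1 followed by a word other than w2
        c = second[w2] - a     # w2 preceded by a word other than w1
        d = total - a - b - c  # pairs containing neither w1 first nor w2 second
        # Guard against empty denominators in very small corpora
        lr = a / (a + b) - (c / (c + d) if c + d else 0.0)
        rl = a / (a + c) - (b / (b + d) if b + d else 0.0)
        scores[(w1, w2)] = (lr, rl)
    return scores

# Toy corpus, made up for illustration
sentences = [
    "new york is a big city",
    "i love new york",
    "a big dog ran",
    "the city is new",
]
scores = delta_p(sentences)
```

Unlike PMI, the two directions can differ: a word may strongly predict what follows it without being strongly predicted by it.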
Basic word frequencies
vocab = ai._get_vocab_list(df, min_count, return_freq=True)
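The role of a `min_count` threshold in a frequency list can be sketched with the standard library alone. The helper `vocab_list` below is hypothetical and only mirrors the parameter names above for readability; the input format and return type of the package's own method are not documented here, so both are assumptions.

```python
from collections import Counter

def vocab_list(texts, min_count=1, return_freq=False):
    """Count words across texts and keep those seen at least min_count times."""
    counts = Counter(word for text in texts for word in text.lower().split())
    kept = {word: count for word, count in counts.items() if count >= min_count}
    if return_freq:
        return kept      # word -> frequency
    return sorted(kept)  # alphabetical list of the surviving words

# Toy corpus, made up for illustration
texts = ["the cat sat", "the dog sat", "a bird"]
frequencies = vocab_list(texts, min_count=2, return_freq=True)
words = vocab_list(texts, min_count=2)
```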
Corpus Comparisons
similarity = ai.get_corpus_similarity(df1, df2)