A package for text processing.

Subtext 2

A package for Natural Language Processing (NLP). This includes minor functions for processing text, as well as machine learning algorithms to perform an in-depth analysis.

Subtext 2 introduces more advanced tools for analysis. Because the package now focuses on deploying such tools, the earlier functions are now listed under the miscellaneous section, Subtext 1.

As of now, my development plan is in shambles and the only "advanced" algorithm you can currently access is SentimentAnalyser. But the analyser is quite good so I hope you can forgive me for that.

Install

You can install this package through PyPI,

pip install subtext

or, if you were nice enough to have this installed on your device already, you can upgrade the package using

pip install --upgrade subtext

and import using

import subtext

Sentiment Analyser

Overview

The SentimentAnalyser class is designed to perform sentiment analysis on text data using n-grams. It allows users to input sentences with their respective sentiment scores, calculate average scores for each n-gram, and analyze the sentiment of new sentences based on the stored n-grams.

Class methods

  • __init__(self): Initializes the SentimentAnalyser object.
  • generate_ngrams(self, sentence, n): Generates n-grams from a given sentence.
  • add_sentences(self, sentences, scores, n_grams=1): Adds a list of sentences and their respective sentiment scores to the analyser.
  • calculate_average_scores(self): Calculates the average sentiment score for each n-gram in the analyser.
  • analyse(self, sentence, n_grams=1, detailed_view=False): Analyzes the sentiment of a given sentence based on the stored n-grams. When detailed_view is True, the user can also see the workings behind the analysis.
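The idea behind these methods can be illustrated with a minimal sketch. It is not the package's implementation: the class NgramSentimentSketch and the helper make_ngrams are hypothetical names, and the sketch assumes that a sentence's score is the plain mean of the average scores of its known n-grams, with 0.0 returned when none are known.

```python
from collections import defaultdict


def make_ngrams(sentence, n):
    """Split on whitespace and return consecutive n-word tuples."""
    words = sentence.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


class NgramSentimentSketch:
    """Hypothetical stand-in, not the package's SentimentAnalyser."""

    def __init__(self):
        self.scores = defaultdict(list)  # n-gram -> every score seen with it
        self.averages = {}               # n-gram -> mean of those scores

    def add_sentences(self, sentences, scores, n_grams=1):
        # Each n-gram inherits the score of the sentence it appears in.
        for sentence, score in zip(sentences, scores):
            for gram in make_ngrams(sentence, n_grams):
                self.scores[gram].append(score)

    def calculate_average_scores(self):
        self.averages = {g: sum(v) / len(v) for g, v in self.scores.items()}

    def analyse(self, sentence, n_grams=1):
        # Assumption: unknown n-grams are ignored rather than counted as 0.
        known = [self.averages[g] for g in make_ngrams(sentence, n_grams)
                 if g in self.averages]
        return sum(known) / len(known) if known else 0.0


sketch = NgramSentimentSketch()
sketch.add_sentences(["I love this movie.", "I hate this movie."],
                     [0.8, -0.8], n_grams=2)
sketch.calculate_average_scores()
result = sketch.analyse("I love this film", n_grams=2)
print(result)
```

In this sketch the bigram ("this", "movie.") appears in both training sentences, so its average is 0.0, while ("I", "love") keeps the score 0.8. Whether the real analyser weights or normalises n-grams differently is not stated here.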

Example Usage

from subtext import SentimentAnalyser

analyser = SentimentAnalyser()

# Add sentences and their respective scores
sentences = ["I love this movie.", "I hate this movie."]
scores = [0.8, -0.8]
analyser.add_sentences(sentences, scores, n_grams=2)

# Analyze the sentiment of a sentence
sentence = "I love this movie, but I hate the ending."
sentiment_score = analyser.analyse(sentence, n_grams=2)
print(sentiment_score)

# Analyze the sentiment of a sentence with detailed_view
sentiment_score_detailed = analyser.analyse(sentence, n_grams=2, detailed_view=True)
print(sentiment_score_detailed)

n_grams(sentence, n)

Generates n-grams from a given sentence.

Parameters:

  • sentence (str): The input sentence.
  • n (int): The length of the n-grams to generate.

Returns: A list of n-grams (list of lists of strings).

Example Usage

from subtext import n_grams

# Generate n-grams from a sentence
sentence = "I love this movie."
ngrams = n_grams(sentence, 2) # this would make bigrams
print(ngrams)

Output:

[['I', 'love'], ['love', 'this'], ['this', 'movie.']]
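For readers curious how such output can be produced, here is a minimal sketch. The name ngrams_sketch is hypothetical, and the sketch assumes words are separated by whitespace only, which is consistent with the trailing 'movie.' in the output above but is not confirmed as the package's actual method.

```python
def ngrams_sketch(sentence, n):
    """Return every run of n consecutive whitespace-separated words."""
    words = sentence.split()
    # A sentence with fewer than n words yields an empty list.
    return [words[i:i + n] for i in range(len(words) - n + 1)]


bigrams = ngrams_sketch("I love this movie.", 2)
print(bigrams)
```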

Subtext 1

These are miscellaneous functions that were developed during the initial release of Subtext.

Predict

A function that predicts the next n words that follow a given phrase in a string.

Parameters

The function's parameters are:

subtext.predict(string, phrase, n=0, case_insensitive=False)

  • string: The main text to search.
  • phrase: The key phrase (prompt). The function tries to predict what comes after this phrase.
  • n: The number of words to return per prediction. It defaults to 0, which returns all predictions regardless of their word counts.
  • case_insensitive: Set this to True to match the phrase regardless of case.

Actual usage

So, let's try to use this.

string = "I am a string. I am also a human being, but most importantly, I am a string."
print(subtext.predict(string, "I am", n=1))

This would output

{'a': 2, 'also': 1}

But, if you change the n value,

print(subtext.predict(string, "I am", n=2))

It would output

{'a string.': 2, 'also a': 1}
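One way to arrive at counts like these is to slide over the words and tally what follows each occurrence of the phrase. The sketch below is an illustration only: predict_sketch is a hypothetical name, it assumes whitespace tokenisation, and it handles only n of 1 or more because the behaviour of n=0 is not specified in enough detail to reproduce.

```python
from collections import Counter


def predict_sketch(string, phrase, n=1, case_insensitive=False):
    """Count the n-word continuations that follow each occurrence of phrase."""
    if n < 1:
        raise ValueError("this sketch only handles n >= 1")
    words = string.split()
    key = phrase.split()
    if not key:
        return {}
    # Compare lowercased words when asked to, but keep the original
    # spelling in the returned continuations.
    norm = str.lower if case_insensitive else (lambda w: w)
    target = [norm(w) for w in key]
    counts = Counter()
    for i in range(len(words) - len(key) - n + 1):
        window = [norm(w) for w in words[i:i + len(key)]]
        if window == target:
            start = i + len(key)
            counts[" ".join(words[start:start + n])] += 1
    # Most frequent continuation first.
    return dict(counts.most_common())


text = "I am a string. I am also a human being, but most importantly, I am a string."
one_word = predict_sketch(text, "I am", n=1)
two_words = predict_sketch(text, "I am", n=2)
print(one_word)
print(two_words)
```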

Identify Syllables

subtext.syllables("carbonmonoxide")

This outputs:

car-bon-mon-ox-ide

Note that this only works with lowercase strings.

Countwords

Parameters

The function's parameters are:

subtext.countwords(string, case_insensitive=False)

Set case_insensitive to True if you want the count to be case-insensitive.

Actual usage

Get yourself a nice string:

string = "Sometimes I wonder, 'Am I stupid?' then I realize, yeah. yeah, I am stupid."

Then put it in the function:

x = subtext.countwords(string)
print(x)

It should print:

{'I': 4, 'Sometimes': 1, 'wonder,': 1, "'Am": 1, "stupid?'": 1, 'then': 1, 'realize,': 1, 'yeah.': 1, 'yeah,': 1, 'am': 1, 'stupid.': 1}
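The result above is consistent with splitting on whitespace and sorting by frequency, which can be sketched with the standard library's collections.Counter. This is an assumption about the behaviour, not the package's code, and countwords_sketch is a hypothetical name.

```python
from collections import Counter


def countwords_sketch(string, case_insensitive=False):
    """Count whitespace-separated words, most frequent first."""
    if case_insensitive:
        string = string.lower()
    # most_common() sorts by count and keeps first-seen order for ties.
    return dict(Counter(string.split()).most_common())


sample = "Sometimes I wonder, 'Am I stupid?' then I realize, yeah. yeah, I am stupid."
word_counts = countwords_sketch(sample)
print(word_counts)
```

Punctuation stays attached to the words, which is why 'yeah.' and 'yeah,' are counted separately.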

Matchingwords

A function that finds and counts the matching words in two strings.

Actual usage

So in this case, our strings are:

string1, string2 = "God, I love drawing, drawing is my favourite thing to do", "God, I hate drawing, drawing is my least favourite thing to do"

If we run this through matchingwords, we would get:

{'God,': 1, 'I': 1, 'drawing,': 1, 'drawing': 1, 'is': 1, 'my': 1, 'favourite': 1, 'thing': 1, 'to': 1, 'do': 1}
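A result of this shape can be reproduced with a multiset intersection. The sketch below is hypothetical (matchingwords_sketch is not the package's function) and assumes that a word shared by both strings is counted as many times as it appears in the string where it is rarer.

```python
from collections import Counter


def matchingwords_sketch(string1, string2):
    """Return the words present in both strings with their shared counts."""
    # Counter & Counter keeps each common word with the smaller of its two counts.
    shared = Counter(string1.split()) & Counter(string2.split())
    return dict(shared)


string1 = "God, I love drawing, drawing is my favourite thing to do"
string2 = "God, I hate drawing, drawing is my least favourite thing to do"
matches = matchingwords_sketch(string1, string2)
print(matches)
```

Because splitting is on whitespace only, 'drawing,' and 'drawing' are treated as different words, as in the output above.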
