CIDER Python Package
CIDER (Context Informed Dictionary and sEntiment Reasoner) is a Python library used to improve domain-specific sentiment analysis.
It generates, filters, and substitutes polarities into VADER. The method used to generate polarities follows SocialSent.
Installation
Before you begin, ensure you have met the following requirements:
- You have installed Python 3.7 or later.
- You have a Windows/Linux/Mac machine.
To install CIDER, run:
pip install ciderpolarity
Overview
The easiest way to use the package is as follows:
from ciderpolarity import CIDER

# For a running example, the ideal input will have many thousands of lines.
texts = ['Really hate this heat. Just want AC',
         'I love an icecream in this heat!',
         'I’m melting - terrible weather!',
         'Very dehydrated in this heat',
         ...,
         'this sunny weather is great',
         'Oh my icecream is melting',
         'My AC is broken! 🥵']
output_folder = '/path/to/output/folder/'

# Pass the texts defined above, together with the output location.
cdr = CIDER(texts, output_folder)
results = cdr.fit_transform()
This trains the model, creating a customised VADER classifier, before classifying the provided input using the model. An example output is as follows:
results = [
    ['Really hate this heat. Just want AC', {"neg":0.6, "neu":0.4, "pos":0.0, "compound":-0.6}],
    ['I love an icecream in this heat!', {"neg":0.0, "neu":0.5, "pos":0.5, "compound":0.6}],
    ['I’m melting - terrible weather!', {"neg":0.7, "neu":0.3, "pos":0.0, "compound":-0.7}],
    ['Very dehydrated in this heat', {"neg":0.5, "neu":0.4, "pos":0.0, "compound":-0.5}],
    ...
    ['this sunny weather is great', {"neg":0.0, "neu":0.2, "pos":0.8, "compound":0.7}],
    ['Oh my icecream is melting', {"neg":0.3, "neu":0.4, "pos":0.3, "compound":0.0}],
    ['My AC is broken! 🥵', {"neg":0.6, "neu":0.4, "pos":0.0, "compound":-0.6}],
]
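To reduce each scores dictionary to a single label, one option is to threshold the compound value. The sketch below assumes the ±0.05 cut-offs conventionally used with VADER; the `label` helper is hypothetical and not part of CIDER, and the entries are copied from the example output above.

```python
# Hypothetical helper, not part of ciderpolarity: map a compound score to a label.
def label(scores, threshold=0.05):
    compound = scores["compound"]
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# A few [text, scores] pairs in the shape shown in the example output.
results = [
    ['Really hate this heat. Just want AC', {"neg": 0.6, "neu": 0.4, "pos": 0.0, "compound": -0.6}],
    ['I love an icecream in this heat!', {"neg": 0.0, "neu": 0.5, "pos": 0.5, "compound": 0.6}],
    ['Oh my icecream is melting', {"neg": 0.3, "neu": 0.4, "pos": 0.3, "compound": 0.0}],
]

# Pair every text with its label.
labels = [(text, label(scores)) for text, scores in results]
for text, sentiment in labels:
    print(f"{sentiment:>8}  {text}")
```

The threshold is a parameter so it can be tightened or loosened for a given domain.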
Examples
Some alternative ways to use the library are as follows:
Applying CIDER to a saved dataset, adding custom seed words, custom stopwords, and tuning various parameters:
from ciderpolarity import CIDER

POS_seeds = {'lovely':1, 'excellent':2, 'fortunate':4, 'excited':1, 'loves':2, '♥':1, '🙂':2}
NEG_seeds = {'bad':1, 'horrible':2, 'hate':4, 'crappy':1, 'sad':2, 'bitch':1, 'hates':2}

input_file = '/path/to/input/file.csv'
output = '/path/to/output/test_outputs/'

cdr_example = CIDER(input_file,                    # input path (one-column csv file where each row is a text entry)
                    output,                        # output path
                    iterations=100,                # number of iterations for bootstrapped label propagation
                    stopwords=['i', 'it', 'the'],  # custom stopwords; alternatively set to 'default' for the nltk set
                    keep=['code', 'python'],       # words to force into the final lexicon
                    no_below=5,                    # exclude words that occur fewer times than this
                    max_polarities_returned=3000,  # maximum number of words returned
                    pos_seeds=POS_seeds,           # positive seeds with custom weighting
                    neg_seeds=NEG_seeds,           # negative seeds with custom weighting
                    verbose=False)                 # whether to print progress or not
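The input path above points to a one-column CSV file with one text per row. A minimal sketch of writing such a file with the standard library follows. The file name and temporary directory are illustrative, and since the expected header handling is not stated here, the sketch assumes no header row.

```python
import csv
import os
import tempfile

texts = [
    "Really hate this heat. Just want AC",
    "I love an icecream in this heat!",
    "this sunny weather is great",
]

# Illustrative location only; replace with your own path in practice.
input_file = os.path.join(tempfile.mkdtemp(), "texts.csv")

# newline="" lets the csv module control line endings; utf-8 keeps emoji intact.
with open(input_file, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for text in texts:
        writer.writerow([text])  # one column, one text per row (assumes no header)

# Read the file back to confirm the layout.
with open(input_file, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))
```

Using `csv.writer` rather than joining lines by hand means texts containing commas or quotes are escaped correctly.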
If the model only requires training, the following can be executed:
cdr_example.fit()
And the resulting polarities (before filtering and scaling) can be viewed:
Generating Seedwords
Whilst CIDER has built-in seed words, custom seed words can also be generated and suggested from the data. The following shows how this is carried out:
Pos, Neg = cdr_example.generate_seeds(['good', 'brilliant', 'love'], ['bad', 'terrible', 'hate'], n=20, sentiment=True)
This looks for strongly polarised words that occur often, are close to one seed set, and are distant from the opposing seed set.
The following returns all words in the data, alongside their seed word suitability:
df = cdr_example.generate_seeds(['good', 'brilliant', 'love'], ['bad', 'terrible', 'hate'], return_all=True, sentiment=True)
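The selection criteria described above (frequent words that sit close to one seed set and far from the other) can be illustrated with a toy scoring function. This is only a sketch and not CIDER's implementation: the two-dimensional vectors, the counts, the log frequency weighting, and the `seed_suitability` helper are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Element-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def seed_suitability(word, vectors, counts, pos_seeds, neg_seeds):
    """Hypothetical score: positive means 'good positive seed', negative the opposite."""
    pos_c = centroid([vectors[w] for w in pos_seeds])
    neg_c = centroid([vectors[w] for w in neg_seeds])
    # Closeness to one seed set minus closeness to the opposing set.
    polarity = cosine(vectors[word], pos_c) - cosine(vectors[word], neg_c)
    # Weight by (log) frequency so that common words are preferred.
    return polarity * math.log(1 + counts[word])

# Toy embeddings and counts, invented for this example.
vectors = {
    "good": [1.0, 0.1], "love": [0.9, 0.2],
    "bad": [-1.0, 0.1], "hate": [-0.9, 0.2],
    "great": [0.95, 0.15], "awful": [-0.95, 0.15], "weather": [0.0, 1.0],
}
counts = {"great": 50, "awful": 40, "weather": 200}

scores = {w: seed_suitability(w, vectors, counts, ["good", "love"], ["bad", "hate"])
          for w in counts}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy data "weather" is very frequent but equally close to both seed sets, so it scores near zero and would not be suggested as a seed for either polarity.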