EmoTFIDF

Lexicon + TF-IDF emotion features (V1), hybrid transformer support, and V2 interpretable lexical evidence (EmoTFIDFv2). Lexicon: research use only.

Maintainers

mmsa12

Project description

EmoTFIDF is a lexicon-based emotion detection library built on the National Research Council Canada (NRC) lexicon. This package is for research purposes only. Source: [lexicons for research](http://sentiment.nrc.ca/lexicons-for-research/)

#This library provides two types of emotion scores:

1- Lexicon-based emotions, which count how often each emotion occurs according to the lexicon.
2- TFIDF-weighted emotions, which integrate TFIDF to add context to the emotions.
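To make the two scoring types concrete, here is a minimal, standard-library sketch of the general idea. It is not the package's implementation: the toy lexicon, the function names, and the smoothed IDF formula are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Toy lexicon for illustration only (hypothetical); the NRC lexicon is far larger.
TOY_LEXICON = {
    "happy": ["joy", "positive"],
    "great": ["joy", "positive"],
    "sad": ["sadness", "negative"],
    "angry": ["anger", "negative"],
}

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def emotion_frequencies(text, lexicon=TOY_LEXICON):
    """Type 1: count lexicon hits per emotion, normalised by total hits."""
    counts = Counter()
    for token in tokenize(text):
        for emotion in lexicon.get(token, []):
            counts[emotion] += 1
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()} if total else {}

def idf_table(corpus):
    """Smoothed inverse document frequency for every token in the corpus."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}

def emotion_tfidf(text, idf, lexicon=TOY_LEXICON):
    """Type 2: weight each lexicon hit by the token's TF-IDF in this text."""
    tokens = tokenize(text)
    scores = Counter()
    for token, count in Counter(tokens).items():
        weight = (count / len(tokens)) * idf.get(token, 0.0)
        for emotion in lexicon.get(token, []):
            scores[emotion] += weight
    return dict(scores)

corpus = ["I am happy today", "I am sad today", "I am angry and sad"]
text = "I am happy, so happy, but a bit sad"
print(emotion_frequencies(text))
print(emotion_tfidf(text, idf_table(corpus)))
```

In this sketch "happy" is rarer in the corpus than "sad", so the TFIDF-weighted scores favour joy more strongly than the plain frequency counts do.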

#Installation

pip install EmoTFIDF

EmoTFIDF V2 (interpretable evidence layer)

The implementation lives in the EmoTFIDF.evidence package (from EmoTFIDF.evidence import EmoTFIDFv2). The class name EmoTFIDFv2 marks the second-generation API; the module name describes what it does.

V2 is a parallel API for research and tooling: it is meant as an interpretable lexical + TF-IDF evidence and feature module (explanations, richer vectors, prompt exports, and a light verifier for a proposed emotion label). It does not try to replace transformer baselines; see docs/emotfidf_v2_notes.md for design notes and suggested benchmarks.

Negation uses a fixed cue list and a short token window before each lexicon hit; contributions are scaled (and may flip sign) in a transparent, rule-based way. Intensifiers / downtoners apply simple multipliers in a short window before the affect token. The verifier surfaces lexical alignment only—it should not be read as semantic ground truth.
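The rule-based behaviour described above can be pictured with a small sketch. Everything in it is an assumption for illustration: the cue list, the multipliers, the window size, the negation scale, and the function name `score_with_rules` are hypothetical and are not taken from the EmoTFIDF.evidence package.

```python
# Hypothetical cue lists and multipliers, for illustration only.
NEGATION_CUES = {"not", "no", "never", "without"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
DOWNTONERS = {"slightly": 0.5, "somewhat": 0.7}
TOY_LEXICON = {"happy": ("joy", 1.0), "sad": ("sadness", 1.0)}

def score_with_rules(tokens, window=3, negation_scale=-0.5):
    """Return one contribution record per lexicon hit.

    The `window` tokens before each hit are scanned: intensifiers and
    downtoners multiply the weight, and a negation cue scales it by
    `negation_scale` (a negative value flips the sign).
    """
    contributions = []
    for i, token in enumerate(tokens):
        if token not in TOY_LEXICON:
            continue
        emotion, weight = TOY_LEXICON[token]
        context = tokens[max(0, i - window):i]
        for word in context:
            weight *= INTENSIFIERS.get(word, DOWNTONERS.get(word, 1.0))
        negated = any(word in NEGATION_CUES for word in context)
        if negated:
            weight *= negation_scale
        contributions.append(
            {"token": token, "emotion": emotion, "weight": weight, "negated": negated}
        )
    return contributions

print(score_with_rules("i am very happy today".split()))
print(score_with_rules("i am not very happy today".split()))
```

Because each record keeps the token, the applied weight, and the negation flag, the final score can be traced back to individual rules, which is the sense of "transparent" used above.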

Evidence API quick example

from EmoTFIDF.evidence import EmoTFIDFv2

corpus = [
    "I am happy today and everything feels great.",
    "I am not happy today and everything feels wrong.",
    "I feel sad and disappointed about the news.",
]
engine = EmoTFIDFv2()
engine.fit(corpus)

text = "I am very happy today!"
analysis = engine.analyze(text)
print(analysis.dominant_emotions)
print(analysis.to_dict()["negation_hits"])

print(engine.verify_label(text, "joy"))
print(engine.to_prompt_features(text))

You can also import the class from the package root: from EmoTFIDF import EmoTFIDFv2.

Compare legacy V1 vs evidence API

From the repository root, run python experiments/compare_v1_v2.py to print side-by-side dominant labels, L1 distance on the seven emotion scores, and cosine similarity (scores use different formulas, so metrics are indicative). Pytest coverage: pytest tests/test_v1_v2_compare.py.
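For readers unfamiliar with the two metrics, the following sketch shows how L1 distance and cosine similarity can be computed between two emotion score dictionaries. The helper names and the sample scores are hypothetical; the comparison script itself may compute them differently.

```python
import math

def l1_distance(a, b):
    """Sum of absolute differences over the union of emotion labels."""
    labels = set(a) | set(b)
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in labels)

def cosine_similarity(a, b):
    """Cosine of the angle between two score vectors; 0.0 if either is all zeros."""
    labels = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in labels)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up scores standing in for V1 and V2 output on the same text.
v1_scores = {"joy": 0.5, "sadness": 0.5}
v2_scores = {"joy": 1.0}
print(l1_distance(v1_scores, v2_scores))
print(cosine_similarity(v1_scores, v2_scores))
```

Cosine similarity ignores overall scale while L1 distance does not, which is one reason the two numbers are only indicative when the underlying formulas differ.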

Curated regression benchmark (pre–full benchmark)

Run python experiments/benchmark_v1_v2_regression.py for a small JSON report on dominant agreement, abstention, negation, explanation token previews, and verifier calibration fields. This is a regression gate only—not a paper-scale benchmark. Pytest: pytest tests/test_benchmark_regression_smoke.py.

#List of emotions

- fear
- anger
- anticipation
- trust
- surprise
- positive
- negative
- sadness
- disgust
- joy

#Example of usage

##Get emotions from a sentence

from EmoTFIDF.EmoTFIDF import EmoTFIDF

comment = "I had a GREAT week, thanks to YOU! I am very happy today."

emTFIDF = EmoTFIDF()

emTFIDF.set_text(comment)
print(emTFIDF.em_frequencies)

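##Get emotions weighted by TFIDF (requires a context corpus)

Below is a pandas example. It assumes you have a DataFrame df with a 'text' column holding tweets or other texts, and that you want emotion scores for each row.

import pandas as pd
from EmoTFIDF.EmoTFIDF import EmoTFIDF

emTFIDF = EmoTFIDF()

def getEmotionsTFIDF(s, emTFIDF):
    # Score one text against the TFIDF context computed below
    emTFIDF.set_text(s)
    emTFIDF.get_emotfidf()
    return emTFIDF.em_tfidf

# Compute TFIDF over the whole corpus to provide the context
emTFIDF.compute_tfidf(df['text'])
df['emotions'] = df.apply(lambda x: getEmotionsTFIDF(x['text'], emTFIDF), axis=1)
# Expand the per-row emotion dicts into one column per emotion
df2 = df['emotions'].apply(pd.Series)
final_df = pd.concat([df, df2], axis=1)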

#Plotting Emotion Distribution

You can visualize the distribution of emotions using the plot_emotion_distribution method:

from EmoTFIDF.EmoTFIDF import EmoTFIDF

comment = "I had a GREAT week, thanks to YOU! I am very happy today."

emTFIDF = EmoTFIDF()
emTFIDF.set_text(comment)
emTFIDF.plot_emotion_distribution()

#Plotting Top TFIDF Words

To visualize the top N words by their TFIDF scores:

import pandas as pd
from EmoTFIDF.EmoTFIDF import EmoTFIDF

# Assuming df is your DataFrame and it has a column 'text'
emTFIDF = EmoTFIDF()
emTFIDF.compute_tfidf(df['text'])
emTFIDF.plot_top_tfidf(top_n=20)

#Plotting TFIDF Weighted Emotion Scores

To visualize the TFIDF weighted emotion scores:

from EmoTFIDF.EmoTFIDF import EmoTFIDF

comment = "I had a GREAT week, thanks to YOU! I am very happy today."

emTFIDF = EmoTFIDF()
emTFIDF.set_text(comment)
emTFIDF.get_emotfidf()
emTFIDF.plot_emotfidf()

##Update 1.4.2

Integrated a hybrid method for emotion detection.

New Features:

get_hybrid_emotions(text): Combines the transformer-based and TFIDF-weighted methods into a single set of emotion scores, with the aim of improving accuracy.

import pandas as pd
from EmoTFIDF.EmoTFIDF import EmoTFIDF

# Sample comments
comments = [
    "I had a GREAT week, thanks to YOU! I am very happy today.",
    "This is terrible. I'm so angry and sad right now.",
    "Looking forward to the weekend! Feeling excited and joyful.",
    "I am disgusted by the recent events. It's just awful.",
    "What a surprising turn of events! I didn't see that coming.",
]

# Create an instance of EmoTFIDF
emTFIDF = EmoTFIDF()

# Lists to store results
lexicon_emotions = []
tfidf_emotions = []
transformer_emotions = []
hybrid_emotions = []

# Process each comment and collect emotion frequencies and hybrid emotion scores
for comment in comments:
    emTFIDF.set_text(comment)
    lexicon_emotions.append(emTFIDF.em_frequencies)
    emTFIDF.compute_tfidf([comment])
    tfidf_emotions.append(emTFIDF.get_emotfidf())
    transformer_emotions.append(emTFIDF.get_transformer_emotions(comment))
    hybrid_emotions.append(emTFIDF.get_hybrid_emotions(comment))

# Create a DataFrame for the comments
df = pd.DataFrame(comments, columns=['text'])

# Add lexicon-based emotion frequencies to the DataFrame
df['lexicon_emotions'] = lexicon_emotions

# Add TFIDF weighted emotion scores to the DataFrame
df['tfidf_emotions'] = tfidf_emotions

# Add transformer-based emotion scores to the DataFrame
df['transformer_emotions'] = transformer_emotions

# Add hybrid emotion scores to the DataFrame
df['hybrid_emotions'] = hybrid_emotions

# Print the DataFrame with the new columns
print(df)
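The README does not say how get_hybrid_emotions combines the two sets of scores. Purely as an illustration of one plausible approach, the sketch below blends two score dictionaries with a weighted average after normalising each to sum to 1. The function names and the `alpha` parameter are hypothetical and are not part of EmoTFIDF.

```python
def normalise(scores):
    """Scale non-negative scores so they sum to 1 (unchanged if the sum is 0)."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total > 0 else dict(scores)

def combine_scores(tfidf_scores, transformer_scores, alpha=0.5):
    """Weighted average of two score dicts over the union of their labels.

    alpha is the weight given to the TFIDF-based scores; the remaining
    (1 - alpha) goes to the transformer-based scores.
    """
    a = normalise(tfidf_scores)
    b = normalise(transformer_scores)
    labels = sorted(set(a) | set(b))
    return {k: alpha * a.get(k, 0.0) + (1 - alpha) * b.get(k, 0.0) for k in labels}

# Made-up inputs; the label sets deliberately differ.
tfidf_scores = {"joy": 2.0, "trust": 2.0}
transformer_scores = {"joy": 0.9, "sadness": 0.1}
print(combine_scores(tfidf_scores, transformer_scores, alpha=0.5))
```

Taking the union of labels matters because a lexicon and a transformer model may not share the same emotion categories.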

##Update 1.4.0

Integrated transformer-based models for advanced emotion detection.

New Features:

get_transformer_emotions(text): Uses a transformer model to get emotion scores.

plot_emotion_distribution(): Visualizes the distribution of emotions in the text using the transformer model.

import pandas as pd
from EmoTFIDF.EmoTFIDF import EmoTFIDF

# Sample comments
comments = [
    "I had a GREAT week, thanks to YOU! I am very happy today.",
    "This is terrible. I'm so angry and sad right now.",
    "Looking forward to the weekend! Feeling excited and joyful.",
    "I am disgusted by the recent events. It's just awful.",
    "What a surprising turn of events! I didn't see that coming.",
]

# Create an instance of EmoTFIDF
emTFIDF = EmoTFIDF()

# Lists to store results
lexicon_emotions = []
transformer_emotions = []

# Process each comment and collect emotion frequencies and transformer emotion scores
for comment in comments:
    emTFIDF.set_text(comment)
    lexicon_emotions.append(emTFIDF.em_frequencies)
    transformer_emotions.append(emTFIDF.get_transformer_emotions(comment))

# Create a DataFrame for the comments
df = pd.DataFrame(comments, columns=['text'])

# Add lexicon-based emotion frequencies to the DataFrame
df['lexicon_emotions'] = lexicon_emotions

# Add transformer-based emotion scores to the DataFrame
df['transformer_emotions'] = transformer_emotions

# Print the DataFrame with the new columns
print(df)

# Visualize the transformer-based emotion scores for a sample comment
sample_comment = "I had a GREAT week, thanks to YOU! I am very happy today."
transformer_emotions = emTFIDF.get_transformer_emotions(sample_comment)

# Plot the transformer-based emotion scores
import matplotlib.pyplot as plt
import seaborn as sns

def plot_transformer_emotion_distribution(emotions):
    labels = list(emotions.keys())
    scores = list(emotions.values())

    plt.figure(figsize=(10, 5))
    sns.barplot(x=labels, y=scores)
    plt.title('Transformer-based Emotion Scores')
    plt.xlabel('Emotions')
    plt.ylabel('Scores')
    plt.show()

plot_transformer_emotion_distribution(transformer_emotions)

##Update 1.3.0

Introduced new plotting features to visualize the distribution of emotions, top TFIDF words, and TFIDF weighted emotion scores.

New Methods:

plot_emotion_distribution(): Visualizes the distribution of emotions in the text.

plot_top_tfidf(top_n=20): Visualizes the top N words by their TFIDF scores.

plot_emotfidf(): Visualizes the TFIDF weighted emotion scores.

These features enhance the interpretability of the emotion analysis by providing insightful visualizations.

##Update 1.0.7

Thanks to artofchores from Reddit for the feedback.

Added a set_lexicon_path option for using your own lexicon. Remember to keep the same structure as the original emotions lexicon.

emTFIDF.set_lexicon_path("other_lexicon.json")

##Update 1.1.1

Updated the lexicon database with some help from ChatGPT.

Release history

- 2.0.1: Apr 18, 2026 (this version)
- 1.4.2: Jul 9, 2024
- 1.4.1: Jul 9, 2024
- 1.4.0: Jul 9, 2024
- 1.3.0: Jul 9, 2024
- 1.2.8: Jul 9, 2024
- 1.2.7: Jul 9, 2024
- 1.2.6: Jul 8, 2024
- 1.2.5: Jul 8, 2024
- 1.2.3: Jul 8, 2024
- 1.2.2: Jul 8, 2024
- 1.2.1: Jul 8, 2024
- 1.2.0: Jul 8, 2024
- 1.1.10: Jul 8, 2024
- 1.1.5: Apr 20, 2023
- 1.1.4: Apr 19, 2023
- 1.1.3: Apr 18, 2023
- 1.1.2: Apr 18, 2023
- 1.1.1: Apr 18, 2023
- 1.1.0: Apr 17, 2023
- 1.0.12: Apr 15, 2023
- 1.0.11: Apr 15, 2023
- 1.0.10: Mar 5, 2021
- 1.0.9: Mar 5, 2021
- 1.0.8: Mar 5, 2021
- 1.0.7: Mar 4, 2021
- 1.0.6: Mar 3, 2021
- 1.0.5: Mar 2, 2021
- 1.0.4: Mar 2, 2021
- 1.0.3: Mar 2, 2021
- 1.0.2: Mar 2, 2021
- 1.0.1: Mar 2, 2021
- 1.0.0: Mar 2, 2021

Download files

Source Distribution

emotfidf-2.0.1.tar.gz (101.7 kB)

Uploaded Apr 18, 2026 Source

Built Distribution

emotfidf-2.0.1-py3-none-any.whl (105.6 kB)

Uploaded Apr 18, 2026 Python 3

File details

Details for the file emotfidf-2.0.1.tar.gz.

File metadata

Download URL: emotfidf-2.0.1.tar.gz
Upload date: Apr 18, 2026
Size: 101.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for emotfidf-2.0.1.tar.gz
Algorithm	Hash digest
SHA256	`28258aaa1b070bf93f1c1e049b9b75e19e97c5e308b7733b95955459cbe1ba23`
MD5	`68f13170d22755cd0c04ed2efd695454`
BLAKE2b-256	`8d468922fb3393a0cdf365e33e213bba999ba74de53cfc1c98f49fff695de10e`
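
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: the helper names are hypothetical, and the local path is an assumption about where the file was saved.

```python
import hashlib
import os

# SHA256 digest for emotfidf-2.0.1.tar.gz, copied from the table above.
EXPECTED_SHA256 = "28258aaa1b070bf93f1c1e049b9b75e19e97c5e308b7733b95955459cbe1ba23"

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()

# Assumed download location; adjust to wherever the file was saved.
local_path = "emotfidf-2.0.1.tar.gz"
if os.path.exists(local_path):
    print("hash matches:", matches_expected(local_path))
else:
    print("file not found:", local_path)
```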

Provenance

The following attestation bundles were made for emotfidf-2.0.1.tar.gz:

Publisher: ci.yml on mmsa/EmoTFIDF

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: emotfidf-2.0.1.tar.gz
- Subject digest: 28258aaa1b070bf93f1c1e049b9b75e19e97c5e308b7733b95955459cbe1ba23
- Sigstore transparency entry: 1338645544
- Sigstore integration time: Apr 18, 2026
Source repository:
- Permalink: mmsa/EmoTFIDF@c57b06ea499af7a6eeed044b67657e4da67b4548
- Branch / Tag: refs/tags/v2.0.1
- Owner: https://github.com/mmsa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@c57b06ea499af7a6eeed044b67657e4da67b4548
- Trigger Event: push

File details

Details for the file emotfidf-2.0.1-py3-none-any.whl.

File metadata

Download URL: emotfidf-2.0.1-py3-none-any.whl
Upload date: Apr 18, 2026
Size: 105.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for emotfidf-2.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`793490a590858f01d457e66054dd58032ecc9c2e66cca8b3401c3bed6b9d0a71`
MD5	`8943390bd01172dae56a52aa87236bc6`
BLAKE2b-256	`13eee2ddfbd938a552f9bb5da8fdf0df3c09be6c40abec00c5eafe7206dc1a9c`

Provenance

The following attestation bundles were made for emotfidf-2.0.1-py3-none-any.whl:

Publisher: ci.yml on mmsa/EmoTFIDF

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: emotfidf-2.0.1-py3-none-any.whl
- Subject digest: 793490a590858f01d457e66054dd58032ecc9c2e66cca8b3401c3bed6b9d0a71
- Sigstore transparency entry: 1338645545
- Sigstore integration time: Apr 18, 2026
Source repository:
- Permalink: mmsa/EmoTFIDF@c57b06ea499af7a6eeed044b67657e4da67b4548
- Branch / Tag: refs/tags/v2.0.1
- Owner: https://github.com/mmsa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@c57b06ea499af7a6eeed044b67657e4da67b4548
- Trigger Event: push
