A KeyBERT-style negative sentiment and keyword extractor for workforce intelligence and marketing analysis


KeyNeg

A KeyBERT-style Negative Sentiment and Keyword Extractor for Workforce Intelligence

Author: Kaossara Osseni
Email: admin@grandnasser.com

KeyNeg extracts negative keywords, frustration indicators, and discontent signals from text. It is designed for analyzing employee feedback, forum discussions, customer reviews, and similar free-text sources.

Installation

# Install from PyPI
pip install keyneg

# With the Streamlit app (quote the extra so shells such as zsh don't glob the brackets)
pip install "keyneg[app]"

# With the polarity classifier (DistilBERT-SST2 ONNX, ~250 MB on first run)
pip install "keyneg[polarity]"

# Everything
pip install "keyneg[all]"

What's new: see "What's new in 1.3" at the bottom. Headlines: real polarity classification (polarity_filter=True), negation-aware detectors ("I'm not quitting" no longer trips departure intent), score-bound fixes, deepcopy-isolated taxonomies per instance, and a 100+ test suite.

Quick Start

from keyneg import KeyNeg

# Initialize (uses all-mpnet-base-v2 by default)
kn = KeyNeg()

# Extract negative sentiments
sentiments = kn.extract_sentiments(
    "I'm frustrated with the constant micromanagement and lack of recognition"
)
print(sentiments)
# [('micromanagement', 0.72), ('frustration', 0.68), ('lack of recognition', 0.65), ...]

# Extract negative keywords
keywords = kn.extract_keywords(
    "The toxic culture and burnout is unbearable"
)
print(keywords)
# [('toxic culture', 0.81), ('burnout', 0.75), ('unbearable', 0.62), ...]

# Full analysis (topic-similarity only — fast, no extra deps)
result = kn.analyze("My manager never listens and I'm thinking of quitting")
print(result)
# {
#     'keywords': [...],
#     'sentiments': [...],
#     'top_sentiment': 'poor leadership',
#     'topic_match_score': 0.65,
#     'negativity_score': 0.65,           # alias of topic_match_score for back-compat
#     'polarity_score': 0.0,              # 0 until polarity_filter is on
#     'polarity_filter_applied': False,
#     'negative_sentences': [],
#     'categories': ['work_environment_culture', 'job_satisfaction']
# }

# With real polarity classification (requires `pip install keyneg[polarity]`):
result = kn.analyze(
    "I had a great session about preventing burnout today.",
    polarity_filter=True,
)
# {
#     ...,
#     'polarity_score':  0.82,            # net-positive sentence ⇒ classifier says non-negative
#     'polarity_filter_applied': True,
#     'topic_match_score': 0.0,           # nothing tagged because nothing was negative
#     'sentiments': [],
#     'keywords': [],
# }
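
The polarity-first flow (split, classify, filter, then tag) can be pictured with a small stand-in. This is a sketch under stated assumptions, not KeyNeg's implementation: `toy_polarity`, `split_sentences`, and `polarity_first` are hypothetical names, the word-list scorer only stands in for the DistilBERT-SST2 classifier, and averaging sentence scores into `polarity_score` is an assumption made for illustration.

```python
import re
from typing import Callable, Dict, List

# Toy word lists; a real classifier replaces these entirely.
POSITIVE = {"great", "love", "helpful"}
NEGATIVE = {"toxic", "unbearable", "hate"}


def toy_polarity(sentence: str) -> float:
    """Hypothetical stand-in classifier: signed score in [-1, 1]."""
    words = re.findall(r"[a-z']+", sentence.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def polarity_first(
    text: str, classify: Callable[[str], float] = toy_polarity
) -> Dict[str, object]:
    """Split, classify each sentence, and keep only the negative ones."""
    scored = [(s, classify(s)) for s in split_sentences(text)]
    negative = [s for s, score in scored if score < 0]
    mean = sum(score for _, score in scored) / len(scored) if scored else 0.0
    # Only `negative` would be handed on to keyword/sentiment tagging.
    return {"polarity_score": mean, "negative_sentences": negative}
```

With this shape, a sentence such as "I had a great session about preventing burnout today." never reaches the tagging step, which is why its topical overlap with "burnout" no longer produces a match.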

Score interpretation — read this once

Field	Range	Meaning
`topic_match_score`	[0, 1]	Mean cosine similarity of the doc to detected negative-sentiment labels. Topical overlap with negative themes — not polarity.
`polarity_score`	[-1, 1]	Signed polarity from the classifier (values above 0 = positive tone, values below 0 = negative tone). Populated only when `polarity_filter=True`.
`negativity_score`	[0, 1]	Backward-compat alias of `topic_match_score`. Prefer the new name in new code.

The earlier negativity_score name overstated what cosine similarity to negative-sounding labels can tell you. Without polarity_filter=True the score still measures topical overlap; with it on, you also get a real polarity reading from a fine-tuned DistilBERT-SST2 classifier.
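
To make "mean cosine similarity to detected labels" concrete, here is a minimal sketch using toy 2-D vectors in place of real sentence embeddings. The function names and the rule that a label counts as "detected" when its similarity reaches the threshold are assumptions for illustration, not KeyNeg's internals.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)


def topic_match_score(
    doc_vec: Sequence[float],
    label_vecs: Dict[str, Sequence[float]],
    threshold: float = 0.3,
) -> float:
    """Mean similarity over labels at or above the threshold, clamped to [0, 1]."""
    sims = [cosine(doc_vec, vec) for vec in label_vecs.values()]
    detected = [s for s in sims if s >= threshold]
    if not detected:
        return 0.0
    return min(1.0, max(0.0, sum(detected) / len(detected)))


# Toy vectors standing in for embeddings.
labels = {
    "burnout": [1.0, 0.0],
    "micromanagement": [1.0, 1.0],
    "overpriced": [0.0, 1.0],
}
score = topic_match_score([1.0, 0.0], labels)
```

Nothing in this calculation looks at tone: a document about preventing burnout sits just as close to the "burnout" vector as a complaint about it, which is the reason the score is described as topical overlap rather than polarity.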

Features

Sentiment Extraction

Extract predefined negative sentiment categories:

sentiments = kn.extract_sentiments(
    text,
    top_n=5,           # Number of results
    threshold=0.3,     # Minimum similarity score
    diversity=0.0      # MMR diversity (0-1)
)
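
The `diversity` parameter refers to Maximal Marginal Relevance (MMR). A minimal sketch of the standard MMR selection loop follows, with toy vectors instead of embeddings; `mmr_select` is a hypothetical helper and KeyNeg's own scoring details may differ.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)


def mmr_select(
    doc_vec: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_n: int = 5,
    diversity: float = 0.0,
) -> List[Tuple[str, float]]:
    """Greedy MMR: trade relevance to the document against redundancy."""
    relevance = {name: cosine(doc_vec, vec) for name, vec in candidates.items()}
    remaining = set(candidates)
    selected: List[str] = []

    def mmr_score(name: str) -> float:
        # Redundancy is the highest similarity to anything already picked.
        redundancy = max(
            (cosine(candidates[name], candidates[s]) for s in selected),
            default=0.0,
        )
        return (1 - diversity) * relevance[name] - diversity * redundancy

    while remaining and len(selected) < top_n:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [(name, relevance[name]) for name in selected]


toy = {
    "burnout": [1.0, 0.0],
    "exhaustion": [0.9, 0.1],   # near-duplicate of "burnout"
    "underpaid": [0.6, 0.8],
}
plain = [name for name, _ in mmr_select([1.0, 0.0], toy, top_n=2, diversity=0.0)]
diverse = [name for name, _ in mmr_select([1.0, 0.0], toy, top_n=2, diversity=0.7)]
```

With `diversity=0.0` the two most relevant labels win even though they overlap; raising it penalizes the near-duplicate, so a less similar label takes the second slot.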

Keyword Extraction

Extract negative keywords from both the built-in taxonomy and the document itself:

keywords = kn.extract_keywords(
    text,
    top_n=10,
    threshold=0.25,
    keyphrase_ngram_range=(1, 2),
    use_taxonomy=True,
    diversity=0.0
)
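
`keyphrase_ngram_range=(1, 2)` means candidate phrases of one or two words are drawn from the document. The sketch below shows one simple way to generate such candidates; `candidate_ngrams` is a hypothetical helper, and the tokenization is an assumption (KeyBERT itself delegates this step to scikit-learn's CountVectorizer, and KeyNeg may do the same or something different).

```python
import re
from typing import List, Tuple


def candidate_ngrams(text: str, ngram_range: Tuple[int, int] = (1, 2)) -> List[str]:
    """Return unique word n-grams in first-seen order, shortest sizes first."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    low, high = ngram_range
    seen = set()
    phrases: List[str] = []
    for n in range(low, high + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase not in seen:
                seen.add(phrase)
                phrases.append(phrase)
    return phrases


print(candidate_ngrams("The toxic culture"))
# ['the', 'toxic', 'culture', 'the toxic', 'toxic culture']
```

Each candidate would then be embedded and scored against the document, so widening the range to (1, 3) increases both recall and the number of embeddings to compute.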

Batch Processing

Efficiently process multiple documents:

docs = ["Comment 1...", "Comment 2...", "Comment 3..."]

# Batch analysis
results = kn.analyze_batch(docs, show_progress=True)

# Or individually
keywords_batch = kn.extract_keywords_batch(docs)
sentiments_batch = kn.extract_sentiments_batch(docs)

Special Detectors (negation-aware as of v1.2)

All three detectors run a token-level negation-scope analysis before matching. Phrases that fall inside a negation window (not, no, never, without, contractions like don't/can't, etc.) are skipped.

Departure Intent Detection:

kn.detect_departure_intent("I'm updating my resume and interviewing")
# {'detected': True, 'confidence': 0.67, 'signals': ['updating resume', 'interviewing']}

kn.detect_departure_intent("I'm not quitting")               # simple negation
# {'detected': False, 'confidence': 0.0, 'signals': []}

kn.detect_departure_intent("He's no longer thinking about quitting")  # multi-word
# {'detected': False, 'confidence': 0.0, 'signals': []}

kn.detect_departure_intent("I am not never quitting tomorrow")   # double-negative cancels
# {'detected': True, 'confidence': 0.33, 'signals': ['quitting']}
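
The behavior in these examples can be approximated with a short window-based check. This is a simplified sketch, not KeyNeg's algorithm: `is_negated`, the negator list, and the single-token target are assumptions, and the 4-token window is taken from the Limitations section.

```python
import re
from typing import Iterable

# Small illustrative negator list; a real one would be much longer.
NEGATORS = {"not", "no", "never", "without", "don't", "can't", "won't", "isn't"}
WALLS = {".", ",", "!", "?"}  # clause boundaries that end the negation scope


def is_negated(
    text: str,
    target: str,
    window: int = 4,
    extra_negators: Iterable[str] = (),
) -> bool:
    """True if an odd number of negators precede `target` within the window."""
    negators = NEGATORS | {tok.lower() for tok in extra_negators}
    tokens = re.findall(r"[a-z']+|[.,!?]", text.lower())
    target = target.lower()
    if target not in tokens:
        return False
    idx = tokens.index(target)
    count = 0
    # Walk backwards from the target, stopping at the first wall.
    for tok in reversed(tokens[max(0, idx - window):idx]):
        if tok in WALLS:
            break
        if tok in negators:
            count += 1
    return count % 2 == 1  # two negators cancel each other out


print(is_negated("I'm not quitting", "quitting"))                  # True
print(is_negated("I am not never quitting tomorrow", "quitting"))  # False
```

Passing `extra_negators=["notwithstanding"]` plays the same role here as the `extra_negation_tokens` constructor argument does in the library.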

Domain-specific negators — for legal/regulatory text:

kn = KeyNeg(extra_negation_tokens=["notwithstanding"])
kn.detect_escalation_risk("Notwithstanding any lawyer involvement")
# {'detected': False, 'risk_level': 'low', 'signals': []}

Escalation Risk Detection:

kn.detect_escalation_risk("I'm contacting my lawyer about this")
# {'detected': True, 'risk_level': 'medium', 'signals': ['contact my lawyer']}

kn.detect_escalation_risk("I'm not contacting any lawyer")  # negation-aware
# {'detected': False, 'risk_level': 'low', 'signals': []}

Intensity Analysis:

kn.get_intensity("I'm absolutely furious about this")
# {'level': 3, 'label': 'strong', 'indicators': ['furious']}

Taxonomy Categories

KeyNeg includes a comprehensive taxonomy covering:

Work Environment & Culture: toxic culture, harassment, discrimination, favoritism
Management Issues: micromanagement, poor leadership, lack of direction
Recognition & Value: undervalued, unappreciated, credit stolen
Workload & Burnout: exhaustion, overwhelmed, unrealistic deadlines
Compensation: underpaid, pay disparity, poor benefits
Career Development: no growth, dead end job, glass ceiling
Work-Life Balance: excessive hours, no flexibility
Team Dynamics: conflict, poor collaboration, isolation
Job Satisfaction: low morale, frustration, disengagement
Customer/Product Issues: poor quality, bad service, overpriced

Customization

Add Custom Labels

kn.add_custom_labels(["impostor syndrome", "quiet firing"])

Add Custom Keywords

kn.add_custom_keywords("tech_specific", [
    "pager duty", "on-call nightmare", "technical debt"
])

Use Custom Model

kn = KeyNeg(model="all-MiniLM-L6-v2")  # Faster, slightly less accurate

Utility Functions

from keyneg.utils import (
    highlight_keywords,      # Highlight detected keywords in text
    score_to_severity,       # Convert score to severity label
    aggregate_batch_results, # Aggregate batch statistics
    export_to_json,          # Export results to JSON
    export_batch_to_csv,     # Export batch to CSV
    preprocess_text,         # Clean/preprocess text
    chunk_text,              # Split long text into chunks
)

# Highlight keywords in HTML
highlighted = highlight_keywords(text, keywords, format="html")

# Get severity
severity = score_to_severity(0.75)  # "critical"

# Aggregate batch results
summary = aggregate_batch_results(results)
print(summary['top_sentiments'])
print(summary['avg_negativity_score'])
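
For readers who want to see what such a summary involves, here is a hand-rolled sketch over result dictionaries shaped like the `analyze()` output in Quick Start. It is not the library's `aggregate_batch_results`; the `summarize` name and the exact return shape are assumptions, and the input records are made-up examples.

```python
from collections import Counter
from typing import Dict, List


def summarize(results: List[Dict[str, object]], top_k: int = 3) -> Dict[str, object]:
    """Count the most frequent top sentiments and average the scores."""
    tops = Counter(r["top_sentiment"] for r in results if r.get("top_sentiment"))
    scores = [float(r.get("negativity_score", 0.0)) for r in results]
    average = sum(scores) / len(scores) if scores else 0.0
    return {
        "top_sentiments": tops.most_common(top_k),
        "avg_negativity_score": average,
    }


# Made-up records for illustration only.
fake_results = [
    {"top_sentiment": "poor leadership", "negativity_score": 0.65},
    {"top_sentiment": "burnout", "negativity_score": 0.50},
    {"top_sentiment": "poor leadership", "negativity_score": 0.35},
]
fake_summary = summarize(fake_results)
```

Because `negativity_score` is an alias of `topic_match_score`, an average computed this way describes how strongly the batch overlaps with negative themes, not how negative its tone is.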

Streamlit App

After pip install keyneg[app], launch the interactive UI via the installed entry point:

keyneg-app

(Equivalent to python -m streamlit run keyneg/app.py against the package-internal app module.)

Features:

Single text analysis with detailed results
Batch processing with file upload
Interactive visualizations
Export results to CSV

Use Cases

Employee Survey Analysis: Identify patterns of dissatisfaction across responses
Exit Interview Processing: Extract reasons for departure at scale
Forum Monitoring: Track sentiment on workforce forums (e.g., TheLayoffradar.com, Blind)
Customer Feedback: Analyze product reviews and support tickets
Social Media Monitoring: Track brand sentiment and complaints

API Integration

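from typing import List

from fastapi import FastAPI
from pydantic import BaseModel
from keyneg import KeyNeg

app = FastAPI()
kn = KeyNeg()  # load the embedding model once, at startup

class AnalyzeRequest(BaseModel):
    text: str

class BatchRequest(BaseModel):
    texts: List[str]

@app.post("/analyze")
def analyze(req: AnalyzeRequest):
    # Read the text from the JSON body; a bare `text: str` would be a query parameter
    return kn.analyze(req.text)

@app.post("/analyze_batch")
def analyze_batch(req: BatchRequest):
    return kn.analyze_batch(req.texts)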

Limitations & known gaps

Read this before deploying:

Cosine similarity ≠ polarity. Without polarity_filter=True, a document discussing "burnout prevention" will topically match the burnout label. That's by design: topic_match_score is a topic signal, not a polarity signal. Pass polarity_filter=True (with the polarity extra installed) to get a real polarity reading.
Thresholds are starting points, not constants. 0.25 for keywords and 0.3 for sentiments work for English HR/feedback text. Tune them to your domain — there is no universally calibrated number for sentence-transformer cosine similarity.
Negation handling is window-based, not parser-based. The 4-token window catches "I'm not quitting" and "no plans to leave" reliably but will miss long-range negation across multiple clauses.
Detectors are recall-oriented. They flag candidates for review — they aren't classifiers. Human-in-the-loop review is recommended for any consequential decision.
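
One way to tune thresholds for a new domain is to hand-label a small sample and sweep candidate cut-offs. The sketch below does that for precision and recall; `sweep_thresholds` is a hypothetical helper and the scored examples are made up for illustration, not measurements from KeyNeg.

```python
from typing import List, Sequence, Tuple


def sweep_thresholds(
    scored: Sequence[Tuple[float, bool]],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """For each threshold, return (threshold, precision, recall)."""
    rows = []
    for t in thresholds:
        tp = sum(1 for score, relevant in scored if score >= t and relevant)
        fp = sum(1 for score, relevant in scored if score >= t and not relevant)
        fn = sum(1 for score, relevant in scored if score < t and relevant)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        rows.append((t, precision, recall))
    return rows


# (similarity score, was the keyword judged relevant by a reviewer?)
# Made-up values for illustration only.
labeled = [
    (0.72, True), (0.41, True), (0.33, False),
    (0.28, True), (0.26, False), (0.12, False),
]
for threshold, precision, recall in sweep_thresholds(labeled, [0.25, 0.30, 0.40]):
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Since the detectors are meant to be recall-oriented, a reasonable choice is the highest threshold that keeps recall acceptable on your own labeled sample.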

What's new in 1.3

Major negation upgrade — algorithm ported from ONES-rs (the Rust-based NLP engine in our research repo). New cases handled:
- Multi-word phrases — lack of escalation plans, failed to address harassment, no longer thinking about quitting, by no means, unable to, have no, without any, etc. all open negation scope from the end of the phrase.
- Multi-word walls — on the other hand, despite the fact, even though, in contrast close the negation scope mid-sentence.
- Double-negation cancellation — not never quitting (count = 2) leaves quitting unnegated. not unhappy about escalating leaves escalating unnegated.
- Comma as wall — clause boundary that resets scope, not just ./?/!.
- Verbal negators — refused, prevented, denied, rejected, failed, rarely, hardly, scarcely.
- Domain-specific cues — pass extra_negation_tokens=[...] to the constructor to add custom negators (legal/regulatory text, industry idioms).
Polarity layer (optional, from 1.2) — pip install keyneg[polarity] adds an ONNX DistilBERT-SST2 classifier. Pass polarity_filter=True to analyze() for a polarity-first pipeline (split → classify → filter → tag).
Negation-aware detectors — detect_departure_intent, detect_escalation_risk, and get_intensity honor the upgraded scope analysis.
Score capped at 1.0 — the 1.2× boost on taxonomy-matched candidates can no longer push cosine scores above 1.0.
Deepcopy-isolated taxonomies — add_custom_keywords no longer leaks across instances. all_keywords now reads from the per-instance taxonomy, so customizations actually surface in extraction.
Cached lowercase keyword set — extracted to a property; not rebuilt on every call.
Score field clarification — topic_match_score is the new primary name; negativity_score is kept as an alias.
Packaging cleanup — single pyproject.toml source of truth (no more setup.py); the Streamlit app moved into the package as keyneg/app.py and is launched via the keyneg-app console script.
100+ test suite — pytest with regressions for every fix above, plus GitHub Actions CI on Python 3.9–3.12.
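
The deepcopy-isolation item above addresses a common Python pitfall: instances that share one mutable default taxonomy. A minimal sketch of the pattern follows; the `Extractor` class and `DEFAULT_TAXONOMY` dictionary are hypothetical stand-ins, not KeyNeg's actual classes.

```python
import copy
from typing import Dict, List

# Module-level defaults that every instance starts from.
DEFAULT_TAXONOMY: Dict[str, List[str]] = {
    "workload_burnout": ["exhaustion", "overwhelmed"],
}


class Extractor:
    def __init__(self) -> None:
        # deepcopy duplicates the nested lists too, so edits stay per-instance.
        self.taxonomy = copy.deepcopy(DEFAULT_TAXONOMY)

    def add_custom_keywords(self, category: str, keywords: List[str]) -> None:
        self.taxonomy.setdefault(category, []).extend(keywords)

    @property
    def all_keywords(self) -> List[str]:
        # Read from the per-instance taxonomy so customizations are visible.
        return sorted({kw.lower() for kws in self.taxonomy.values() for kw in kws})


first = Extractor()
second = Extractor()
first.add_custom_keywords("tech_specific", ["pager duty"])
first.add_custom_keywords("workload_burnout", ["crunch"])
```

A shallow copy would not be enough here: both instances would still share the inner `workload_burnout` list, and "crunch" would leak into `second` and into the module-level defaults.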

License

MIT License

Author

Kaossara Osseni
Email: admin@grandnasser.com
GitHub: https://github.com/Osseni94


Release history

1.3.0 (this version): May 6, 2026
1.1.0: Dec 18, 2025
1.0.0: Dec 6, 2025

