LexiDecay is a semi-supervised lexical weighting model for unstructured text. It classifies content by adaptive word-frequency decay and soft lexical scoring. Fast (O(n·m)), language-flexible, and training-free — ideal for topic classification, semantic filtering, and intent detection.
Project description
⚡️ LexiDecay — The Adaptive Lexical Decay Classifier
By Mohammad Taha Gorji
A blazing-fast, semi-supervised text classification algorithm based on adaptive lexical weighting, frequency decay, and probabilistic scoring — all without any training or labeled dataset. LexiDecay is a semi-supervised lexical weighting model for unstructured text. It classifies content by adaptive word-frequency decay and soft lexical scoring. Fast (O(n·m)), language-flexible, and training-free — ideal for topic classification, semantic filtering, and intent detection.
🌌 Algorithm Philosophy & Core Idea
LexiDecay is inspired by the way human cognition evaluates language — not by rigid statistical training, but by dynamically weighting words according to their contextual importance and rarity.
Instead of “learning” through countless iterations, LexiDecay understands by measuring the gravitational pull of words within conceptual clusters.
The algorithm analyzes each category’s text content, counts and weights its tokens, and applies a decay function that reduces the influence of overly common words (like “the”, “of”, “and”).
During classification, it computes soft lexical similarities using adaptive decay, inverse document frequency, and a softmax-based probability normalization.
🧠 Philosophically, LexiDecay reflects a cognitive model of understanding — flexible, intuitive, and progressively self-balancing.
🧩 Scientific Position
| Category | Description |
|---|---|
| Learning Type | Semi-supervised lexical weighting |
| Data Type | Unstructured free text |
| Complexity | O(n × m) — n = words in input, m = number of categories |
| Core Mechanism | Adaptive word-frequency decay + soft lexical scoring |
| Primary Fields | NLP, cognitive AI, text understanding, knowledge extraction |
🚀 Real-World Applications
LexiDecay is suitable for a wide variety of language-intelligent systems:
- 🗂 Topic classification — Distinguish content across domains (e.g. science, art, politics).
- 🎯 Intent detection — Recognize user intentions from text queries or chatbot messages.
- 🧭 Semantic filtering — Filter or route information based on conceptual meaning.
- 🪶 Keyword-based reasoning — Identify thematic or conceptual similarity.
- 🧠 Cognitive AI prototypes — For lightweight, reasoning-like models without deep networks.
⚖️ Advantages Over Classical Models
| Feature | LexiDecay | Classical Models (Naive Bayes, TF-IDF, etc.) |
|---|---|---|
| Training Required | ❌ None — works instantly | ✅ Needs training |
| Computation Speed | ⚡ Extremely fast (O(n·m)) | 🐢 Often slower (training + inference) |
| Flexibility | 🧩 Add or remove categories freely | 🔒 Fixed to trained dataset |
| Data Requirements | 🌱 Works with few samples | 📊 Needs many labeled samples |
| Common Word Handling | 🪶 Auto frequency decay & adaptive weighting | ⚙️ Manual stopword removal |
| Language Support | 🌍 Fully language-independent | ⚠️ Usually language-specific |
| Explainability | 🔍 Transparent lexical logic | 🕳 Often black-box statistics |
💡 LexiDecay combines the interpretability of lexical systems with the adaptability of probabilistic models — no training, no fine-tuning, no waiting.
⚙️ Installation
pip install LexiDecay
That’s it! 🪄
🧱 Getting Started
Below is a full example of how to use LexiDecay from scratch.
from LexiDecay import LexiDecayModel
# 1️⃣ Create a model
m = LexiDecayModel()
# 2️⃣ Add categories (each category can be a string or list of texts)
m.add_category("science", open("science.txt").read())
m.add_category("philosophy", open("philosophy.txt").read())
# 3️⃣ Classify new input
text = "Quantum theories explore the probabilistic structure of the universe."
result = m.classify(text)
print(result["top"]) # ('science', score, probability)
print(result["probs"]) # Probabilities for all categories
🧠 Function Reference & API Details
🔹 add_category(label, content)
Adds or replaces a category.
label:str→ name of the categorycontent:strorList[str]→ text data belonging to that category
Automatically rebuilds the internal vocabulary and frequency statistics.
🔹 classify(input_text, decay=0.5, use_idf=False, auto_common_reduce=True, common_decay=0.7, min_common_mult=0.05, ignore_input_repetitions=False)
Performs text classification and returns a dictionary with:
{
"scores": {label: float, ...},
"probs": {label: float, ...},
"matches": {label: {matched words, stats...}},
"top": (best_label, score, probability)
}
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
decay |
float | 0.5 | Controls how strongly frequent words lose influence (0 = linear, 1 = no decay). |
use_idf |
bool | False | Applies inverse-document-frequency weighting. |
auto_common_reduce |
bool | True | Automatically detects common words and lowers their impact. |
common_decay |
float | 0.7 | Strength of reduction for common words. |
min_common_mult |
float | 0.05 | Minimum multiplier applied to frequent words. |
ignore_input_repetitions |
bool | False | If True, counts each unique input word only once. |
🔹 save_model(path)
Saves the entire model (categories + data) into a .pkl file.
m.save_model("lexidecay.pkl")
🔹 load_model(path)
Loads a model from a .pkl file.
m2 = LexiDecayModel.load_model("lexidecay.pkl")
🌟 Why LexiDecay Feels Different
- Human-like text perception: adaptive decay mimics cognitive salience.
- Instant deployability: no model training — just plug and classify.
- Infinite extendability: add categories anytime, instantly rebuilt.
- Compact and dependency-light: only requires NumPy.
- Transparent math: pure lexical weighting, fully explainable results.
🧬 Example: Multi-category Classification
from LexiDecay import LexiDecayModel
m = LexiDecayModel()
m.add_category("tech", ["AI","Model","AI algorithms", "neural networks", "deep learning"])
m.add_category("art", ["painting", "music", "creativity", "aesthetic beauty"])
m.add_category("sports", ["football", "strength", "competition"])
res = m.classify("New AI model beats humans at creative painting tasks.")
print(res)
# Output → ('art', score, probability)
🧩 Citation
If you use LexiDecay in academic work, please cite:
Mohammad Taha Gorji, LexiDecay: Semi-supervised Lexical Decay Model for Adaptive Text Classification (2025)
🔹 Examples
You can see LexiDecay Examples for some examples:
🪄 Author
Mohammad Taha Gorji Creator of LexiDecay AI Researcher & Cognitive Systems Developer
🖤 License
Apache2 License © 2025 — Mohammad Taha Gorji Open for research, education, and innovation.
“LexiDecay doesn’t learn — it understands.” 🧠✨
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file lexidecay-1.0.0.tar.gz.
File metadata
- Download URL: lexidecay-1.0.0.tar.gz
- Upload date:
- Size: 7.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3d55b6ae5a3f8c9139babb3756e81632f2d43a91657225e62e00952f3b994c8a
|
|
| MD5 |
0085626ac1d6fd9f5e718437ba35238b
|
|
| BLAKE2b-256 |
d23852b79feabf02316fb19e7417ea48aa2907c031387b954bd145207d1c4c97
|
File details
Details for the file lexidecay-1.0.0-py3-none-any.whl.
File metadata
- Download URL: lexidecay-1.0.0-py3-none-any.whl
- Upload date:
- Size: 7.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
04bc2995ef8a4a49e76bb084382e04f8ba022501694bbf5f962b0685c778cc8f
|
|
| MD5 |
1e720e4d3048d5685c5f3dc47248a8f2
|
|
| BLAKE2b-256 |
97938042886365f5c9691cbf4804d01ba6d2e918d475d7e9886941e78f6dc365
|