One-liner NLP utilities -- summarize, classify, extract entities, analyze sentiment -- with rule-based fallbacks and HuggingFace backends.
anynlp
NLP utilities that work out of the box — rule-based by default, HuggingFace/LLM-boosted on demand.
anynlp is a graceful-degradation NLP toolkit. Every task (summarization, classification, NER, sentiment, keyword extraction, similarity, chunking, clustering) has a pure-Python rule-based implementation that runs with zero dependencies. Install the optional transformers extra to use pretrained HuggingFace models, or the llm extra to route any task through anyllm when you need higher quality than the rules provide. It is aimed at RAG pipelines, content analysis, and lightweight text processing.
Built by Viet-Anh Nguyen at NRL.ai.
Why anynlp?
- One-liner API: anynlp.sentiment("I love this") just works, no downloads
- Plugin architecture: every task has rules/hf/llm backends, pick per-call
- Local-first: rule-based backends run entirely in pure Python, zero deps
- Minimal core deps: none! Heavy backends are all optional extras
- Production-ready: dataclass results, streaming, batch processing
Installation
pip install anynlp
For stronger backends:
pip install "anynlp[transformers]"  # HuggingFace pretrained models
pip install "anynlp[llm]"           # route through anyllm (Ollama, OpenAI, ...)
pip install "anynlp[sklearn]"       # KMeans clustering + TF-IDF speedups
pip install "anynlp[all]"           # everything
(The quotes keep shells such as zsh from treating the square brackets as a glob pattern.)
Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)
Quick Start
import anynlp
# 1. Extractive summarization (pure Python, zero deps)
summary = anynlp.summarize("Long article text...", max_sentences=3)
# 2. Sentiment analysis (AFINN lexicon with negation handling)
s = anynlp.sentiment("I absolutely do not love this product")
print(s.label, s.score) # "negative" -0.4
# 3. Named-entity recognition (regex patterns by default)
ents = anynlp.ner("Apple was founded by Steve Jobs in Cupertino in 1976.")
for e in ents:
    print(e.text, e.label) # "Apple" ORG, "Steve Jobs" PERSON, ...
# 4. Zero-shot classification (keyword overlap by default)
result = anynlp.classify(
    "The GPU crashed during training",
    labels=["hardware", "software", "billing"],
)
print(result.label) # "hardware"
# 5. RAG-style chunking
long_text = "Long article text..."  # placeholder: any long string you want to split
chunks = anynlp.chunk(long_text, method="recursive", size=500, overlap=50)
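To make the size and overlap parameters concrete, here is a minimal sketch of a sliding-window splitter. It is not anynlp's implementation: it is a plain fixed-size window rather than the recursive method, the helper name chunk_with_overlap is hypothetical, and it assumes size and overlap are measured in characters, which this README does not state.

```python
def chunk_with_overlap(text, size=500, overlap=50):
    """Split text into windows of at most `size` characters.

    Each window starts `size - overlap` characters after the previous one,
    so consecutive windows share `overlap` characters.
    Assumption: units are characters (anynlp may count differently).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(text):
            break
    return chunks


demo = chunk_with_overlap("abcdefghij", size=4, overlap=1)
print(demo)
```

The overlap means a sentence cut at a window boundary still appears whole in at least one neighbouring chunk, at the cost of indexing some text twice.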
Models & Methods
anynlp has 3 backend tiers — pick based on your needs:
1. Rule-based backend (default, zero dependencies)
- Summarization: Extractive — sentences scored by position, keyword frequency, and length; top-N selected to meet target ratio
- Classification: Keyword overlap with class label tokens
- NER: Regex patterns for emails, URLs, phone numbers, dates, money, percent
- Sentiment: AFINN-style word lexicon with negation handling and intensifiers
- Keywords: TF-IDF-like scoring relative to a built-in stopword list
- Similarity: Jaccard similarity on word sets
- Chunking: Recursive (paragraph→sentence→word), fixed-size, or sentence-boundary methods
- Clustering: TF-IDF + KMeans
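Two of the rule-based methods listed above are simple enough to sketch in a few lines. The code below is an illustration under assumptions, not anynlp's source: the tokenizer (lowercased alphanumeric runs) and the helper names tokenize, jaccard, and classify_by_label_overlap are hypothetical, and ties in the classifier are broken by label order.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity on word sets: |A & B| / |A | B|, in [0.0, 1.0]."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 0.0  # convention chosen here for two empty inputs
    return len(ta & tb) / len(ta | tb)


def classify_by_label_overlap(text, labels):
    """Score each label by how many of its tokens also occur in the text."""
    tokens = tokenize(text)
    scores = {label: len(tokens & tokenize(label)) for label in labels}
    # max() returns the first label with the highest score, so ties go to
    # whichever label comes first in `labels`.
    best = max(labels, key=lambda label: scores[label])
    return best, scores


print(jaccard("the cat sat", "the cat ran"))
print(classify_by_label_overlap("I have a billing question", ["shipping", "billing"]))
```

Both functions depend only on exact token matches, which is why they run without any model downloads. It is also why paraphrases and synonyms score zero, and where the hf and llm backends are expected to help.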
2. HuggingFace backend (optional via [transformers])
- Uses transformers.pipeline() for SOTA model-based tasks
- Models auto-download from HuggingFace Hub
- Supports: summarization (BART), classification (zero-shot), NER (BERT), sentiment
3. LLM backend (optional via [llm], uses anyllm)
- Sends task to any LLM (Ollama, OpenAI, Anthropic) for highest quality
- Structured outputs via JSON mode for entity/keyword extraction
- Best for: complex documents, multi-language, custom schemas
API Reference
| Function | Returns |
|---|---|
| anynlp.summarize(text, max_sentences=3, backend="rules") | Summary text |
| anynlp.classify(text, labels, backend="rules") | ClassificationResult |
| anynlp.ner(text, backend="rules") | list[Entity] |
| anynlp.sentiment(text, backend="rules") | SentimentResult |
| anynlp.keywords(text, top_k=10) | list[(term, score)] |
| anynlp.similarity(a, b) | Float 0.0 - 1.0 |
| anynlp.chunk(text, method="recursive", size=500) | list[str] |
| anynlp.cluster(texts, k=5) | list[int] cluster labels |
CLI Usage
anynlp summarize article.txt --sentences 5
anynlp sentiment "I loved the book"
anynlp ner "Tim Cook visited Paris on 2024-05-01"
anynlp classify "GPU crash" --labels "hardware,software,billing"
anynlp chunk document.txt --method recursive --size 500
Examples
RAG chunking + clustering
import anynlp
# Split documents for a RAG index
with open("book.txt") as f:
    chunks = anynlp.chunk(f.read(), method="recursive", size=500, overlap=50)
# Cluster the chunks to find topic groups (TF-IDF + KMeans)
labels = anynlp.cluster(chunks, k=8)
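Once you have cluster labels, a common next step is to collect the chunks that share a label. The sketch below assumes the returned list holds one integer label per input text, in input order. The API table's "list[int] cluster labels" suggests this, but the README does not spell it out. The helper group_by_label and the sample data are hypothetical stand-ins, so the snippet runs without anynlp installed.

```python
from collections import defaultdict


def group_by_label(items, labels):
    """Group items by their cluster label, preserving input order in each group."""
    if len(items) != len(labels):
        raise ValueError("items and labels must have the same length")
    groups = defaultdict(list)
    for item, label in zip(items, labels):
        groups[label].append(item)
    return dict(groups)


# Stand-in data; in practice these would come from anynlp.chunk / anynlp.cluster.
sample_chunks = ["GPU memory errors", "invoice overdue", "CUDA driver crash"]
sample_labels = [0, 1, 0]

topics = group_by_label(sample_chunks, sample_labels)
for label, members in sorted(topics.items()):
    print(label, members)
```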
Upgrade to HuggingFace when quality matters
import anynlp
# Rule-based: fast, no downloads
s1 = anynlp.sentiment("complicated sentence with irony", backend="rules")
# HuggingFace: slower first call, better accuracy
s2 = anynlp.sentiment("complicated sentence with irony", backend="hf")
# LLM: highest quality, requires anyllm + a provider
s3 = anynlp.sentiment("complicated sentence with irony",
                      backend="llm", model="llama3.1:8b")
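If you want to prefer a stronger backend but still get an answer when its extra is not installed, one option is to try backends in order. The sketch below is an assumption-laden illustration. The helper first_available is hypothetical, the two backends are stand-in functions so the snippet runs on its own, and the exceptions caught (ImportError, RuntimeError) are a guess, since this README does not say what anynlp raises when a backend is unavailable.

```python
def first_available(text, backends):
    """Try each (name, callable) pair in order.

    Returns (name, result) from the first backend that succeeds.
    Assumption: an unavailable backend raises ImportError or RuntimeError.
    """
    errors = []
    for name, fn in backends:
        try:
            return name, fn(text)
        except (ImportError, RuntimeError) as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stand-in backends for illustration only.
def fake_hf(text):
    raise ImportError("transformers is not installed")


def fake_rules(text):
    return "positive" if "love" in text.lower() else "neutral"


used, label = first_available("I love this", [("hf", fake_hf), ("rules", fake_rules)])
print(used, label)
```

With anynlp itself, the callables could be small lambdas such as lambda t: anynlp.sentiment(t, backend="hf"), listed from strongest to cheapest.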
Zero-shot text classification with a local LLM
import anynlp
result = anynlp.classify(
    "My order is 3 weeks late",
    labels=["billing", "shipping", "product", "other"],
    backend="llm",
    model="llama3.1:8b",
)
print(result.label, result.score)
License
MIT (c) Viet-Anh Nguyen