A robust NLP pipeline for stemming, lemmatization, and vectorization

NLPProcessor

Overview

NLPProcessor is an automated, adaptive NLP pipeline that dynamically handles:

Tokenization (Word & Sentence)
Stopword Removal
POS Tagging
Named Entity Recognition (NER)
Text Normalization (Lowercasing, Punctuation Removal, etc.)
Stemming & Lemmatization (via NLTK or spaCy)
Vectorization (TF-IDF or Count Vectorizer)
Dependency Management (Auto-installs missing libraries)
Support for 2D Text Arrays (Processes lists of lists of text)
Defensive Error Handling (Aims to keep running when an underlying library API changes)
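To make the first few stages concrete, here is a minimal sketch of normalization, tokenization, and stopword removal using only the standard library. It is not the package's implementation: the function names and the tiny `STOPWORDS` set are assumptions chosen for illustration, and NLTK or spaCy would normally supply the tokenizer and stopword list.

```python
import string

# Hypothetical stopword list for illustration; NLTK and spaCy ship much larger ones.
STOPWORDS = {"a", "am", "an", "and", "are", "he", "i", "is", "the", "they"}


def normalize(text: str) -> str:
    """Lowercase the text and strip ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


def tokenize(text: str) -> list[str]:
    """Split on whitespace; real tokenizers also handle contractions, URLs, etc."""
    return text.split()


def remove_stopwords(tokens: list[str]) -> list[str]:
    """Drop every token that appears in the stopword set."""
    return [token for token in tokens if token not in STOPWORDS]


def preprocess(text: str) -> list[str]:
    """Chain the three stages in the order listed above."""
    return remove_stopwords(tokenize(normalize(text)))


print(preprocess("I am running, and he is jumping!"))
```

Normalizing before tokenizing keeps the stopword lookup simple, since every token is already lowercase and free of punctuation.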

Features

Automated dependency installation
Works with both NLTK and spaCy
Vectorization support using scikit-learn
Handles single strings and 2D arrays
No human intervention required

Installation

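The package is published on PyPI as pun-nlp and can be installed with pip install pun-nlp. Running a script that uses NLPProcessor then installs any missing dependencies automatically: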

python your_script.py
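Automatic dependency installation of this kind is commonly done by checking which modules can be imported and calling pip for the rest. The sketch below shows that pattern under stated assumptions: the `REQUIRED` map and both helper functions are hypothetical and are not taken from the package source.

```python
import importlib.util
import subprocess
import sys

# Assumed dependency map (import name -> pip package name), for illustration only.
REQUIRED = {"nltk": "nltk", "spacy": "spacy", "sklearn": "scikit-learn"}


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install(packages):
    """Install packages with the pip that belongs to the running interpreter."""
    if packages:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])


absent = missing_modules(REQUIRED)
print("Missing modules:", absent)
# Uncomment to actually install what is missing:
# install([REQUIRED[name] for name in absent])
```

Calling pip through `sys.executable` ensures packages land in the same environment as the script, which matters inside virtual environments.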

Usage

Import and Initialize

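# Replace your_script with the module that defines NLPProcessor.
from your_script import NLPProcessor

# Enable stemming, lemmatization, and TF-IDF vectorization on the spaCy backend.
processor = NLPProcessor(stem=True, lemmatize=True, vectorize="tfidf", backend="spacy")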

Process a Single Text

output = processor.process("running jumped swimming")
print(output)

Process a 2D Array of Text

input_texts = [
    ["I am running", "He is jumping"],
    ["They are swimming", "Dogs are barking"]
]
output = processor.process(input_texts)
print(output)
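One plausible way to accept both a single string and a list of lists is to recurse over the input and apply the per-text step at the leaves. The sketch below assumes that approach; `apply_nested` is a hypothetical helper, and `str.split` stands in for the real processing step.

```python
from typing import Callable, Union

Nested = Union[str, list]


def apply_nested(item: Nested, fn: Callable[[str], object]):
    """Apply fn to every string while preserving the list nesting."""
    if isinstance(item, str):
        return fn(item)
    if isinstance(item, (list, tuple)):
        return [apply_nested(sub, fn) for sub in item]
    raise TypeError(f"expected str or list, got {type(item).__name__}")


input_texts = [
    ["I am running", "He is jumping"],
    ["They are swimming", "Dogs are barking"],
]
# str.split stands in for the real per-text processing step.
print(apply_nested(input_texts, str.split))
```

Because the recursion mirrors the input shape, the output keeps the same row and column layout as the 2D array that was passed in.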

Customization Options

Parameter	Description
`stem`	Enable stemming (default: `False`)
`lemmatize`	Enable lemmatization (default: `False`)
`vectorize`	Choose "tfidf", "count", or `None` (default: `None`)
`tokenize`	Enable word/sentence tokenization (default: `False`)
`remove_stopwords`	Remove stopwords (default: `False`)
`pos_tagging`	Enable Part-of-Speech tagging (default: `False`)
`ner`	Enable Named Entity Recognition (default: `False`)
`normalize`	Lowercase and remove punctuation (default: `False`)
`backend`	Choose "nltk" or "spacy" (default: "nltk")
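The defaults in the table can be summarized as a small options object that rejects unsupported values early. This is a sketch under assumptions: `PipelineOptions` is a hypothetical class that mirrors the documented parameters and is not part of the package.

```python
from dataclasses import dataclass
from typing import Optional

SUPPORTED_VECTORIZERS = ("tfidf", "count")
SUPPORTED_BACKENDS = ("nltk", "spacy")


@dataclass(frozen=True)
class PipelineOptions:
    """Hypothetical container mirroring the documented parameters and defaults."""

    stem: bool = False
    lemmatize: bool = False
    vectorize: Optional[str] = None
    tokenize: bool = False
    remove_stopwords: bool = False
    pos_tagging: bool = False
    ner: bool = False
    normalize: bool = False
    backend: str = "nltk"

    def __post_init__(self):
        # Fail fast on values outside the documented choices.
        if self.vectorize is not None and self.vectorize not in SUPPORTED_VECTORIZERS:
            raise ValueError(f"vectorize must be one of {SUPPORTED_VECTORIZERS} or None")
        if self.backend not in SUPPORTED_BACKENDS:
            raise ValueError(f"backend must be one of {SUPPORTED_BACKENDS}")


options = PipelineOptions(stem=True, vectorize="tfidf", backend="spacy")
print(options)
```

Validating at construction time surfaces a misspelled option such as `vectorize="tf-idf"` immediately rather than partway through processing.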

Check Supported Vectorizers

print(NLPProcessor.supported_vectorizers())  # ['tfidf', 'count']

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details

Release history

0.0.9 (Dec 23, 2025)
0.0.8 (May 12, 2025)
0.0.7 (May 12, 2025)
0.0.6 (Mar 17, 2025)
0.0.5 (Mar 17, 2025)
0.0.4 (Mar 17, 2025), this version
0.0.3 (Mar 17, 2025)
0.0.2 (Mar 17, 2025)
0.0.1 (Mar 17, 2025)

Download files

Source Distribution

pun_nlp-0.0.4.tar.gz (4.3 kB)

Uploaded Mar 17, 2025 Source

Built Distribution

pun_nlp-0.0.4-py3-none-any.whl (4.7 kB)

Uploaded Mar 17, 2025 Python 3

File details

Details for the file pun_nlp-0.0.4.tar.gz.

File metadata

Download URL: pun_nlp-0.0.4.tar.gz
Upload date: Mar 17, 2025
Size: 4.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for pun_nlp-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`f8ebac21a8762e798910fa77e7d05fe69ea62df0cfe170bde986388bf140fc6a`
MD5	`ad5e378545da08c8ab2dda72d36ebe81`
BLAKE2b-256	`f4b4df990ef4e107b88ee3807602ff4f93e10f19ea7f1eb346989dc1ce1e1d5a`

File details

Details for the file pun_nlp-0.0.4-py3-none-any.whl.

File metadata

Download URL: pun_nlp-0.0.4-py3-none-any.whl
Upload date: Mar 17, 2025
Size: 4.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for pun_nlp-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`85079e41ef443c71dbdeecef41a0688aaa3d9b3a0a08afd3379eb3d763c77849`
MD5	`486c36c9ab7108913227ad05ecdc3bd3`
BLAKE2b-256	`c5282d570387cce04e52780319cabc88531bf13c0d223cbe1e0af34fd5dd0775`

pun-nlp 0.0.4
