Lightweight sentence and noun-phrase parsing package for Swarmauri built on TextBlob and NLTK.
Swarmauri Parser TextBlob
swarmauri_parser_textblob provides two Swarmauri text parsing components built
on TextBlob: TextBlobSentenceParser for
sentence segmentation and TextBlobNounParser for noun-phrase extraction. It
is designed for lightweight NLP preprocessing before chunking, retrieval,
classification, or agent workflows.
Why Use Swarmauri Parser TextBlob
- Split long passages into sentence-level documents for downstream processing.
- Extract noun phrases without introducing a larger transformer stack.
- Keep lightweight linguistic preprocessing aligned with the Swarmauri parser interface.
- Use simple NLP enrichment before embeddings, retrieval, or task routing.
FAQ
What parser classes are included?
TextBlobSentenceParser and TextBlobNounParser.
What does the sentence parser return?
A Document per detected sentence, with metadata indicating the parser.
What does the noun parser return?
A single Document containing the original text and a noun_phrases list in metadata.
Does it require NLTK resources?
Yes. The package downloads required NLTK corpora during initialization unless they are already present.
Features
- Sentence segmentation through TextBlobSentenceParser.
- Noun phrase extraction through TextBlobNounParser.
- Fits Swarmauri ingestion and preprocessing workflows using parser-style components.
- Useful for lightweight English NLP pipelines where a smaller dependency stack is preferred.
- Supports Python 3.10, 3.11, 3.12, 3.13, and 3.14.
Installation
uv add swarmauri_parser_textblob
pip install swarmauri_parser_textblob
Optional setup:
python -m textblob.download_corpora
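To see which NLTK resources are still missing before deciding whether to run the optional setup step, a small check along the following lines may help. This is a minimal sketch and not part of the package: the helper name `missing_resources` and the paths in `ASSUMED_RESOURCES` are assumptions, since the exact set of resources the parsers need is not listed here. It relies on `nltk.data.find`, which raises `LookupError` when a resource cannot be located.

```python
from typing import Callable, Iterable, List

# Assumed resource paths; adjust to what your TextBlob and NLTK versions require.
ASSUMED_RESOURCES = ("tokenizers/punkt", "corpora/brown", "corpora/conll2000")


def _nltk_find(path: str) -> None:
    """Default lookup: delegate to nltk.data.find (imported lazily)."""
    import nltk

    nltk.data.find(path)


def missing_resources(
    paths: Iterable[str] = ASSUMED_RESOURCES,
    find: Callable[[str], object] = _nltk_find,
) -> List[str]:
    """Return the resource paths that the lookup function cannot locate."""
    missing = []
    for path in paths:
        try:
            find(path)
        except LookupError:
            # The lookup signals an absent resource with LookupError.
            missing.append(path)
    return missing
```

Calling `missing_resources()` with no arguments uses NLTK's own lookup. The `find` parameter exists only so the check can be exercised without NLTK installed, for example in a unit test.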
Usage
Sentence parsing
from swarmauri_parser_textblob import TextBlobSentenceParser
parser = TextBlobSentenceParser()
documents = parser.parse("One more large chapula please. It should be extra spicy!")
for document in documents:
    print(document.content)
Noun phrase extraction
from swarmauri_parser_textblob import TextBlobNounParser
parser = TextBlobNounParser()
documents = parser.parse("One more large chapula please.")
print(documents[0].content)
print(documents[0].metadata["noun_phrases"])
Examples
Prepare sentence-level documents
from swarmauri_parser_textblob import TextBlobSentenceParser
parser = TextBlobSentenceParser()
sentences = parser.parse(
    "Swarmauri coordinates tools. It also routes data through composable components."
)
for sentence in sentences:
    print(sentence.metadata["parser"], sentence.content)
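Sentence-level documents are often regrouped into larger chunks before embedding or retrieval. The sketch below shows one greedy way to do that on plain strings, such as the `content` of each sentence document. The function name `group_sentences` and the `max_chars` parameter are hypothetical and not provided by this package.

```python
from typing import Iterable, List


def group_sentences(sentences: Iterable[str], max_chars: int = 200) -> List[str]:
    """Greedily pack consecutive sentences into chunks of at most max_chars.

    A single sentence longer than max_chars becomes its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Hypothetical input: sentence strings as a sentence parser might yield them.
chunks = group_sentences(
    ["Swarmauri coordinates tools.", "It also routes data.", "Parsers prepare text."],
    max_chars=50,
)
print(chunks)
```

With parser output, the input could be built as `[s.content for s in sentences]`. Character counts are a rough proxy here; a token-based limit may suit embedding models better.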
Extract noun phrases for downstream tagging
from swarmauri_parser_textblob import TextBlobNounParser
parser = TextBlobNounParser()
docs = parser.parse("The Swarmauri agent indexed a customer support knowledge base.")
print(docs[0].metadata["noun_phrases"])
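Once noun phrases are available in metadata, one simple way to turn them into tags is to count them across documents. The following is a sketch under assumptions: `count_noun_phrases` is a hypothetical helper, and the sample objects are stand-ins that only mimic the `metadata["noun_phrases"]` shape described above, with made-up phrases rather than real parser output.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Iterable


def count_noun_phrases(documents: Iterable) -> Counter:
    """Count noun phrases across documents, ignoring case and outer whitespace."""
    counts: Counter = Counter()
    for document in documents:
        for phrase in document.metadata.get("noun_phrases", []):
            normalized = phrase.strip().lower()
            if normalized:
                counts[normalized] += 1
    return counts


# Stand-in objects shaped like parser output (hypothetical data).
sample_docs = [
    SimpleNamespace(
        content="first document",
        metadata={"noun_phrases": ["Knowledge Base", "swarmauri agent"]},
    ),
    SimpleNamespace(
        content="second document",
        metadata={"noun_phrases": ["knowledge base"]},
    ),
]
top_tags = count_noun_phrases(sample_docs).most_common(1)
print(top_tags)  # [('knowledge base', 2)]
```

Normalizing case before counting keeps variants of the same phrase together, which tends to matter because noun-phrase extraction is heuristic.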
Related Packages
- swarmauri_parser_entityrecognition
- swarmauri_tool_entityrecognition
- swarmauri_tool_sentimentanalysis
- swarmauri_parser_bertembedding
Best Practices
- Pre-download NLTK corpora in CI, containers, or production images to avoid runtime setup costs.
- Use these parsers for lightweight English NLP tasks; domain-specific or multilingual corpora may require a different component.
- Combine sentence parsing with embeddings or vector stores when building retrieval-oriented pipelines.
- Treat noun phrase extraction as heuristic enrichment rather than a strict ontology or entity-linking system.
License
This project is licensed under the Apache-2.0 License.
File details
Details for the file swarmauri_parser_textblob-0.11.0.dev1.tar.gz.
File metadata
- Download URL: swarmauri_parser_textblob-0.11.0.dev1.tar.gz
- Upload date:
- Size: 8.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.26
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 010652e029c5e3f92ffb08e4f383b508788497b7f80014c258aa9db33cddb6f0 |
| MD5 | 1beed277fda2e04e056e91c90c3d5854 |
| BLAKE2b-256 | 63e9508c464e9efd6a35103d8f6ef24e7e83d123ac6f2634aae733c0842c00b4 |
File details
Details for the file swarmauri_parser_textblob-0.11.0.dev1-py3-none-any.whl.
File metadata
- Download URL: swarmauri_parser_textblob-0.11.0.dev1-py3-none-any.whl
- Upload date:
- Size: 10.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.26
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 232230fd68184b1cf0581a74c102482976c1ed490d01a68ef9130e4c0477e962 |
| MD5 | dd8f91b2ef235cc1225481aea608d8a8 |
| BLAKE2b-256 | 69e44ccb0779b47cc8f72d65c9707200c16c302bb60e22aaee225289ee9570e9 |