Lightweight sentence and noun-phrase parsing package for Swarmauri built on TextBlob and NLTK.

Swarmauri Parser TextBlob

swarmauri_parser_textblob provides two Swarmauri text parsing components built on TextBlob: TextBlobSentenceParser for sentence segmentation and TextBlobNounParser for noun-phrase extraction. It is designed for lightweight NLP preprocessing before chunking, retrieval, classification, or agent workflows.

Why Use Swarmauri Parser TextBlob

  • Split long passages into sentence-level documents for downstream processing.
  • Extract noun phrases without introducing a larger transformer stack.
  • Keep lightweight linguistic preprocessing aligned with the Swarmauri parser interface.
  • Use simple NLP enrichment before embeddings, retrieval, or task routing.

FAQ

What parser classes are included?
TextBlobSentenceParser and TextBlobNounParser.

What does the sentence parser return?
A Document per detected sentence, with metadata indicating the parser.

What does the noun parser return?
A single Document containing the original text and a noun_phrases list in metadata.

Does it require NLTK resources?
Yes. The package downloads required NLTK corpora during initialization unless they are already present.

Features

  • Sentence segmentation through TextBlobSentenceParser.
  • Noun phrase extraction through TextBlobNounParser.
  • Fits Swarmauri ingestion and preprocessing workflows using parser-style components.
  • Useful for lightweight English NLP pipelines where a smaller dependency stack is preferred.
  • Supports Python 3.10, 3.11, 3.12, 3.13, and 3.14.

Installation

With uv:

uv add swarmauri_parser_textblob

With pip:

pip install swarmauri_parser_textblob

Optional setup, to pre-download the TextBlob corpora instead of fetching them at first initialization:

python -m textblob.download_corpora
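
For build scripts that prefer Python over a shell step, the optional setup command could be wrapped roughly as follows. This is a minimal sketch: `corpora_command` and `download_corpora` are hypothetical helper names, not part of the package, and only the `python -m textblob.download_corpora` invocation comes from the instructions here.

```python
import subprocess
import sys


def corpora_command(python_executable: str = sys.executable) -> list[str]:
    """Build the argv for the TextBlob corpora download (hypothetical helper)."""
    return [python_executable, "-m", "textblob.download_corpora"]


def download_corpora() -> None:
    """Run the download once; raises CalledProcessError if the command fails."""
    subprocess.run(corpora_command(), check=True)
```

Calling `download_corpora()` from an image build or CI setup step keeps the download out of the first parser initialization at runtime.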

Usage

Sentence parsing

from swarmauri_parser_textblob import TextBlobSentenceParser

parser = TextBlobSentenceParser()
# parse() returns one Document per detected sentence
documents = parser.parse("One more large chapula please. It should be extra spicy!")

for document in documents:
    print(document.content)

Noun phrase extraction

from swarmauri_parser_textblob import TextBlobNounParser

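parser = TextBlobNounParser()
# parse() returns a single Document that keeps the original text
documents = parser.parse("One more large chapula please.")

print(documents[0].content)  # original text
print(documents[0].metadata["noun_phrases"])  # extracted noun phrases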

Examples

Prepare sentence-level documents

from swarmauri_parser_textblob import TextBlobSentenceParser

parser = TextBlobSentenceParser()
sentences = parser.parse(
    "Swarmauri coordinates tools. It also routes data through composable components."
)

for sentence in sentences:
    print(sentence.metadata["parser"], sentence.content)
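
Sentence-level documents are often regrouped into size-bounded chunks before embedding. The sketch below assumes only that each parsed document exposes a `content` string, as in the examples above. `group_sentences` and its `max_chars` budget are hypothetical and not part of the package, and `SimpleNamespace` objects stand in for parser output so the snippet runs without TextBlob installed.

```python
from types import SimpleNamespace
from typing import Iterable


def group_sentences(documents: Iterable, max_chars: int = 200) -> list[str]:
    """Greedily pack sentence contents into chunks of at most max_chars.

    A single sentence longer than max_chars becomes its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for document in documents:
        sentence = document.content.strip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Stand-ins for the documents returned by TextBlobSentenceParser.parse(...)
sentences = [
    SimpleNamespace(content="Swarmauri coordinates tools."),
    SimpleNamespace(content="It also routes data through composable components."),
]
chunks = group_sentences(sentences, max_chars=40)
```

With a 40-character budget the two sentences stay in separate chunks; a larger budget merges them into one.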

Extract noun phrases for downstream tagging

from swarmauri_parser_textblob import TextBlobNounParser

parser = TextBlobNounParser()
docs = parser.parse("The Swarmauri agent indexed a customer support knowledge base.")

print(docs[0].metadata["noun_phrases"])
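
One way to turn the extracted phrases into tags is to normalize and count them across documents. This sketch assumes `metadata["noun_phrases"]` is a list of strings, as described in the FAQ. `tag_counts` is a hypothetical helper, and the phrases below are illustrative stand-ins rather than actual parser output.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Iterable


def tag_counts(documents: Iterable) -> Counter:
    """Count lowercased, whitespace-normalized noun phrases across documents."""
    counts: Counter = Counter()
    for document in documents:
        for phrase in document.metadata.get("noun_phrases", []):
            tag = " ".join(phrase.lower().split())
            if tag:
                counts[tag] += 1
    return counts


# Stand-ins for documents returned by TextBlobNounParser.parse(...)
tagged_docs = [
    SimpleNamespace(
        content="first passage",
        metadata={"noun_phrases": ["Knowledge Base", "agent"]},
    ),
    SimpleNamespace(
        content="second passage",
        metadata={"noun_phrases": ["knowledge  base"]},
    ),
]
counts = tag_counts(tagged_docs)
```

Normalizing case and whitespace before counting keeps near-duplicate phrases from producing separate tags.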

Related Packages

Swarmauri Foundations

More Documentation

Best Practices

  • Pre-download NLTK corpora in CI, containers, or production images to avoid runtime setup costs.
  • Use these parsers for lightweight English NLP tasks; domain-specific or multilingual corpora may require a different component.
  • Combine sentence parsing with embeddings or vector stores when building retrieval-oriented pipelines.
  • Treat noun phrase extraction as heuristic enrichment rather than a strict ontology or entity-linking system.
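
The retrieval-oriented practice above can be sketched without any embedding service. The example below swaps in a toy bag-of-words cosine similarity as a stand-in for a real embedding model and vector store, and it ranks plain sentence strings such as the `content` values from the sentence parser. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(query: str, sentences: list[str]) -> list[tuple[float, str]]:
    """Return (score, sentence) pairs sorted by descending similarity."""
    query_vector = bag_of_words(query)
    scored = [(cosine(query_vector, bag_of_words(s)), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


sentences = [
    "Swarmauri coordinates tools.",
    "It also routes data through composable components.",
]
ranked = rank_sentences("Which components route data?", sentences)
```

In a real pipeline the `bag_of_words` step would be replaced by an embedding model and the sorted list by a vector store query; the sentence-level granularity is the part that carries over.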

License

This project is licensed under the Apache-2.0 License.
