spaCy-based named-entity recognition parser for Swarmauri with structured entity document output.
Swarmauri Parser Entity Recognition
swarmauri_parser_entityrecognition is the Swarmauri named-entity recognition
parser built on spaCy. It extracts named entities such as
people, organizations, and geopolitical entities from unstructured text and
returns Swarmauri Document objects containing the entity text and entity
metadata.
Why Use Swarmauri Parser Entity Recognition
- Turn raw text into structured entity objects inside a Swarmauri parser workflow.
- Preserve entity labels and entity ids in a predictable Document shape for downstream enrichment, filtering, or indexing.
- Use spaCy's English NER pipeline when available, while still retaining a minimal fallback path for constrained environments.
- Fit entity extraction into larger ingestion, retrieval, anonymization, and knowledge-graph workflows.
FAQ
What does this parser return?
A list of Swarmauri Document objects, usually one per detected entity.
Which metadata fields are included?
entity_type, entity_id, and text.
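As a minimal sketch of how those metadata fields might be consumed downstream, the snippet below groups entity texts by entity_type. EntityDoc is a hypothetical stand-in that only mirrors the content and metadata attributes used elsewhere in this README, and the sample labels, ids, and text values are illustrative placeholders rather than real parser output.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EntityDoc:
    """Hypothetical stand-in for a Swarmauri Document (content + metadata)."""

    content: str
    metadata: dict = field(default_factory=dict)


def group_by_entity_type(docs):
    """Group entity texts by their entity_type label, preserving input order."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc.metadata["entity_type"]].append(doc.content)
    return dict(grouped)


# Illustrative data only; real labels and ids come from the parser.
sample = [
    EntityDoc("Tim Cook", {"entity_type": "PERSON", "entity_id": 0, "text": "Tim Cook"}),
    EntityDoc("Apple Inc.", {"entity_type": "ORG", "entity_id": 1, "text": "Apple Inc."}),
    EntityDoc("New York City", {"entity_type": "GPE", "entity_id": 2, "text": "New York City"}),
]

print(group_by_entity_type(sample))
```

With real parser output, the list returned by parser.parse(...) could be passed to group_by_entity_type in place of sample.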
What spaCy model does it use?
It tries to load en_core_web_sm.
What happens if the model is unavailable?
The parser attempts to download the model. If that fails, it falls back to a blank English pipeline plus a small regex-based extractor, which serves as a best-effort compatibility path.
Features
- Named-entity extraction via spaCy's English NER model.
- Automatic attempt to download en_core_web_sm if the model is missing.
- Best-effort fallback behavior for environments where the full model cannot be loaded.
- Returns Swarmauri Document objects with entity label metadata.
- Supports Python 3.10, 3.11, 3.12, 3.13, and 3.14.
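To illustrate why a regex fallback is only best-effort, here is a minimal sketch that treats runs of capitalized words as candidate entities. The pattern and the candidate_entities helper are hypothetical; the parser's own fallback expression is not documented here and may differ.

```python
import re

# Hypothetical pattern: one or more consecutive capitalized words.
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def candidate_entities(text):
    """Return capitalized word runs as rough entity candidates (no labels)."""
    return CAPITALIZED_RUN.findall(text)


print(candidate_entities("Tim Cook announced new products in New York City for Apple Inc."))
# Sentence-initial words are picked up too, which is one source of false positives.
print(candidate_entities("The parser ran in Paris."))
```

A pattern like this cannot assign entity types, drops trailing punctuation such as the period in "Inc.", and flags ordinary sentence-initial words, which is why a trained NER model is preferable whenever it can be loaded.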
Installation
uv add swarmauri_parser_entityrecognition
pip install swarmauri_parser_entityrecognition
Optional model bootstrap:
python -m spacy download en_core_web_sm
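To confirm that the bootstrap worked, one option is to check whether the model package is importable, since spaCy models installed through the download command are regular Python packages. The model_installed helper below is a hypothetical convenience, not part of this package.

```python
import importlib.util


def model_installed(package="en_core_web_sm"):
    """Return True if the given top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


print("en_core_web_sm installed:", model_installed())
```

A check like this can run at startup or in CI to fail fast instead of triggering a download at parse time.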
Usage
from swarmauri_parser_entityrecognition import EntityRecognitionParser
text = "Barack Obama was born in Hawaii and served as President of the United States."
parser = EntityRecognitionParser()
entities = parser.parse(text)
# Print each entity's text alongside its label.
for entity in entities:
    print(entity.content, entity.metadata["entity_type"])
Examples
Parse organizations, places, and people
from swarmauri_parser_entityrecognition import EntityRecognitionParser
parser = EntityRecognitionParser()
docs = parser.parse(
    "Apple Inc. is planning to open a new office in New York City, according to CEO Tim Cook."
)
# Each returned document carries the entity text and its metadata.
for doc in docs:
    print(doc.content, doc.metadata)
Handle non-string input
from swarmauri_parser_entityrecognition import EntityRecognitionParser
parser = EntityRecognitionParser()
print(parser.parse(42))
Inspect fallback-compatible metadata
from swarmauri_parser_entityrecognition import EntityRecognitionParser
parser = EntityRecognitionParser()
entities = parser.parse("Tim Cook announced new products in New York City for Apple Inc.")
print([entity.metadata for entity in entities])
Related Packages
- swarmauri_tool_entityrecognition
- swarmauri_tool_sentimentanalysis
- swarmauri_parser_textblob
- swarmauri_parser_bertembedding
Swarmauri Foundations
More Documentation
Best Practices
- Preinstall en_core_web_sm in CI and production environments to avoid runtime downloads.
- Treat the regex fallback as a compatibility path, not as production-quality entity recognition.
- Strip markup or noisy boilerplate before parsing to improve entity quality.
- Persist entity spans or link them to downstream IDs if you need durable knowledge-graph or indexing workflows.
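For the markup-stripping practice above, a minimal sketch using only the standard library might look like this. The strip_markup helper and the choice of block-level tags are assumptions made for illustration; a dedicated HTML-cleaning library may be a better fit for messy real-world pages.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "tr"}
SKIPPED_TAGS = {"script", "style"}


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring script and style contents."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            # Keep a separator so adjacent blocks do not run together.
            self._chunks.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse all whitespace runs into single spaces.
        return " ".join("".join(self._chunks).split())


def strip_markup(html):
    """Return the visible text of an HTML fragment as one normalized string."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.text()


raw = "<p>Tim Cook spoke in <b>New York City</b>.</p><p>Apple Inc. hosted.</p><script>var x = 1;</script>"
print(strip_markup(raw))
```

The cleaned string can then be passed to parser.parse(...) so that tags and script contents are not mistaken for entity text.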
License
This project is licensed under the Apache-2.0 License.
File details
Details for the file swarmauri_parser_entityrecognition-0.11.0.dev1.tar.gz.
File metadata
- Download URL: swarmauri_parser_entityrecognition-0.11.0.dev1.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9450717fec8b421c2db02ece349dfa4d27fc66deb41e86abdce856c7f56c50ec |
| MD5 | 8160bf9f40294e2b9baef63d6d9da4ba |
| BLAKE2b-256 | 2ecc49cfa519402c6c8c4dae83cf1723a45325e4dad8d425ee1dee4c3b27d972 |
File details
Details for the file swarmauri_parser_entityrecognition-0.11.0.dev1-py3-none-any.whl.
File metadata
- Download URL: swarmauri_parser_entityrecognition-0.11.0.dev1-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a2dd56f80b33775a5e5956ab59862d94d5252433ede2f1fbe81804e9ac065fe |
| MD5 | 4208394fbe1d54057fcb97ca69d20d5c |
| BLAKE2b-256 | fe0aa9e59b794055d57c95258af5f7eefd75a110b29b43b2faab489bd22f7eaf |