llm-ner

Schema-driven Named Entity Recognition powered by local LLMs via Ollama.

llm-ner lets you define arbitrary extraction schemas as plain Pydantic models and extract structured entities from free text – without training a custom model. Every extracted value is paired with a short verbatim evidence quote from the source, making results auditable and explainable.

Features

Schema-first – define what to extract with pure Python + Pydantic; the library builds the LLM prompt automatically.
Evidence tracking – every field carries an evidence quote that must appear verbatim in the source text.
Smart retries – automatically re-runs extraction and merges results when fields are missing.
Tolerant parsing – invalid enum values, malformed numbers, bad dates, etc. become None instead of crashing.
Fully typed – ships with a py.typed marker and complete type annotations.
No cloud required – runs entirely on a local Ollama instance.

Installation

With `uv` (recommended)

# Install uv if you don't have it
pip install uv

# Clone the repository
git clone https://github.com/ManuelMunozBer/llm-ner.git
cd llm-ner

# Create a virtual environment and install the package
uv venv
uv pip install -e .

# With test dependencies
uv pip install -e ".[test]"

# With all development dependencies
uv pip install -e ".[dev]"

With pip

pip install llm-ner

Prerequisites

A running Ollama instance with your chosen model:

ollama serve
ollama pull qwen2.5:7b-instruct   # or any instruction-following model

Quick Start

from llmner import NERBaseModel, NERExtractor, SchemaRegistry

# 1. Create a registry – one per schema
registry = SchemaRegistry()

# 2. Define typed field annotations
GenderType = registry.categorical(
    "gender",
    options=["male", "female"],
    instruction="Extract the subject's gender.",
)
AgeType = registry.int_range(
    "age",
    "Extract the subject's age as an integer or range (e.g. '25-30').",
)
NameType = registry.generic(
    "name",
    "Extract the subject's full name.",
)

# 3. Define your Pydantic extraction schema
class PersonSchema(NERBaseModel):
    name:   NameType   | None = None  # type: ignore[valid-type]
    gender: GenderType | None = None  # type: ignore[valid-type]
    age:    AgeType    | None = None  # type: ignore[valid-type]

# 4. Create the extractor
extractor = NERExtractor(
    schema_class=PersonSchema,
    system_role="You are an expert information extractor.",
    system_task=(
        "Extract the requested fields from the text. "
        "Return null for any field not mentioned."
    ),
    rules_registry=registry.rules,
)

# 5. Extract
result = extractor.extract_one(
    "Detective John Smith, 42, was assigned to the case."
)

print(result.name.value)     # "John Smith"
print(result.name.evidence)  # "John Smith"
print(result.age.value)      # "42"
print(result.gender.value)   # "male"

Concepts

`SchemaRegistry`

A SchemaRegistry instance is used to create self-documenting Pydantic field types. Each factory call registers a rule that will be injected into the LLM prompt.

registry = SchemaRegistry()

# Categorical field – only values from the allowed list are accepted
StatusType = registry.categorical(
    "status",
    options={"active": "currently employed", "inactive": "no longer employed"},
    instruction="Extract the person's employment status.",
)

# Integer / range field
SalaryType = registry.int_range(
    "salary",
    "Extract the annual salary in thousands of euros.",
)

# Free-text field
AddressType = registry.generic(
    "address",
    "Extract the full postal address.",
)

# Datetime field – normalised to YYYY-MM-DD HH:MM:SS
DateType = registry.datetime_format(
    "date",
    "Extract the contract signing date.",
)
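
To make the behaviour of these factories more concrete, the following is a rough, library-independent sketch of the kind of normalisation a categorical field and an integer/range field might apply. The helper names normalise_categorical and normalise_int_range are hypothetical, and the exact rules inside llm-ner may well differ. Only the general behaviour is taken from this README: lowercasing and option matching, unit stripping, metre-to-centimetre conversion, MIN-MAX ranges, and invalid input becoming None.

```python
from __future__ import annotations

import re


def normalise_categorical(raw: str | None, options: list[str]) -> str | None:
    """Lowercase the raw value and keep it only if it is an allowed option."""
    if raw is None:
        return None
    candidate = raw.strip().lower()
    return candidate if candidate in options else None


def normalise_int_range(raw: str | None) -> str | None:
    """Parse an integer or a MIN-MAX range; return None for anything else."""
    if raw is None:
        return None
    text = raw.strip().lower()
    # Assumed rule: a number followed by "m" is a length in metres -> centimetres.
    metres = re.fullmatch(r"(\d+(?:\.\d+)?)\s*m", text)
    if metres:
        return str(round(float(metres.group(1)) * 100))
    # MIN-MAX range such as "25-30", optionally followed by a unit word.
    rng = re.fullmatch(r"(\d+)\s*-\s*(\d+)(?:\s*[a-z]+)?", text)
    if rng:
        return f"{int(rng.group(1))}-{int(rng.group(2))}"
    # Single integer, optionally followed by a unit word such as "years".
    single = re.fullmatch(r"(\d+)(?:\s*[a-z]+)?", text)
    if single:
        return str(int(single.group(1)))
    return None


print(normalise_categorical("Male", ["male", "female"]))
print(normalise_int_range("42 years"))
print(normalise_int_range("forty"))
```

Returning None rather than raising is what lets a single bad field be dropped without losing the rest of the extraction.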

`EvidenceField`

Every factory produces Annotated[EvidenceField, ...] types. An EvidenceField has two attributes:

Attribute	Type	Description
`value`	`str | None`	The normalised extracted value.
`evidence`	`str | None`	Verbatim quote from the source text that justifies `value`.

field: EvidenceField = result.name
print(field.value)     # "John Smith"
print(field.evidence)  # "John Smith" (the value itself, since it appears verbatim in the text)

Evidence is validated: if the quote does not appear in the source text (compared case-insensitively, with whitespace normalised), it is set to None.

Automatic evidence resolution: after validation, evidence is resolved using a priority chain:

1. Full value match: if the extracted value appears verbatim in the source text (≥ 3 characters), it becomes the evidence, even if the model provided a different quote. The canonical value is the most precise anchor.
   1a. Full value match (descored): if the value contains underscores (e.g. "physical_assault") and the underscore-to-space form ("physical assault") appears in the text, that form is returned as evidence.
   1b. Full raw value match: if the normalised value is not found but the pre-transformation form (e.g. "1.75" before metre→cm conversion) appears in the text, the raw value is returned as evidence.
   1c. Full raw value match (descored): same as 1a but applied to the raw (pre-transformation) value.
2. Model evidence: if neither value nor raw value is found but the model provided a valid evidence quote, it is kept unchanged.
3. Partial prefix fallback: when no model evidence exists, the longest token-prefix of the value that appears in the text is used (minimum 3 characters). Both the original and descored (underscore→space) forms are tried. Single-token prefixes are tried, so e.g. "2024-01-01" from a datetime "2024-01-01 08:13:00" can serve as evidence.
   3b. Partial raw value prefix: same as rule 3 but applied to the raw (pre-transformation) value (and its descored form).
4. None: no usable evidence could be determined.
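
The priority chain above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, matching is assumed to be case-insensitive and whitespace-normalised, and tokens are assumed to be whitespace-separated.

```python
from __future__ import annotations


def _norm(s: str) -> str:
    """Lowercase and collapse whitespace so comparisons are forgiving."""
    return " ".join(s.lower().split())


def _in_text(fragment: str | None, text: str, min_len: int = 3) -> bool:
    """True if `fragment` is long enough and occurs in `text`."""
    return bool(fragment) and len(fragment) >= min_len and _norm(fragment) in _norm(text)


def _forms(value: str | None) -> list[str]:
    """Return the value and, if different, its underscore-to-space form."""
    if not value:
        return []
    descored = value.replace("_", " ")
    return [value] if descored == value else [value, descored]


def _longest_prefix(value: str | None, text: str) -> str | None:
    """Longest proper token-prefix of `value` (or its descored form) found in `text`."""
    for form in _forms(value):
        tokens = form.split()
        for end in range(len(tokens) - 1, 0, -1):
            prefix = " ".join(tokens[:end])
            if _in_text(prefix, text):
                return prefix
    return None


def resolve_evidence(value, model_evidence, text, raw_value=None):
    # Rules 1, 1a, 1b, 1c: full match of the value, then of the raw value,
    # each followed by its descored form.
    for candidate in _forms(value) + _forms(raw_value):
        if _in_text(candidate, text):
            return candidate
    # Rule 2: keep the model's quote if it really occurs in the text.
    if _in_text(model_evidence, text, min_len=1):
        return model_evidence
    # Rules 3 and 3b: longest token-prefix of the value, then of the raw value.
    # Rule 4: None when nothing matched.
    return _longest_prefix(value, text) or _longest_prefix(raw_value, text)


text = "On 2024-01-01 the victim reported a physical assault by John Smith, 1.75 m tall."
print(resolve_evidence("physical_assault", None, text))
print(resolve_evidence("175", None, text, raw_value="1.75"))
print(resolve_evidence("2024-01-01 08:13:00", None, text))
```

Trying the value before the model's own quote reflects the stated design choice that the canonical value is the most precise anchor.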

`NERBaseModel`

Your extraction schemas must subclass NERBaseModel. It adds four reflection-based utilities:

Method	Description
`prompt_schema()`	Generate the JSON skeleton injected into the LLM prompt.
`has_missing_fields()`	Return `True` if any nested `EvidenceField.value` is `None`.
`merge(e1, e2, *, input_text="")`	Fill `None` values in `e1` with values from `e2`; resolve conflicts using the source text when provided.
`safe_parse(data)`	Tolerantly parse LLM output, isolating per-field errors.
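
As a rough illustration of what has_missing_fields() and merge() are described as doing, the sketch below works on plain dictionaries with a small stand-in class instead of real NERBaseModel instances. The Field class and both functions are hypothetical. Conflict resolution against the source text is deliberately left out, and appending surplus list items is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Field:
    """Stand-in for EvidenceField: a value plus its supporting quote."""
    value: str | None = None
    evidence: str | None = None


def has_missing_fields(record: dict) -> bool:
    """True if any (possibly nested) field is absent or has value None."""
    for item in record.values():
        if item is None:
            return True
        if isinstance(item, Field) and item.value is None:
            return True
        if isinstance(item, dict) and has_missing_fields(item):
            return True
        if isinstance(item, list) and any(has_missing_fields(x) for x in item):
            return True
    return False


def merge(first: dict, second: dict) -> dict:
    """Keep values from `first`; fill its gaps from `second`."""
    merged = {}
    for key, a in first.items():
        b = second.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            merged[key] = merge(a, b)
        elif isinstance(a, list) and isinstance(b, list):
            # Lists are merged by index position; surplus items are appended.
            paired = [merge(x, y) for x, y in zip(a, b)]
            longer = a if len(a) >= len(b) else b
            merged[key] = paired + longer[len(paired):]
        elif a is None or (isinstance(a, Field) and a.value is None):
            merged[key] = b if b is not None else a
        else:
            merged[key] = a  # the first extraction takes priority
    return merged


first = {"name": Field("John Smith", "John Smith"), "age": Field()}
second = {"name": Field("J. Smith", "J. Smith"), "age": Field("42", "42")}
combined = merge(first, second)
print(combined["name"].value, combined["age"].value)
```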

`NERExtractor`

The main orchestrator. Key parameters:

Parameter	Default	Description
`schema_class`	–	Your `NERBaseModel` subclass.
`system_role`	–	LLM persona / expertise description.
`system_task`	–	Extraction task and constraints.
`rules_registry`	–	`registry.rules` from your `SchemaRegistry`.
`llm_model`	`"qwen2.5:7b-instruct"`	Ollama model tag.
`llm_base_url`	`"http://localhost:11434"`	Ollama server URL.
`llm_temperature`	`1.0`	Sampling temperature (`0.0` = deterministic).
`max_retries`	`1`	Extra calls on incomplete extraction.

Nested Schemas

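# Assumes `registry` and `NameType` from the Quick Start are in scope.
class Address(NERBaseModel):
    street: registry.generic("street", "Street name and number.") | None = None  # type: ignore[valid-type]
    city:   registry.generic("city",   "City name.")              | None = None  # type: ignore[valid-type]

# Illustrative sub-model so the example is self-contained; its field is hypothetical.
class SuspectSchema(NERBaseModel):
    name: NameType | None = None  # type: ignore[valid-type]

class PersonSchema(NERBaseModel):
    name:    NameType    | None = None   # type: ignore[valid-type]
    address: Address     | None = None   # nested sub-model
    suspects: list[SuspectSchema] = []   # list of sub-models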

prompt_schema() and safe_parse() handle arbitrary nesting and lists of sub-models automatically.
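
To give a feel for what a nested skeleton could look like, here is a small stand-alone sketch that expands a nested field specification into JSON. The leaf shape of value plus evidence is an assumption based on the two EvidenceField attributes; the skeleton function and the spec format are hypothetical, and the real output of prompt_schema() may differ.

```python
import json

# Assumed leaf shape: one slot for the value and one for its evidence quote.
LEAF = {"value": None, "evidence": None}


def skeleton(spec):
    """Recursively expand a spec: None = leaf, dict = sub-model, list = list of sub-models."""
    if spec is None:
        return dict(LEAF)
    if isinstance(spec, dict):
        return {name: skeleton(child) for name, child in spec.items()}
    if isinstance(spec, list):
        return [skeleton(spec[0])]
    raise TypeError(f"unsupported spec node: {spec!r}")


# Mirrors the shape of PersonSchema above: a leaf, a sub-model, and a list of sub-models.
person_spec = {
    "name": None,
    "address": {"street": None, "city": None},
    "suspects": [{"name": None}],
}

print(json.dumps(skeleton(person_spec), indent=2))
```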

Advanced Usage

Custom LLM client

Implement BaseLLMClient to use a different inference backend:

from llmner.llm_client import BaseLLMClient

class MyClient(BaseLLMClient):
    def generate(self, prompt: str) -> dict | None:
        # Call your backend here
        ...

extractor = NERExtractor(
    ...,
    llm_client=MyClient(),
)
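
One use for a custom client is testing without a running Ollama server. The sketch below shows a client that replays pre-recorded responses. To keep it runnable without the package installed, it declares a local stand-in for BaseLLMClient with the same generate signature shown above; with llm-ner installed you would import the real class from llmner.llm_client instead. CannedClient is a hypothetical name.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class BaseLLMClient(ABC):
    """Local stand-in mirroring the generate() signature shown above."""

    @abstractmethod
    def generate(self, prompt: str) -> dict | None:
        ...


class CannedClient(BaseLLMClient):
    """Replays pre-recorded responses and records the prompts it received."""

    def __init__(self, responses: list[dict | None]) -> None:
        self._responses = list(responses)
        self.prompts: list[str] = []

    def generate(self, prompt: str) -> dict | None:
        self.prompts.append(prompt)
        # None signals a failed call once the recorded responses run out.
        return self._responses.pop(0) if self._responses else None


client = CannedClient([{"name": {"value": "John Smith", "evidence": "John Smith"}}])
print(client.generate("first prompt"))
print(client.generate("second prompt"))
```

Recording the prompts makes it possible to assert on what was sent to the model as well as on what came back.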

Custom prompt template

from llmner import DEFAULT_PROMPT_TEMPLATE

MY_TEMPLATE = """\
[INST] {system_role}

{system_task}

Rules:
{rules_text}

Schema:
{schema_json}

Text: {input_text} [/INST]
"""

extractor = NERExtractor(
    ...,
    prompt_template=MY_TEMPLATE,
)
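
Before handing a custom template to the extractor, it can help to check that it contains the expected placeholders and renders cleanly. The sketch below assumes the template is filled with Python's str.format, which this README does not confirm; the sample rules text and schema JSON are made up for illustration.

```python
from __future__ import annotations

import json
import string

MY_TEMPLATE = """\
[INST] {system_role}

{system_task}

Rules:
{rules_text}

Schema:
{schema_json}

Text: {input_text} [/INST]
"""

# Placeholder names taken from the template above.
EXPECTED = {"system_role", "system_task", "rules_text", "schema_json", "input_text"}


def placeholders(template: str) -> set[str]:
    """Collect the {name} fields that str.format would try to fill."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


missing = EXPECTED - placeholders(MY_TEMPLATE)
if missing:
    raise ValueError(f"template is missing placeholders: {sorted(missing)}")

prompt = MY_TEMPLATE.format(
    system_role="You are an expert information extractor.",
    system_task="Extract the requested fields from the text.",
    rules_text="- name: Extract the subject's full name.",  # illustrative format
    schema_json=json.dumps({"name": {"value": None, "evidence": None}}, indent=2),
    input_text="Detective John Smith, 42, was assigned to the case.",
)
print(prompt)
```

If str.format is indeed what fills the template, any literal braces in the template text itself would need to be doubled ({{ and }}).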

Fallback parsers

When the LLM returns a null value but provides a non-null evidence quote, a fallback parser can attempt to recover the value from the evidence string. Every factory method accepts an optional fallback_parser callback:

import re

# Recover age from evidence like "aged 34"
AgeType = registry.int_range(
    "age",
    "Extract the subject's age.",
    fallback_parser=lambda ev: m.group() if (m := re.search(r"\d+", ev)) else None,
)

# Recover gender from contextual clues in evidence
GenderType = registry.categorical(
    "gender",
    options=["male", "female"],
    instruction="Extract the subject's gender.",
    # Match "man" as a whole word so that "woman" is not read as "male"
    fallback_parser=lambda ev: "male" if re.search(r"\bman\b", ev.lower()) else None,
)

The callback signature is (evidence: str) -> str | None. When it returns a non-None value, that value is fed through the factory's normal validation pipeline (option matching, range parsing, date normalisation, etc.).
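
The point at which a fallback parser is consulted can be sketched as below. The helper name try_fallback is hypothetical, and the subsequent validation pipeline is not modelled; the sketch only shows the documented condition (null value, non-null evidence) and the callback signature.

```python
from __future__ import annotations

import re
from typing import Callable, Optional


def try_fallback(
    raw: Optional[str],
    evidence: Optional[str],
    fallback_parser: Optional[Callable[[str], Optional[str]]],
) -> Optional[str]:
    """Use the parser only when the value is null but evidence exists."""
    if raw is not None or not evidence or fallback_parser is None:
        return raw
    return fallback_parser(evidence)


def age_from_evidence(ev: str) -> Optional[str]:
    """Recover the first run of digits, e.g. "34" from "aged 34"."""
    match = re.search(r"\d+", ev)
    return match.group() if match else None


def gender_from_evidence(ev: str) -> Optional[str]:
    """Whole-word match, so "woman" is not mistaken for "man"."""
    return "male" if re.search(r"\bman\b", ev.lower()) else None


print(try_fallback(None, "aged 34", age_from_evidence))
print(try_fallback("40", "aged 34", age_from_evidence))
print(try_fallback(None, "a young woman", gender_from_evidence))
```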

Extra datetime formats

datetime_format accepts an extra_formats tuple of additional strptime format strings appended after the built-in ones:

DateType = registry.datetime_format(
    "date",
    "Extract the event date.",
    extra_formats=("%B %d, %Y", "%d %b %Y"),  # "March 15, 2024", "15 Mar 2024"
)
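
The effect of appending extra formats can be illustrated with the standard library alone. In this sketch the built-in format list is an assumption (the README does not enumerate it) and normalise_datetime is a hypothetical helper; month names are parsed according to the current locale, assumed here to be English.

```python
from __future__ import annotations

from datetime import datetime

# Assumed built-in formats; the library's actual list is not given in this README.
BUILTIN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def normalise_datetime(raw: str | None, extra_formats: tuple[str, ...] = ()) -> str | None:
    """Try each format in order and normalise to YYYY-MM-DD HH:MM:SS."""
    if raw is None:
        return None
    for fmt in BUILTIN_FORMATS + tuple(extra_formats):
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # this format did not match; try the next one
        return parsed.strftime("%Y-%m-%d %H:%M:%S")
    return None  # unparseable dates become None instead of raising


extras = ("%B %d, %Y", "%d %b %Y")
print(normalise_datetime("March 15, 2024", extras))
print(normalise_datetime("March 15, 2024"))
```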

Per-field evidence requirement

By default, extracted values are kept even when no supporting evidence can be found in the source text. To enforce grounding on a per-field basis, pass evidence_required=True to any factory method:

# This field will be set to None if no evidence is found in the source text
NameType = registry.generic(
    "name",
    "Full name of the person.",
    evidence_required=True,
)

# This field keeps its value even without evidence (default behaviour)
AgeType = registry.int_range("age", "Age in years.")
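
The difference between the two settings comes down to one final check, sketched below. The function name and tuple return shape are hypothetical; the condition itself follows the description in this README (the value is dropped only when the flag is set, the source text is available, and no evidence was resolved).

```python
from __future__ import annotations


def apply_evidence_required(
    value: str | None,
    evidence: str | None,
    evidence_required: bool,
    input_text: str | None,
) -> tuple[str | None, str | None]:
    """Discard an ungrounded value when the field demands evidence."""
    if evidence_required and input_text and value is not None and evidence is None:
        return None, None
    return value, evidence


text = "Detective John Smith, 42, was assigned to the case."
print(apply_evidence_required("Jane Doe", None, True, text))   # strict field
print(apply_evidence_required("Jane Doe", None, False, text))  # default behaviour
```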

Controlling retries

# Disable retries
result = extractor.extract_one(text, retry_on_null=False)

# Configure at extractor level
extractor = NERExtractor(..., max_retries=3)
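
How retry_on_null and max_retries interact can be sketched with a small loop. Everything here is a simplified stand-in: extract_with_retries, has_gaps and fill_gaps are hypothetical names, records are flat dictionaries, and a failed LLM call is not modelled.

```python
def extract_with_retries(run_once, has_missing, merge, max_retries=1, retry_on_null=True):
    """Run one extraction, then re-run and merge while fields are still missing."""
    result = run_once()
    retries = 0
    while retry_on_null and retries < max_retries and has_missing(result):
        result = merge(result, run_once())  # earlier values take priority
        retries += 1
    return result


def has_gaps(record):
    """True if any field is still None."""
    return any(v is None for v in record.values())


def fill_gaps(first, second):
    """Keep values from `first`; fill None entries from `second`."""
    return {k: v if v is not None else second.get(k) for k, v in first.items()}


# Two canned extraction results standing in for two LLM calls.
canned = iter([
    {"name": "John Smith", "age": None},
    {"name": "J. Smith", "age": "42"},
])
result = extract_with_retries(lambda: next(canned), has_gaps, fill_gaps)
print(result)  # {'name': 'John Smith', 'age': '42'}
```

With max_retries=1 the model is called at most twice per text, so raising the limit trades latency for completeness.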

Extraction Pipeline Flow

Below is the complete data flow from input text to validated output.

 INPUT TEXT
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│  1. PROMPT BUILDING  (PromptBuilder)                    │
│  ─────────────────────────────────────                  │
│  • system_role + system_task                            │
│  • Per-field extraction rules (from SchemaRegistry)     │
│  • JSON schema skeleton (from NERBaseModel.prompt_schema)│
│  • The input text itself                                │
│  ⇒ Assembled into a single prompt string                │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│  2. LLM CALL  (BaseLLMClient / OllamaClient)           │
│  ─────────────────────────────────────                  │
│  • POST prompt to Ollama /api/generate (JSON mode)      │
│  • Parse the raw JSON response into a Python dict       │
│  • Returns None on network/parse errors                 │
└────────────────────────┬────────────────────────────────┘
                         │  raw dict
                         ▼
┌─────────────────────────────────────────────────────────┐
│  3. SAFE PARSE  (NERBaseModel.safe_parse)               │
│  ─────────────────────────────────────                  │
│  Three-phase tolerant validation:                       │
│                                                         │
│  Phase 1 — List items: validate each item in list-of-   │
│  model fields individually. Bad items get per-field     │
│  fallback (good fields kept, bad → None).               │
│                                                         │
│  Phase 2 — Full model: attempt model_validate() with    │
│  context={input_text}. This triggers all BeforeValidator │
│  pipelines (step 4 below). If it succeeds → done.       │
│                                                         │
│  Phase 3 — Field-by-field fallback: validate each       │
│  field in isolation. Fields that fail → None.            │
│  Nested models are validated field-by-field too.        │
└────────────────────────┬────────────────────────────────┘
                         │  for each field (during Phase 2/3)
                         ▼
┌─────────────────────────────────────────────────────────┐
│  4. FIELD VALIDATION  (BeforeValidator in each factory) │
│  ─────────────────────────────────────                  │
│  For every EvidenceField-type field, the validator runs │
│  this pipeline:                                         │
│                                                         │
│  a) _extract_ev(v) — unpack {value, evidence} from      │
│     the raw dict or EvidenceField object                │
│                                                         │
│  b) _coerce_null(raw) — convert "null"/"none"/"" → None │
│                                                         │
│  c) _validate_evidence(evidence, info) — check that the │
│     model's evidence quote exists in the source text    │
│     (case-insensitive, whitespace-normalised).          │
│     Invalid quotes → None.                              │
│                                                         │
│  d) _try_fallback(raw, evidence, fallback_parser) —     │
│     if raw is None but evidence exists, attempt to      │
│     recover a value from the evidence string            │
│                                                         │
│  e) TYPE-SPECIFIC NORMALISATION:                        │
│     • categorical: lowercase, apply replacements,       │
│       match against allowed options list                │
│     • int_range: strip units, convert m→cm, parse       │
│       integers or MIN-MAX ranges                        │
│     • generic: coerce to python_type                    │
│     • datetime_format: parse with strptime, normalise   │
│       to "YYYY-MM-DD HH:MM:SS"                         │
│                                                         │
│  f) _resolve_evidence(value, evidence, info,            │
│     raw_value=raw_str):                                 │
│     Rule 1:  value in text        → value as evidence   │
│     Rule 1a: descored value in text → descored as ev.   │
│     Rule 1b: raw_value in text    → raw_value as ev.    │
│     Rule 1c: descored raw in text → descored as ev.     │
│     Rule 2:  model evidence valid → keep it             │
│     Rule 3:  partial token-prefix of value in text      │
│              (also tries descored form)                  │
│     Rule 3b: partial token-prefix of raw_value          │
│              (also tries descored form)                  │
│     Rule 4:  → None                                     │
│                                                         │
│  g) _apply_evidence_required(ef, evidence_required,     │
│     info):                                              │
│     if evidence_required=True (per-field)               │
│     AND input_text available                            │
│     AND value ≠ None AND evidence = None                │
│     → discard value (set to None)                       │
│                                                         │
│  ⇒ Returns EvidenceField(value=..., evidence=...)       │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│  5. RETRY & MERGE  (NERExtractor.extract_one)           │
│  ─────────────────────────────────────                  │
│  • If has_missing_fields() → True and retry_on_null:    │
│    - Call LLM again (up to max_retries times)           │
│    - Merge results: first.value takes priority;         │
│      None values filled from second extraction          │
│    - Lists merged by index position                     │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
              VALIDATED OUTPUT
         (NERBaseModel instance)
     Fields with evidence_required=True:
     every non-null value has evidence
         from the source text
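
The field-by-field fallback in step 3 can be illustrated without Pydantic. The sketch below is a simplified stand-in for that idea only: tolerant_parse and parse_status are hypothetical, each field has a plain callable as its validator, and a field that fails validation becomes None while the others are kept.

```python
from __future__ import annotations

from typing import Any, Callable


def parse_status(raw: Any) -> str:
    """Accept only the two allowed options, case-insensitively."""
    value = str(raw).strip().lower()
    if value not in {"active", "inactive"}:
        raise ValueError(f"unknown status: {raw!r}")
    return value


def tolerant_parse(
    data: dict[str, Any],
    validators: dict[str, Callable[[Any], Any]],
) -> dict[str, Any]:
    """Validate each field in isolation; a failing field becomes None."""
    parsed: dict[str, Any] = {}
    for name, validate in validators.items():
        raw = data.get(name)
        try:
            parsed[name] = None if raw is None else validate(raw)
        except (ValueError, TypeError):
            parsed[name] = None  # isolate the error to this one field
    return parsed


llm_output = {"age": "forty", "status": "Active"}
print(tolerant_parse(llm_output, {"age": int, "status": parse_status}))
```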

Key Guarantee

When evidence_required=True is set on a factory (opt-in, per field) and the source text is available during validation, every non-null value for that field carries non-null evidence found in the source text. Values that the LLM extracted, even correctly, but that cannot be grounded in the text are discarded (value → None). This keeps ungrounded, potentially hallucinated values out of such fields, at the cost of potentially lower recall. The default is evidence_required=False, in which case values are kept even when no supporting evidence can be found.

Running the Examples

# Make sure Ollama is running and the model is available
ollama pull qwen2.5:7b-instruct

# Run the crime extraction example
python examples/crime_extraction/run.py

Running the Tests

Integration tests require a live Ollama instance and carry the integration marker, so they can be selected or skipped with pytest's -m option:

# Run only unit tests (no Ollama needed)
pytest tests/ -m "not integration"

# Run only the integration tests (requires Ollama)
pytest tests/ -m integration -v

# Run all tests, including integration
pytest tests/ -v

Project Structure

llm-ner/
├── src/
│   └── llmner/
│       ├── __init__.py        # Public API
│       ├── base_model.py      # NERBaseModel
│       ├── factories.py       # SchemaRegistry + EvidenceField
│       ├── extractor.py       # NERExtractor
│       ├── llm_client.py      # BaseLLMClient + OllamaClient
│       └── prompt.py          # PromptBuilder
├── tests/
│   ├── conftest.py            # Pytest configuration
│   ├── schema/
│   │   └── crime_schema.py    # Crime-specific schema (integration test)
│   ├── data/
│   │   ├── complaints.csv
│   │   ├── crimes_perceived_detailed.csv
│   │   └── perceived_suspects.csv
│   └── test_ner_accuracy.py   # End-to-end accuracy test
├── examples/
│   └── crime_extraction/
│       ├── schema.py          # English crime schema example
│       └── run.py             # Runnable example script
├── pyproject.toml
├── LICENSE
└── README.md

Contributing

Fork the repository and create a feature branch.
Install development dependencies: uv pip install -e ".[dev]".
Run linting: ruff check src/.
Run type checking: mypy src/llmner.
Open a pull request with a clear description of your changes.

License

MIT – see LICENSE.

Release history

Version	Released
0.5.1 (this version)	Mar 31, 2026
0.4.1	Mar 23, 2026
0.4.0	Mar 22, 2026
0.3.0	Mar 22, 2026
0.2.0	Mar 21, 2026
0.1.0	Mar 21, 2026
Download files

File	Type	Size	Uploaded	Uploaded via	Trusted Publishing
llm_ner-0.5.1.tar.gz	Source distribution	47.8 kB	Mar 31, 2026	twine/6.2.0 CPython/3.10.11	No
llm_ner-0.5.1-py3-none-any.whl	Built distribution (Python 3)	25.9 kB	Mar 31, 2026	twine/6.2.0 CPython/3.10.11	No

File hashes

File	Algorithm	Hash digest
llm_ner-0.5.1.tar.gz	SHA256	`b468096d42213f8faacedad0d4dc5cbdeceec00e30c716bb0ec85b7de9cf93ca`
llm_ner-0.5.1.tar.gz	MD5	`2804d1a0c245486afdfd7522de8956b1`
llm_ner-0.5.1.tar.gz	BLAKE2b-256	`dd8003cb4148536d77cdd305f43ec643973b4cde5b0bedab863b334fe3c60145`
llm_ner-0.5.1-py3-none-any.whl	SHA256	`c1f68f4fdc6ddce3b3caf1fb7bff1a77fcb556894275bae46a063db9fb8d5620`
llm_ner-0.5.1-py3-none-any.whl	MD5	`0e47d5f2bde4ffef1770e9438bb26a6f`
llm_ner-0.5.1-py3-none-any.whl	BLAKE2b-256	`f808d8d173cfe03a5f40bdc26101931186feed219c333e59509188871f370b4f`
