Rule-based attribute enrichment for any sequence of items

Attribute Enricher

Overview

vcti-attribute-enricher adds attributes to items in a sequence by matching vcti-lookup rules. Each EnrichRule pairs a condition (when) with attributes to write (set); the enricher walks an iterable, evaluates each rule against each item, and stamps matching items with the rule's attributes.

The package is iterable-generic, not tree-specific: the items can be tree-node payloads (e.g., from vcti-fileloader), dicts, dataclasses, or any object with an attribute-bearing surface. Reading and writing are pluggable via getter / setter callables, with sensible defaults for vcti-fileloader's DataNode-shaped payloads.

The package is the write-side counterpart to vcti-lookup: where Lookup filters a sequence by rules, the enricher mutates the sequence by rules.

Installation

pip install vcti-attribute-enricher

In requirements.txt

vcti-attribute-enricher>=1.0.0

In pyproject.toml dependencies

dependencies = [
    "vcti-attribute-enricher>=1.0.0",
]

Quick Start

from vcti.attribute_enricher import EnrichRule, apply_rules
from vcti.lookup import MISSING, Rule

items = [
    {"name": "stress.h5",     "dtype": "float64", "attributes": {}, "enricher_attributes": {}},
    {"name": "ids.h5",        "dtype": "int64",   "attributes": {}, "enricher_attributes": {}},
    {"name": "config.json",   "dtype": "object",  "attributes": {}, "enricher_attributes": {}},
]

def getter(item, key):
    # Read from the item's "attributes" dict, then the top-level item.
    # Return MISSING (not None) when absent so the rule cleanly no-matches.
    if key in item["attributes"]:
        return item["attributes"][key]
    return item.get(key, MISSING)

apply_rules(
    items,
    rules=[
        EnrichRule(set={"loaded_at": "2026-06-06"}),                            # every item
        EnrichRule(set={"category": "mechanical"},
                   when=(Rule("name", "^=", "stress"),)),                        # name starts with "stress"
        EnrichRule(set={"is_numeric": True},
                   when=(Rule("dtype", "^=", "float"),
                         Rule("dtype", "!=", "object"))),                        # AND across rules
    ],
    getter=getter,
    setter=lambda item, key, value: item["enricher_attributes"].__setitem__(key, value),
)

For payloads with a single mutable attribute dict (the common case):

from vcti.attribute_enricher import EnrichRule, apply_rules
from vcti.lookup import MISSING

items = [{"name": "x", "tags": {}}, {"name": "y", "tags": {}}]

apply_rules(
    items,
    rules=[EnrichRule(set={"seen": True})],
    getter=lambda item, key: item.get(key, MISSING),
    setter=lambda item, key, value: item["tags"].__setitem__(key, value),
)

Getters must return vcti.lookup.MISSING for absent attributes, not None. None is a legal value that gets passed to the operator; MISSING short-circuits the rule to "no match".

Enriching tree-node payloads

The package's default getter/setter are tuned for vcti-fileloader's DataNode/LazyDataNode payloads, which carry both a read-only file_attributes mapping (file-native) and a mutable enricher_attributes dict. The merged read view is exposed via .attributes (a ChainMap).

from vcti.attribute_enricher import EnrichRule, apply_rules
from vcti.tree import descendants
from vcti.lookup import Rule

# `tree`, `subtree_root`, and `path` are assumed to come from your own loading code.
apply_rules(
    descendants(tree, subtree_root, include_self=True),
    rules=[
        EnrichRule(set={"file_path": str(path)}),
        EnrichRule(set={"category": "mechanical"},
                   when=(Rule("name", "^=", "stress"),)),
    ],
)

No getter / setter arguments needed — the defaults read item.attributes (ChainMap) and write to item.enricher_attributes.
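The payload shape described above can be imitated in plain Python to see how the merged view behaves. This is a sketch only: `FakeNode` is a hypothetical stand-in, not vcti-fileloader's DataNode, and the ChainMap lookup order (enricher attributes first) is an assumption rather than documented behavior.

```python
from collections import ChainMap
from types import MappingProxyType


class FakeNode:
    """Hypothetical stand-in for a DataNode-shaped payload."""

    def __init__(self, file_attributes):
        # Read-only view of the file-native attributes.
        self.file_attributes = MappingProxyType(dict(file_attributes))
        # Mutable dict that enrichment writes into.
        self.enricher_attributes = {}

    @property
    def attributes(self):
        # Merged read view. The lookup order is an assumption of this sketch.
        return ChainMap(self.enricher_attributes, self.file_attributes)


node = FakeNode({"name": "stress.h5", "dtype": "float64"})

# What a default-style setter would do: write to enricher_attributes only.
node.enricher_attributes["category"] = "mechanical"

merged = dict(node.attributes)
```

The file-native mapping is never touched by the write, yet the new key is visible through `attributes` alongside the original ones.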


How it works

  1. Iterate the supplied items.
  2. For each item, evaluate every EnrichRule:
    • If when is empty, the rule matches.
    • If when has one or more Rules, every rule must match (AND).
    • A rule matches by reading the relevant attribute through getter and evaluating it via vcti.predicate.evaluate (forwarded through vcti-lookup).
  3. On a match, write each key/value in set to the item via setter.

Layering — last write wins. Multiple EnrichRules that match the same item all apply, in the order given. Later rules overwrite earlier ones on collision. This makes it easy to express "set a default for everything, then refine for specific subsets."
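The steps and the layering rule above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the package's implementation: `SketchRule` and `apply_rules_sketch` are hypothetical names, conditions are plain callables instead of vcti-lookup Rule objects, and writes go to a `tags` dict on each item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRule:
    # Hypothetical stand-in for EnrichRule.
    set: dict
    when: tuple = ()


def apply_rules_sketch(items, rules):
    for item in items:                                  # step 1: iterate items
        for rule in rules:                              # step 2: every rule, in order
            # all() over an empty tuple is True, so an empty `when` matches.
            if all(cond(item) for cond in rule.when):
                for key, value in rule.set.items():     # step 3: write on match
                    item["tags"][key] = value           # later writes overwrite earlier ones


def starts_with_stress(item):
    return item["name"].startswith("stress")


items = [
    {"name": "stress.h5", "tags": {}},
    {"name": "config.json", "tags": {}},
]

apply_rules_sketch(
    items,
    rules=[
        SketchRule(set={"category": "other"}),                                   # default for everything
        SketchRule(set={"category": "mechanical"}, when=(starts_with_stress,)),  # refinement
    ],
)
```

Both rules match the first item, so the refinement overwrites the default; only the default applies to the second item.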

Return value. apply_rules returns an EnrichResult with summary counts — handy for logging and for spotting dead rules:

result = apply_rules(items, rules)
print(result.items_visited, result.items_matched, result.writes_applied)
print(result.per_rule_matches)   # match count per rule, in order; 0 = dead rule

The items are mutated in place; the result carries metrics only.
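One way to act on these counts is to flag rules that never fired. The sketch below uses a hypothetical `ResultSketch` dataclass that only mirrors the documented field names of EnrichResult; the numbers are illustrative, not output from a real run, and the container type of `per_rule_matches` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultSketch:
    # Stand-in mirroring the documented EnrichResult fields.
    items_visited: int
    items_matched: int
    writes_applied: int
    per_rule_matches: tuple


def dead_rule_indexes(result):
    # A rule whose match count is 0 never matched any item on this run.
    return [i for i, count in enumerate(result.per_rule_matches) if count == 0]


# Illustrative numbers only.
result = ResultSketch(
    items_visited=3,
    items_matched=3,
    writes_applied=4,
    per_rule_matches=(3, 1, 0),
)

dead = dead_rule_indexes(result)
```

Because `per_rule_matches` follows the order of the rules passed in, each index in `dead` points back at a position in the rules list.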


API

| Symbol | Description |
| --- | --- |
| EnrichRule(set, when=()) | Frozen dataclass. set is a dict of attributes to write; when is a tuple[Rule, ...] combined with AND logic. Empty when matches every item. Not hashable (the set dict). |
| apply_rules(items, rules, *, getter=None, setter=None) | Walk items, evaluate rules, stamp matches. Returns an EnrichResult. Defaults to vcti.lookup.attributes_getter and a setter that writes to item.enricher_attributes. Mutates in place (not thread-safe); raises ValueError with rule context if evaluation fails (fail-fast, non-transactional). |
| EnrichResult(items_visited, items_matched, writes_applied, per_rule_matches) | Frozen dataclass of summary counts returned by apply_rules. per_rule_matches is per-rule, in order (a 0 flags a dead rule). |
| Getter / Setter | Type aliases for the (item, key) -> value and (item, key, value) -> None callables. |

Values in set are written by reference — a mutable value shared across matched items is the same object on each. See docs/patterns.md for pitfalls.
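The by-reference behavior is ordinary Python aliasing, and it can be reproduced without the package. The sketch below shows the pitfall and one possible workaround, a setter that copies before writing; `copying_setter` is a hypothetical helper and not necessarily the approach docs/patterns.md recommends.

```python
import copy

template = []                                   # mutable value a rule might write
items = [{"attrs": {}}, {"attrs": {}}]

# Writing by reference: both items end up holding the same list object.
for item in items:
    item["attrs"]["tags"] = template
items[0]["attrs"]["tags"].append("hot")
leaked = items[1]["attrs"]["tags"]              # the append shows up here too


def copying_setter(item, key, value):
    # Hypothetical setter: give each item its own copy of the value.
    item["attrs"][key] = copy.deepcopy(value)


fresh_template = []
fresh = [{"attrs": {}}, {"attrs": {}}]
for item in fresh:
    copying_setter(item, "tags", fresh_template)
fresh[0]["attrs"]["tags"].append("hot")
isolated = fresh[1]["attrs"]["tags"]            # unaffected by the append
```

Copying costs time and memory per write, so it is worth doing only for values that are actually mutable.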


Dependencies

  • vcti-lookup (>=1.0.0) — Rule, MISSING, default attributes_getter
  • vcti-predicate (>=1.0.0) — evaluate() is called directly for rule matching
