Inference-driven schema mapping engine

infermap

Inference-driven schema mapping engine — automatically maps source fields to target fields using a composable scorer pipeline.

Install

pip install infermap

Install extras for additional database support:

pip install infermap[postgres]   # psycopg2-binary
pip install infermap[mysql]      # mysql-connector-python
pip install infermap[duckdb]     # duckdb
pip install infermap[all]        # all extras

Quick Start

import infermap

# Map a CRM export CSV to a canonical customer schema
result = infermap.map("crm_export.csv", "canonical_customers.csv")

for m in result.mappings:
    print(f"{m.source} -> {m.target}  ({m.confidence:.0%})")
# fname -> first_name  (97%)
# lname -> last_name   (95%)
# email_addr -> email  (91%)

# Apply mappings to rename DataFrame columns
import polars as pl
df = pl.read_csv("crm_export.csv")
renamed = result.apply(df)

# Save mappings to a reusable config file
result.to_config("my_mapping.yaml")

# Reload later — no re-inference needed
saved = infermap.from_config("my_mapping.yaml")

CLI Examples

# Map two files and print a report
infermap map crm_export.csv canonical_customers.csv

# Map and save the config
infermap map crm_export.csv canonical_customers.csv --save mapping.yaml

# Apply a saved mapping config to a file (prints the renamed column list)
infermap apply crm_export.csv mapping.yaml

# Inspect the schema of a file or database table
infermap inspect crm_export.csv
infermap inspect sqlite:///mydb.db --table customers

# Validate a mapping config file
infermap validate mapping.yaml

How It Works

infermap runs each field pair through a pipeline of 5 scorers. Each scorer returns a score between 0.0 and 1.0 (or abstains with None). The engine combines scores via weighted average (requiring at least 2 contributing scorers), then uses the Hungarian algorithm for optimal one-to-one assignment.

Scorer	Weight	What it detects
ExactScorer	1.0	Case-insensitive exact name match
AliasScorer	0.9	Known field aliases (e.g. `fname` == `first_name`, `tel` == `phone`)
PatternTypeScorer	0.7	Semantic type from sample values — email, date_iso, phone, uuid, url, zip, currency
ProfileScorer	0.6	Statistical profile similarity — null rate, unique rate, value count
FuzzyNameScorer	0.5	Token-level fuzzy string similarity on field names
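The combination and assignment steps can be illustrated with a small standalone sketch. It is not infermap's implementation: the `combine` and `best_assignment` helpers are hypothetical names, returning 0.0 when fewer than two scorers contribute is an assumption, and an exhaustive search over permutations stands in for the Hungarian algorithm (it finds the same optimum, but is only practical for a handful of fields).

```python
from itertools import permutations

# Weights copied from the scorer table above.
WEIGHTS = {
    "ExactScorer": 1.0,
    "AliasScorer": 0.9,
    "PatternTypeScorer": 0.7,
    "ProfileScorer": 0.6,
    "FuzzyNameScorer": 0.5,
}


def combine(scores, min_contributors=2):
    """Weighted average over the scorers that did not abstain (None)."""
    active = {name: s for name, s in scores.items() if s is not None}
    if len(active) < min_contributors:
        # Assumption: too few contributing scorers counts as no evidence.
        return 0.0
    total_weight = sum(WEIGHTS[name] for name in active)
    return sum(WEIGHTS[name] * s for name, s in active.items()) / total_weight


def best_assignment(matrix):
    """One-to-one assignment that maximizes the total score.

    matrix[i][j] is the combined score of source i against target j.
    Assumes a square matrix; brute force stands in for the Hungarian algorithm.
    """
    n = len(matrix)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(matrix[i][perm[i]] for i in range(n)),
    )
    return list(enumerate(best))


# Two scorers contribute, three abstain.
pair_score = combine({
    "ExactScorer": None,
    "AliasScorer": 1.0,
    "PatternTypeScorer": None,
    "ProfileScorer": None,
    "FuzzyNameScorer": 0.4,
})

# Picking the single best cell first (0.9) would force the weak 0.1 pairing;
# the optimal assignment takes 0.8 + 0.85 instead.
pairs = best_assignment([[0.9, 0.8], [0.85, 0.1]])
```

The second example shows why a global assignment is preferable to picking the highest-scoring pair first: the greedy choice totals 1.0, while the optimal one totals 1.65.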

Features

Maps CSV, Parquet, XLSX, Polars DataFrames, Pandas DataFrames, SQLite, and schema YAML files
Composable scorer pipeline — disable, reweight, or add custom scorers via config or code
Optimal one-to-one assignment via the Hungarian algorithm
`required` parameter warns when critical target fields go unmapped
`MapResult.apply()` renames DataFrame columns in one call
`to_config()` / `from_config()` roundtrip for repeatable pipelines
CLI for quick inspection, mapping, and validation

Custom Scorers

import infermap
from infermap.types import FieldInfo, ScorerResult

@infermap.scorer("my_prefix_scorer", weight=0.8)
def my_prefix_scorer(source: FieldInfo, target: FieldInfo) -> ScorerResult | None:
    src = source.name.lower()
    tgt = target.name.lower()
    # Abstain unless both names share the same three-character prefix
    if src[:3] != tgt[:3]:
        return None
    return ScorerResult(score=0.85, reasoning=f"Shared prefix '{src[:3]}'")

from infermap.engine import MapEngine
from infermap.scorers import default_scorers

engine = MapEngine(scorers=[*default_scorers(), my_prefix_scorer])
result = engine.map("source.csv", "target.csv")

You can also use a plain class with name, weight, and score():

class DomainScorer:
    name = "DomainScorer"
    weight = 0.75

    def score(self, source: FieldInfo, target: FieldInfo) -> ScorerResult | None:
        ...
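As a rough illustration of the class form, here is a sketch that can be run without infermap installed. The `FieldInfo` and `ScorerResult` dataclasses are stand-ins that carry only the attributes used in the examples above (`name`, `score`, `reasoning`); the real classes in `infermap.types` may differ. `SuffixScorer` and its suffix list are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-ins for infermap.types so the sketch is self-contained.
@dataclass
class FieldInfo:
    name: str


@dataclass
class ScorerResult:
    score: float
    reasoning: str


class SuffixScorer:
    """Hypothetical scorer: rewards field names that share a known suffix."""

    name = "SuffixScorer"
    weight = 0.75
    SUFFIXES = ("_id", "_at", "_date")  # illustrative list, not from infermap

    def score(self, source: FieldInfo, target: FieldInfo) -> Optional[ScorerResult]:
        src = source.name.lower()
        tgt = target.name.lower()
        for suffix in self.SUFFIXES:
            if src.endswith(suffix) and tgt.endswith(suffix):
                return ScorerResult(score=0.8, reasoning=f"Shared suffix '{suffix}'")
        # Abstain when no suffix is shared, so this scorer adds no weight.
        return None


scorer = SuffixScorer()
hit = scorer.score(FieldInfo("cust_id"), FieldInfo("customer_id"))
miss = scorer.score(FieldInfo("email"), FieldInfo("customer_id"))
```

Returning `None` rather than a score of 0.0 matters here: an abstaining scorer is left out of the weighted average, while a 0.0 would pull the combined score down.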

Config Reference

Load an infermap.yaml at engine creation to override scorer weights, disable scorers, or add domain aliases:

engine = MapEngine(config_path="infermap.yaml")

See infermap.yaml.example for a full annotated example.

License

MIT

Release history

Version	Released
0.4.0	May 6, 2026
0.3.2	Apr 10, 2026
0.3.1	Apr 10, 2026
0.3.0	Apr 10, 2026
0.2.0	Apr 9, 2026
0.1.0 (this version)	Mar 30, 2026

Download files

Source Distribution

infermap-0.1.0.tar.gz (48.7 kB)

Uploaded Mar 30, 2026 Source

Built Distribution

infermap-0.1.0-py3-none-any.whl (28.2 kB)

Uploaded Mar 30, 2026 Python 3

File details

Details for the file infermap-0.1.0.tar.gz.

File metadata

Download URL: infermap-0.1.0.tar.gz
Upload date: Mar 30, 2026
Size: 48.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for infermap-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`cc29ca1486ecbdb765b47b5622ce12f8901bdc176fdf59ffdb9c14f762f1131f`
MD5	`fdd1d539730a7be7a633d0acf4ad5165`
BLAKE2b-256	`fdd6b83ce6c02db03f1349b790ef850cc0c391df32ca6afa3f5c20e25b323f41`

File details

Details for the file infermap-0.1.0-py3-none-any.whl.

File metadata

Download URL: infermap-0.1.0-py3-none-any.whl
Upload date: Mar 30, 2026
Size: 28.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for infermap-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`07ad0478388f06fbfce4ed25f04825b1b4c946aace408bc9b3d04124e1a8018e`
MD5	`cfc16a03392575df9e81e575020b3f19`
BLAKE2b-256	`009e205c7d0e491f5abe22bbb849c529d2ee00442675162f4be8f49d4029d04e`
