Local-first, rule-based semantic data reconciliation CLI tool

Reconlify CLI

Semantic data reconciliation for the command line.

Validate structured datasets using declarative YAML rules and produce deterministic JSON reconciliation reports suitable for CI/CD pipelines.

Fully local. No data leaves your machine.

Typical use cases:

ETL validation
Data migration verification
Financial transaction reconciliation
CI pipeline dataset checks
Log comparison with normalization rules

Quick Example

source.csv

txn_id,amount
1,100
2,200

target.csv

txn_id,amount
1,100
2,210

config.yaml

type: tabular
source: source.csv
target: target.csv
keys:
  - txn_id

reconlify run config.yaml

Exit code: 1 (differences found). Report highlights:

rows_with_mismatches: 1
missing_in_source:    0
missing_in_target:    0
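
To make the three counts concrete, here is a minimal Python sketch of key-based matching on the two sample files above. It illustrates the idea only and is not Reconlify's implementation; the helper names `load_by_key` and `reconcile_by_key` are hypothetical.

```python
import csv
import io

SOURCE_CSV = "txn_id,amount\n1,100\n2,200\n"
TARGET_CSV = "txn_id,amount\n1,100\n2,210\n"


def load_by_key(text, key):
    """Index CSV rows by the value of the key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def reconcile_by_key(source_text, target_text, key):
    """Match rows on `key`, then count mismatched and missing rows."""
    source = load_by_key(source_text, key)
    target = load_by_key(target_text, key)
    shared = source.keys() & target.keys()
    return {
        # Rows present on both sides whose remaining columns differ.
        "rows_with_mismatches": sum(1 for k in shared if source[k] != target[k]),
        # Keys that exist only in the target file.
        "missing_in_source": len(target.keys() - source.keys()),
        # Keys that exist only in the source file.
        "missing_in_target": len(source.keys() - target.keys()),
    }


counts = reconcile_by_key(SOURCE_CSV, TARGET_CSV, "txn_id")
print(counts)
```

Row 2 is the single mismatch (200 vs 210), and both keys exist on both sides, so neither missing count is incremented.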

How Reconlify Compares

Reconlify is not just another diff tool.

It performs semantic reconciliation for structured data files.

Capability	diff	csvdiff	Excel Compare	Beyond Compare	Datafold	Reconlify
Understands tabular datasets	No	Yes	Yes	Yes	Yes	Yes
Key-based row matching	No	Yes	Manual	Yes	Yes	Yes
Detects missing rows	No	Yes	Manual	Partial	Yes	Yes
Column-level mismatch detection	No	Yes	Manual	Partial	Yes	Yes
Rule-based normalization	No	No	No	No	No	Yes
Regex transformations	No	No	No	No	No	Yes
Numeric tolerance	No	No	No	Yes	Yes	Yes
Noise filtering	No	No	Manual	Manual	Partial	Yes
Deterministic JSON reconciliation report	No	No	No	No	Partial	Yes
Works with exported files	Yes	Yes	Yes	Yes	No	Yes
Database integration	No	No	No	No	Yes	Planned
CI/CD automation ready	Yes	Partial	No	No	Yes	Yes
Schema-aware column mapping	No	No	Manual	Partial	Partial	Yes
Local-first execution	Yes	Yes	Yes	Yes	No	Yes

Reconlify can compare semantically equivalent datasets even when source and target use different column names — a common requirement in migration validation and cross-system reconciliation.

Reconlify focuses on semantic reconciliation for structured files such as CSV exports, logs, and tabular datasets.

While tools like Datafold specialize in comparing database tables inside data warehouses, Reconlify focuses on validating exported datasets produced by pipelines, migrations, or financial systems.

This makes it particularly useful for:

QA validation of data migrations
regression testing for ETL pipelines
reconciliation of exported reports
financial and operational data audits

Features

Key-based dataset reconciliation (single or composite keys)
Schema-aware column mapping for files with different column names
Automatic missing-row detection (both directions)
Column-level mismatch detection with include/exclude control
Numeric tolerance support (per-column absolute tolerance)
Normalization rules (trim, case-insensitive, null handling, regex, virtual columns)
Row filters and exclusions
Deterministic JSON reconciliation reports
Machine-readable exit codes (0 / 1 / 2)
Two engines: tabular (CSV/TSV) and text (line-by-line / unordered)
CI/CD pipeline friendly
Fully local execution — no network calls

Performance

Reconlify uses DuckDB-backed tabular processing, streaming text comparison, and a local-first architecture. It processes large datasets locally without requiring a database or external service.

Dataset	Rows / Lines	Mode	Time
CSV reconciliation (exact match)	200k rows	tabular	~2 s
CSV reconciliation (high mismatch)	200k rows	tabular	~12 s
Log comparison (positional diffs)	500k lines	line_by_line	~3 s
Log comparison (unordered)	250k lines	unordered_lines	< 1 s

Benchmarks were executed on a MacBook (Apple Silicon / Python 3.11) with default fixture settings.

Performance depends on dataset structure, rule complexity, and system hardware.

Full benchmark methodology and results: PERF_TESTING.md

Installation

Requires Python 3.11+.

pip install reconlify-cli

Or with pipx for isolated installs:

pipx install reconlify-cli

Package name on PyPI: reconlify-cli

For development:

git clone https://github.com/testuteab/reconlify-cli.git && cd reconlify-cli
make install          # runs: poetry install
reconlify --help

CLI Usage

reconlify run <config.yaml>                # default output: report.json
reconlify run <config.yaml> --out out.json # custom output path

Options

Option	Default	Description
`--out PATH`	`report.json`	Output path for the JSON report
`--include-line-numbers` / `--no-include-line-numbers`	on	Include original line numbers in text report samples
`--max-line-numbers N`	`0` (unlimited)	Max line numbers per distinct line in unordered mode
`--debug-report`	off	Include processed line numbers in text report samples

Exit Codes

Code	Meaning
0	No differences found
1	Differences found
2	Error (config validation, file not found, runtime failure)
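
When driving the CLI from a script, the three exit codes can be mapped to the meanings in the table above. The sketch below assumes `reconlify` is on `PATH` and uses only the `run` subcommand and `--out` option documented here; the function names are hypothetical.

```python
import subprocess

# Meanings copied from the exit code table.
EXIT_MEANINGS = {
    0: "no differences found",
    1: "differences found",
    2: "error (config validation, file not found, runtime failure)",
}


def describe_exit_code(code):
    """Translate a Reconlify exit code into a human-readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_reconlify(config_path, out_path="report.json"):
    """Invoke `reconlify run` and return its exit code without raising on 1 or 2."""
    result = subprocess.run(["reconlify", "run", config_path, "--out", out_path])
    return result.returncode
```

A caller would typically do `code = run_reconlify("config.yaml")` and then branch on `code`, treating 2 as a hard failure.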

Version

reconlify --version
# reconlify 0.1.1

Reconlify follows Semantic Versioning: MAJOR.MINOR.PATCH.

Real-World Example: Accounting Transaction Reconciliation

A finance team exports transactions from two systems (ERP and bank ledger) and needs to reconcile them nightly. Here's how Reconlify handles it.

1. Create sample data

cat <<'EOF' > source.csv
txn_id,booking_date,account,amount,currency,counterparty,status,memo
TXN-001,2026-01-15,4100,1500.00,USD,Acme Corp,BOOKED,Invoice 9201
TXN-002,2026-01-16,4100,320.50,EUR,Globex Inc,BOOKED,Wire transfer
TXN-003,2026-01-17,4200,75.00,USD,Jane Doe,BOOKED,Expense report
TXN-004,2026-01-18,4100,10000.00,USD,Initech,CANCELLED,Reversed
TXN-005,2026-01-19,4300,249.99,USD,Umbrella Ltd,BOOKED,Subscription
TXN-006,2026-01-20,4100,,USD,Soylent Corp,BOOKED,Pending allocation
EOF

cat <<'EOF' > target.csv
txn_id,booking_date,account,amount,currency,counterparty,status,memo
TXN-001,2026-01-15,4100,1500.00,USD,  acme corp  ,BOOKED,Invoice 9201
TXN-002,2026-01-16,4100,320.75,EUR,Globex Inc,BOOKED,Wire transfer
TXN-003,2026-01-17,4200,75.00,USD,Jane Doe,BOOKED,Expense report
TXN-005,2026-01-19,4300,249.99,USD,Umbrella Ltd,BOOKED,Subscription
TXN-006,2026-01-20,4100,NULL,USD,Soylent Corp,BOOKED,Pending allocation
TXN-007,2026-01-21,4100,430.00,USD,Wayne Ent,BOOKED,New deposit
EOF

What's different between the two files:

Scenario	Row	Detail
Matches after normalization	TXN-001	`counterparty` has extra spaces + lowercase in target — should match with trim + case rules
Value mismatch	TXN-002	`amount` is `320.50` vs `320.75` (a 0.25 difference; the config below sets no tolerance)
Exact match	TXN-003	Identical
Filtered out	TXN-004	Status `CANCELLED` — excluded by row filter
Exact match	TXN-005	Identical
NULL normalization	TXN-006	Source has blank `amount`, target has `NULL` string — should match
Missing in source	TXN-007	Exists only in target

2. Create the reconciliation config

cat <<'EOF' > recon.yaml
type: tabular
source: source.csv
target: target.csv

keys:
  - txn_id

csv:
  delimiter: ","
  header: true
  encoding: utf-8

compare:
  trim_whitespace: true
  case_insensitive: true
  normalize_nulls: ["", "NULL", "null"]
  exclude_columns:
    - memo

filters:
  row_filters:
    both:
      - column: status
        op: equals
        value: CANCELLED
EOF

Key config choices:

keys: [txn_id] — match rows by transaction ID
trim_whitespace + case_insensitive — ignore formatting noise
normalize_nulls — treat blank, NULL, and null as equivalent
exclude_columns: [memo] — skip free-text fields
row_filters — drop CANCELLED transactions before comparison
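
The effect of the normalization options on individual cells can be sketched in a few lines of Python. This is an illustration under assumptions, not Reconlify's code: the order of operations (trim first, then null check, then case folding) is a guess, and `normalize` and `cells_match` are hypothetical names.

```python
# Mirrors the normalize_nulls list in the config above.
NULL_TOKENS = {"", "NULL", "null"}


def normalize(value, trim=True, case_insensitive=True, null_tokens=NULL_TOKENS):
    """Return a comparable form of a cell, or None for null-like values."""
    if trim:
        value = value.strip()
    if value in null_tokens:
        return None
    if case_insensitive:
        value = value.lower()
    return value


def cells_match(source_value, target_value):
    """Compare two raw cell values after normalization."""
    return normalize(source_value) == normalize(target_value)


print(cells_match("Acme Corp", "  acme corp  "))  # True  (TXN-001 counterparty)
print(cells_match("", "NULL"))                     # True  (TXN-006 amount)
print(cells_match("320.50", "320.75"))             # False (TXN-002 amount)
```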

3. Run the reconciliation

reconlify run recon.yaml --out report.json
echo $?

Expected exit code: 1 (differences found).

The report will contain:

missing_in_target: 0 — TXN-004 was filtered out, so not counted
missing_in_source: 1 — TXN-007 exists only in target
rows_with_mismatches: 1 — TXN-002 has an amount mismatch (320.50 vs 320.75)

TXN-001 and TXN-006 match cleanly thanks to normalization rules.

The report is written to report.json. It is deterministic — identical inputs and config always produce identical output (except the generated_at timestamp).
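
One way to check determinism in practice is to compare two reports while ignoring `generated_at`. The sketch below strips that key wherever it appears, since this README does not say where in the JSON it sits; the function names and the sample dictionaries are hypothetical, and the real layout is defined by the Report Schema.

```python
import json


def strip_key(node, key="generated_at"):
    """Recursively drop `key` from nested dicts and lists."""
    if isinstance(node, dict):
        return {k: strip_key(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [strip_key(item, key) for item in node]
    return node


def reports_equivalent(report_a, report_b):
    """True when two parsed reports differ at most in their timestamps."""
    return strip_key(report_a) == strip_key(report_b)


def load_report(path):
    """Parse a JSON report from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical report fragments, for illustration only.
first = {"generated_at": "2026-03-09T10:00:00Z", "summary": {"rows_with_mismatches": 1}}
second = {"generated_at": "2026-03-09T10:05:00Z", "summary": {"rows_with_mismatches": 1}}
print(reports_equivalent(first, second))  # True
```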

Text Engine

Reconlify also compares text files line-by-line or as unordered line sets:

type: text
source: expected.log
target: actual.log
mode: unordered_lines
normalize:
  trim_lines: true
  ignore_blank_lines: true

reconlify run text_recon.yaml

In unordered_lines mode, line order is ignored — Reconlify compares occurrence counts of each distinct line. In line_by_line mode (default), lines are compared positionally.
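
The difference between the two modes can be sketched in Python: a multiset comparison for `unordered_lines` and a positional comparison for `line_by_line`. This is a simplified illustration, not Reconlify's streaming implementation; the function names and the sample log lines are made up.

```python
from collections import Counter
from itertools import zip_longest


def unordered_diff(source_lines, target_lines):
    """Compare lines as multisets; map each differing line to (source_count, target_count)."""
    source_counts = Counter(source_lines)
    target_counts = Counter(target_lines)
    return {
        line: (source_counts[line], target_counts[line])
        for line in source_counts.keys() | target_counts.keys()
        if source_counts[line] != target_counts[line]
    }


def line_by_line_diff(source_lines, target_lines):
    """Compare lines positionally; return (line_number, source, target) for each difference."""
    pairs = zip_longest(source_lines, target_lines)
    return [(n, s, t) for n, (s, t) in enumerate(pairs, start=1) if s != t]


expected = ["start", "job A done", "job B done", "stop"]
actual = ["start", "job B done", "job A done", "stop"]

print(unordered_diff(expected, actual))          # {}
print(len(line_by_line_diff(expected, actual)))  # 2
```

The same two files are equal as multisets but differ at lines 2 and 3 when compared positionally.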

See the examples/ directory for config samples.

Engines

Tabular Engine

Key-based reconciliation — Single or composite keys. Detects missing rows on either side and cell-level value mismatches.
Column mapping — Map source column names to different target column names (column_mapping: {amount: total_amount}).
Column control — Include or exclude specific columns from comparison.
Numeric tolerance — Absolute tolerance per column (e.g. amount: 0.01).
String rules — Per-column normalization: trim, case_insensitive, contains, regex_extract.
Source-side normalization — Virtual columns via ops: map, concat, substr, add, sub, mul, div, coalesce, date_format, upper, lower, trim, round.
Row filters — Exclude specific key values or filter rows by column rules.
TSV support — Set csv.delimiter: "\t".
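
As a rough illustration of how column mapping and absolute tolerance combine, consider the Python sketch below. It assumes the tolerance bound is inclusive (a difference equal to the tolerance counts as a match), which this README does not state; `within_tolerance` and `compare_row` are hypothetical names, and the target row with `total_amount` is invented to match the `column_mapping` example above.

```python
from decimal import Decimal


def within_tolerance(source_value, target_value, tolerance):
    """True when |source - target| <= tolerance, using exact decimal arithmetic."""
    return abs(Decimal(source_value) - Decimal(target_value)) <= Decimal(tolerance)


def compare_row(source_row, target_row, column_mapping, tolerances):
    """Return the source column names whose values differ from the mapped target column."""
    mismatched = []
    for source_col, source_value in source_row.items():
        # Fall back to the same name when no mapping is given for a column.
        target_col = column_mapping.get(source_col, source_col)
        target_value = target_row[target_col]
        if source_col in tolerances:
            equal = within_tolerance(source_value, target_value, tolerances[source_col])
        else:
            equal = source_value == target_value
        if not equal:
            mismatched.append(source_col)
    return mismatched


source_row = {"txn_id": "TXN-002", "amount": "320.50"}
target_row = {"txn_id": "TXN-002", "total_amount": "320.75"}
mapping = {"amount": "total_amount"}

print(compare_row(source_row, target_row, mapping, {"amount": "0.01"}))  # ['amount']
print(compare_row(source_row, target_row, mapping, {"amount": "0.25"}))  # []
```

`Decimal` is used here so that 320.75 minus 320.50 is exactly 0.25 rather than a binary floating-point approximation.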

Text Engine

Two comparison modes: line_by_line (positional) and unordered_lines (multiset).
Normalization: trim_lines, collapse_whitespace, case_insensitive, ignore_blank_lines, normalize_newlines.
Regex rules: drop_lines_regex to remove lines, replace_regex to transform lines before comparison.
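
The two regex rule types can be pictured as a preprocessing pass that runs before comparison. The Python sketch below is an assumption-laden illustration: it drops lines first and then applies replacements in order, and it matches drop patterns anywhere in the line, neither of which is confirmed by this README. The `preprocess` helper and the sample log lines are hypothetical.

```python
import re


def preprocess(lines, drop_patterns=(), replacements=()):
    """Drop lines matching any drop pattern, then apply regex replacements in order."""
    drops = [re.compile(p) for p in drop_patterns]
    subs = [(re.compile(p), repl) for p, repl in replacements]
    result = []
    for line in lines:
        if any(d.search(line) for d in drops):
            continue  # analogous to drop_lines_regex
        for pattern, repl in subs:
            line = pattern.sub(repl, line)  # analogous to replace_regex
        result.append(line)
    return result


TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
expected = [
    "2026-01-15 10:00:01 job started",
    "DEBUG cache warm",
    "2026-01-15 10:00:05 job done",
]
actual = [
    "2026-01-16 09:30:00 job started",
    "2026-01-16 09:30:07 job done",
]

rules = dict(drop_patterns=[r"^DEBUG"], replacements=[(TIMESTAMP, "<ts>")])
print(preprocess(expected, **rules) == preprocess(actual, **rules))  # True
```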

Report Format

Every run produces a JSON report with a consistent structure:

Section	Description
`summary`	Aggregate counts (rows, mismatches, missing). Zero differences = exit code 0
`details`	Metadata: keys used, columns compared, filters applied, per-column mismatch stats
`samples`	Concrete examples of differences (tabular: `missing_in_target`, `missing_in_source`, `value_mismatches`, `excluded`; text: flat list or `samples_agg`)
`error`	Present only on exit code 2. Machine-readable `code`, human-readable `message`, and `details`
`warnings`	Optional list of warning strings (e.g. large line-number arrays in unordered mode)

CI Usage

rc=0
reconlify run recon.yaml --out report.json || rc=$?   # also safe under `set -e`

if [ $rc -eq 2 ]; then
  echo "ERROR: config or runtime failure" >&2
  exit 1
elif [ $rc -eq 1 ]; then
  echo "WARN: differences found — see report.json" >&2
fi

Exit code 1 means differences were found — your pipeline decides whether that's a warning or a failure. Exit code 2 is always an error.

GitHub Actions

- name: Reconcile data
  run: |
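    # GitHub's default `bash -e` shell stops at the first failing command,
    # so capture the exit code with `||` instead of reading $? afterwards.
    exit_code=0
    reconlify run recon.yaml --out report.json || exit_code=$?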
    if [ $exit_code -eq 2 ]; then
      echo "::error::Reconciliation failed with error"
      exit 1
    elif [ $exit_code -eq 1 ]; then
      echo "::warning::Differences found — see report.json"
    fi

- name: Upload report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: recon-report
    path: report.json

Documentation

User Guide — In-depth guide covering both engines and best practices
YAML Config Schema — Full reference for all configuration options
Report Schema — Complete specification of the JSON report format
Performance Testing — Benchmark methodology and baseline results

Reconlify Desktop

Reconlify Desktop is a graphical interface for Reconlify CLI. It allows users to:

Visually build YAML reconciliation configs
Run reconciliations without using the terminal
Inspect reconciliation reports interactively

Reconlify CLI remains the core reconciliation engine.

Development

make install       # install dependencies
make test          # unit + integration tests (excludes e2e and perf)
make e2e           # end-to-end CLI tests
make lint          # ruff linter
make format        # auto-fix lint + format
make clean         # remove build artifacts and caches

Performance Testing

make perf          # generate fixtures + run full benchmark suite
make perf-smoke    # lightweight perf smoke tests
make perf-clean    # remove generated fixtures

See PERF_TESTING.md for details and baseline results.

Changelog

See CHANGELOG.md for release history.

Reconlify sits between simple file diff tools and heavy enterprise reconciliation systems, providing a deterministic, developer-friendly workflow for validating structured data locally.

License


Release history

0.1.1 (this version): Mar 9, 2026
0.1.0: Mar 4, 2026

