Upload your Chart of Accounts. Get a production-ready financial hierarchy and dbt models. Zero config.

DataBridge Core

Your finance team just spent 4 hours on VLOOKUP. This takes 5 seconds.

DataBridge Core is a Python toolkit for data reconciliation, profiling, and ingestion. Compare CSV files, find fuzzy matches, detect schema drift, and clean messy data -- from the command line or Python.

pip install databridge-core  # v1.0.0

5-Second Demo

# Profile a file
databridge profile sales.csv

# Compare two sources -- find orphans, conflicts, match rate
databridge compare source.csv target.csv --keys id

# Fuzzy match names across systems
databridge fuzzy erp_accounts.csv gl_accounts.csv --column name --threshold 80
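
To make the idea of a fuzzy-match threshold concrete, here is a minimal sketch using only the standard library's difflib. It is not DataBridge's implementation (the fuzzy extra uses rapidfuzz, whose scores may differ); the helper names similarity and fuzzy_pairs and the sample account names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score, ignoring case and outer whitespace."""
    return 100 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_pairs(left, right, threshold=80):
    """For each name in `left`, keep its best match in `right` if it meets the threshold."""
    matches = []
    for name in left:
        best = max(right, key=lambda cand: similarity(name, cand), default=None)
        if best is not None and similarity(name, best) >= threshold:
            matches.append((name, best, round(similarity(name, best), 1)))
    return matches


# Hypothetical account names from two systems (illustrative data only).
erp = ["Accounts Receivable", "Office Supplies", "Payroll Expense"]
gl = ["Accounts Recievable", "Office Supply", "Depreciation"]

pairs = fuzzy_pairs(erp, gl, threshold=80)
for erp_name, gl_name, score in pairs:
    print(f"{erp_name!r} ~ {gl_name!r} (score {score})")
```

Raising the threshold trades recall for precision: fewer pairs are reported, but each is more likely to be a true match.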

Python API

from databridge_core import compare_hashes, profile_data, load_csv

# Profile your data
profile = profile_data("chart_of_accounts.csv")
print(f"{profile['rows']} rows, {profile['columns']} columns")
print(f"Potential keys: {profile['potential_key_columns']}")

# Compare two sources
result = compare_hashes("source.csv", "target.csv", key_columns="account_id")
stats = result["statistics"]
print(f"Match rate: {stats['match_rate_percent']}%")
print(f"Conflicts: {stats['conflicts']}, Orphans: {stats['total_orphans']}")
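
The terms orphan, conflict, and match rate can be illustrated with a small standard-library sketch. This is an assumption about what a hash comparison typically does, not the internals of compare_hashes; the function names, the sample data, and the match-rate formula (matched keys divided by all distinct keys) are hypothetical.

```python
import csv
import hashlib
import io


def row_hashes(csv_text, key):
    """Map each key value to a SHA-256 digest of the row's non-key columns."""
    hashes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        payload = "|".join(f"{col}={row[col]}" for col in sorted(row) if col != key)
        hashes[row[key]] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return hashes


def compare(source_text, target_text, key):
    """Classify keys as orphans (one side only) or conflicts (same key, different hash)."""
    src, tgt = row_hashes(source_text, key), row_hashes(target_text, key)
    shared = src.keys() & tgt.keys()
    conflicts = sorted(k for k in shared if src[k] != tgt[k])
    matched = len(shared) - len(conflicts)
    total = len(src.keys() | tgt.keys())
    return {
        "orphans_in_source": sorted(src.keys() - tgt.keys()),
        "orphans_in_target": sorted(tgt.keys() - src.keys()),
        "conflicts": conflicts,
        # Assumed definition: share of all distinct keys whose rows are identical.
        "match_rate_percent": round(100 * matched / total, 1) if total else 0.0,
    }


# Illustrative data only.
source = "account_id,name,balance\n1000,Cash,500\n1100,Receivables,250\n1200,Inventory,75\n"
target = "account_id,name,balance\n1000,Cash,500\n1100,Receivables,300\n1300,Prepaid,40\n"

report = compare(source, target, key="account_id")
print(report)
```

Hashing the non-key columns means two rows are compared with a single string comparison, regardless of how many columns they have.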

Commands

Command                                          Description
databridge profile <file>                        Profile data: structure, quality, cardinality
databridge compare <a> <b> --keys <col>          Hash comparison: orphans, conflicts, match rate
databridge fuzzy <a> <b> -c <col>                Fuzzy match columns across two files
databridge diff <a> <b>                          Text diff between two files
databridge drift <old> <new>                     Detect schema drift between CSVs
databridge transform <file> -c <col> --op upper  Clean a column (upper/lower/strip/trim/remove_special)
databridge merge <a> <b> --keys <col>            Merge two CSVs on key columns
databridge find "*.csv"                          Find files matching a pattern
databridge parse <text>                          Parse tabular data from messy text
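
As a rough illustration of what schema drift detection involves, the sketch below compares the header rows of two CSVs using only the standard library. It is a simplified assumption, not the behavior of the drift command, which may also check types or other properties; the function names and sample data are hypothetical.

```python
import csv
import io


def header(csv_text):
    """Return the first row of a CSV as a list of column names."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def schema_drift(old_text, new_text):
    """Report columns added, removed, or reordered between two CSV versions."""
    old_cols, new_cols = header(old_text), header(new_text)
    added = [c for c in new_cols if c not in old_cols]
    removed = [c for c in old_cols if c not in new_cols]
    # Compare the relative order of the columns both versions share.
    common_old = [c for c in old_cols if c in new_cols]
    common_new = [c for c in new_cols if c in old_cols]
    return {"added": added, "removed": removed, "reordered": common_old != common_new}


# Illustrative data only.
old_csv = "id,name,amount\n1,Cash,10\n"
new_csv = "id,amount,name,currency\n1,10,Cash,USD\n"

drift = schema_drift(old_csv, new_csv)
print(drift)
```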

Optional Extras

pip install 'databridge-core[fuzzy]'   # Fuzzy matching (rapidfuzz)
pip install 'databridge-core[pdf]'     # PDF text extraction (pypdf)
pip install 'databridge-core[ocr]'     # OCR image extraction (pytesseract)
pip install 'databridge-core[sql]'     # Database queries (sqlalchemy)
pip install 'databridge-core[all]'     # Everything
pip install 'databridge-core[dev]'     # Development tools (pytest, ruff, build)
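
To see which optional dependencies are present in the current environment, a short check like the following may help. It only tests whether the third-party modules named above can be imported; the EXTRA_MODULES mapping and the available_extras helper are hypothetical and not part of DataBridge Core.

```python
from importlib.util import find_spec

# Extra name -> top-level module of the library listed for that extra.
EXTRA_MODULES = {
    "fuzzy": "rapidfuzz",
    "pdf": "pypdf",
    "ocr": "pytesseract",
    "sql": "sqlalchemy",
}


def available_extras():
    """Return a dict of extra name -> True if its module is importable."""
    return {extra: find_spec(module) is not None for extra, module in EXTRA_MODULES.items()}


status = available_extras()
for extra, installed in status.items():
    print(f"{extra}: {'installed' if installed else 'missing'}")
```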

Built for Finance

DataBridge Core is the open-source foundation of DataBridge AI -- a full platform for financial hierarchy management, dbt model generation, and enterprise data reconciliation.

How it works: Upload your Chart of Accounts. Get a production-ready financial hierarchy and dbt models. Zero config.

Changelog

v1.0.0 (2026-02-24)

  • Initial public release on PyPI
  • 9 CLI commands: profile, compare, fuzzy, diff, drift, transform, merge, find, parse
  • 16 Python API functions
  • Supports Python 3.10 through 3.13

License

MIT

