A database-agnostic Python library for simple and fast UPSERT (Insert/Update/Delete) operations via SQLAlchemy.
DBMerge
DBMerge is a Python library that provides a simplified interface for performing UPSERT (Insert/Update/Delete) operations. Built on top of SQLAlchemy, it abstracts away the complexities of writing engine-specific SQL MERGE or ON CONFLICT statements.
Overview
DBMerge accepts multiple data sources as input:
- Pandas DataFrames
- Lists of dictionaries
- Other database tables or views
DBMerge automates the data-update process by comparing your source data against the target table and automatically performing the required operations:
- Insert new records that do not exist in the target table.
- Update existing records only if their values have changed.
- Delete (or mark) existing records in the target table that are no longer present in the source data.
To ensure optimal performance, the library loads your data into a temporary table first, and then executes bulk synchronization queries.
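The comparison step can be illustrated in plain Python. This is a simplified sketch of the merge semantics only — not DBMerge's actual implementation, which performs the equivalent comparison in bulk SQL against the staging table:

```python
def classify(source, target, key):
    """Split source vs. target rows into insert/update/delete buckets.

    Rows are matched on the key columns; a matched row counts as an
    update only if any of its values actually changed.
    """
    src = {tuple(r[k] for k in key): r for r in source}
    tgt = {tuple(r[k] for k in key): r for r in target}
    inserts = [r for k, r in src.items() if k not in tgt]
    updates = [r for k, r in src.items() if k in tgt and r != tgt[k]]
    deletes = [r for k, r in tgt.items() if k not in src]
    return inserts, updates, deletes

source = [{'id': 1, 'qty': 2}, {'id': 2, 'qty': 5}]
target = [{'id': 1, 'qty': 2}, {'id': 3, 'qty': 9}]
ins, upd, dele = classify(source, target, key=['id'])
# id=2 is new, id=1 is unchanged, id=3 is missing from the source
```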
Supported Databases
Tested and verified with:
- PostgreSQL
- MariaDB / MySQL
- SQLite
- MS SQL Server
Installation
```
pip install dbmerge
```
Quick Start Example
The library uses a context manager to handle database connections and ensure resources are safely released.
```python
from sqlalchemy import create_engine
from datetime import date
from dbmerge import dbmerge

# 1. Initialize the DB engine
engine = create_engine("sqlite://")

# 2. Prepare your source data
data = [
    {'Shop': '123', 'Product': 'A1', 'Date': date(2025, 1, 1), 'Qty': 2, 'Price': 50.10},
    {'Shop': '124', 'Product': 'A1', 'Date': date(2025, 1, 1), 'Qty': 1, 'Price': 100.50}
]

# 3. Execute the merge operation
# The table will be created automatically if it doesn't exist.
with dbmerge(engine=engine, data=data, table_name="Facts",
             key=['Shop', 'Product', 'Date']) as merge:
    merge.exec()
```
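For contrast, here is the kind of engine-specific statement DBMerge saves you from writing by hand. This sketch uses the stdlib `sqlite3` module with SQLite's `ON CONFLICT` clause purely to illustrate what is being abstracted — it is not the SQL DBMerge actually emits:

```python
import sqlite3

# In-memory database standing in for the real target
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Facts (
    Shop TEXT, Product TEXT, Date TEXT, Qty INTEGER, Price REAL,
    PRIMARY KEY (Shop, Product, Date))""")

rows = [('123', 'A1', '2025-01-01', 2, 50.10),
        ('124', 'A1', '2025-01-01', 1, 100.50)]

# SQLite-specific upsert: insert new keys, update existing ones in place.
# Other engines need different syntax (MERGE, ON DUPLICATE KEY UPDATE, ...),
# which is exactly the variation DBMerge hides.
conn.executemany("""
    INSERT INTO Facts (Shop, Product, Date, Qty, Price)
    VALUES (?, ?, ?, ?, ?)
    ON CONFLICT (Shop, Product, Date)
    DO UPDATE SET Qty = excluded.Qty, Price = excluded.Price
""", rows)
count = conn.execute("SELECT COUNT(*) FROM Facts").fetchone()[0]
```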
Key Features
- Database Agnostic: Write your synchronization logic once and run it across different SQL databases without modifying the code.
- High Performance: Uses temporary staging tables for fast bulk operations rather than slow row-by-row changes.
- Smart Deletion: Supports scoped deletion. You can pass a SQLAlchemy logical expression to delete missing data only within a specific timeframe or subset (e.g., updating only a single month).
- Auto-Schema Management: Automatically creates missing tables or columns in the database.
- Audit: Optional parameters to automatically add `merged_on` and `inserted_on` timestamps to track when rows were created or modified.
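In SQL terms, the scoped deletion feature boils down to restricting the delete to rows matching a predicate, so that data outside the scope (e.g., earlier months) is never touched. The following stdlib `sqlite3` sketch is illustrative only; with DBMerge you would pass a SQLAlchemy expression instead of writing this by hand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Facts (Shop TEXT, Date TEXT, Qty INT, PRIMARY KEY (Shop, Date));
INSERT INTO Facts VALUES ('123', '2025-01-05', 2), ('124', '2025-01-06', 1),
                         ('123', '2024-12-31', 7);
CREATE TEMP TABLE staging (Shop TEXT, Date TEXT, Qty INT);
INSERT INTO staging VALUES ('123', '2025-01-05', 2);  -- source covers January only
""")

# Delete rows missing from the source, but only inside the January scope,
# so the December row is left untouched.
conn.execute("""
    DELETE FROM Facts
    WHERE Date >= '2025-01-01' AND Date < '2025-02-01'
      AND NOT EXISTS (SELECT 1 FROM staging s
                      WHERE s.Shop = Facts.Shop AND s.Date = Facts.Date)
""")
remaining = conn.execute("SELECT Shop, Date FROM Facts ORDER BY Date").fetchall()
# The out-of-scope December row and the January row present in the source survive
```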
Benchmark
DBMerge handles the entire reconciliation process (staging, comparing, updating, inserting) with solid performance, scaling well even for larger datasets.
Here is a rough performance comparison for synchronizing data of different sizes using DBMerge (measured on a standard developer laptop):
| Database | DBMerge (100k rows) | DBMerge (1mil rows) |
|---|---|---|
| PostgreSQL | ~2.0s | ~19.8s |
| MySQL / MariaDB | ~1.0s | ~11.1s |
| SQLite | ~0.7s | ~7.6s |
| MS SQL Server* | ~22.4s | ~4m 23s |
\* Note: MS SQL Server bulk operations take longer due to inherent limitations in the `pyodbc` driver.
Database-Specific Notes & Limitations
- PostgreSQL:
  - Temporary tables are created as `UNLOGGED`.
  - The `JSONB` type is supported, but not `JSON` (as it cannot be compared to detect changes).
- MariaDB / MySQL:
  - Does not detect changes in uppercase vs. lowercase or in space padding by default (e.g., `'test' == ' Test'`). If this is important, you need to change the collation settings in your database.
  - The schema is treated the same as the database, but schema settings are still supported by this library.
  - Does not allow strings of unlimited size. You must explicitly define `data_types` if you want to create a table or column automatically (e.g., `data_types={'Your Field': String(100)}`).
- SQLite: Does not support schemas. If a schema setting is provided, it is automatically reset to `None` with a warning.
- MS SQL Server: Bulk insert operations may have lower performance due to specific `pyodbc` driver limitations.
- Oracle: Currently not supported (the `oracledb` module lacks support for `JOIN` operations in `UPDATE` statements).
- DuckDB: Currently not supported (due to a bug in `duckdb_engine` regarding table definition loading).