DeltaDB

DeltaDB is a lightweight, fast, and scalable database built on polars and deltalake. It streamlines data operations with upsert, delete, commit, and version control features.

Installation

To install DeltaDB, run the following command:

pip install deltadb

Key Features

  • Upsert: Efficiently insert or update records in your tables.
  • Delete: Remove specific records or entire tables.
  • Commit: Version control your data with commit functionality.
  • Query: Execute SQL queries and retrieve results as a list of dictionaries or as a polars DataFrame.
  • Schema Handling: Automatically manage schema changes during data operations.
  • Versioning: Revert tables to previous versions when needed.

Getting Started

Connecting to a Database

Establish a connection to your database:

from deltadb import delta

# connect to a database at the specified path
db = delta.connect(path="test.delta")

Upserting Data

Insert or update data within a table:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data={"id": 1, "name": "alice"}
)

# query the data
result = db.sql("select * from test_table")
print(result)  # output: [{'id': 1, 'name': 'alice'}]

# commit the changes
db.commit("test_table")

Upserting Multiple Records

Upsert multiple records in a single call. The records may carry different fields (only one of the records below has a "job" field), and the schema is managed automatically:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data=[
        {"id": 1, "name": "ali"},
        {"id": 2, "name": "bob", "job": "chef"},
        {"id": 3, "name": "sam"},
    ]
)
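To make the upsert semantics concrete, here is a minimal pure-Python sketch of the idea. It is not DeltaDB's implementation: the `upsert` helper below is hypothetical, and it assumes that a record with an existing primary key is merged into the stored row and that fields missing from a record are filled with `None`. The source does not state either behavior.

```python
from typing import Any


def upsert(rows: list[dict[str, Any]], primary_key: str, data) -> list[dict[str, Any]]:
    """Return a new row list with `data` inserted or merged by primary key.

    hypothetical helper for illustration only; not part of deltadb
    """
    if isinstance(data, dict):
        data = [data]  # accept a single record as well as a list

    # index existing rows by primary key (dicts keep insertion order)
    by_key = {row[primary_key]: dict(row) for row in rows}
    for record in data:
        key = record[primary_key]
        # assumption: new values overwrite old ones, other fields are kept
        by_key[key] = {**by_key.get(key, {}), **record}

    # build the union of all columns in first-seen order
    columns: list[str] = []
    for row in by_key.values():
        for col in row:
            if col not in columns:
                columns.append(col)

    # assumption: fields a row does not have are filled with None
    return [{col: row.get(col) for col in columns} for row in by_key.values()]


table = upsert([], "id", {"id": 1, "name": "alice"})
table = upsert(table, "id", [
    {"id": 1, "name": "ali"},
    {"id": 2, "name": "bob", "job": "chef"},
    {"id": 3, "name": "sam"},
])
for row in table:
    print(row)
```

In this sketch the row with id 1 is updated in place rather than duplicated, and the "job" column appears for every row once any record introduces it.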

Querying Data as a DataFrame

Execute SQL queries and return the results as a polars DataFrame for advanced data manipulation:

df_result = db.sql("select * from test_table", dtype="polars")
print(df_result)

Committing with Schema Differences

Force a commit even when there are schema differences between the current and new data:

db.commit("test_table", force=True)

Deleting Records

Remove specific records from a table using an SQL filter or a lambda function, or delete the entire table by omitting the filter:

# delete records using an sql filter
db.delete(table="test_table", filter="name='charles'")

# delete records using a lambda function
db.delete(table="test_table", filter=lambda row: row["name"] == "charles")

# delete the entire table
db.delete("test_table")
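The lambda form of the filter can be read as a row-level predicate: every row for which the function returns True is removed. The sketch below illustrates that reading in plain Python. The `delete_where` helper is hypothetical and is not how DeltaDB evaluates filters internally.

```python
from typing import Any, Callable

Row = dict[str, Any]


def delete_where(rows: list[Row], predicate: Callable[[Row], bool]) -> list[Row]:
    """Return the rows for which the predicate is false (hypothetical helper)."""
    return [row for row in rows if not predicate(row)]


rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "charles"},
    {"id": 3, "name": "sam"},
]

# same predicate as the lambda filter shown above
remaining = delete_where(rows, lambda row: row["name"] == "charles")
print(remaining)
```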

Checking Out Previous Table Versions

Revert a table to a previous version:

db.checkout(table="test_table", version=0)
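As a mental model for how commit and checkout relate, the toy class below stores a full snapshot on every commit and restores one on checkout. It is a conceptual sketch only: the `VersionedTable` class is hypothetical, it assumes version numbers start at 0 with the first commit, and a real Delta table records changes in a transaction log rather than copying all rows.

```python
import copy


class VersionedTable:
    """Toy model: each commit stores a full snapshot of the rows."""

    def __init__(self):
        self.rows = []      # working (uncommitted) state
        self.versions = []  # committed snapshots; list index = version number

    def commit(self) -> int:
        """Save the current rows as a new version and return its number."""
        self.versions.append(copy.deepcopy(self.rows))
        return len(self.versions) - 1

    def checkout(self, version: int) -> None:
        """Replace the working rows with the snapshot of the given version."""
        self.rows = copy.deepcopy(self.versions[version])


table = VersionedTable()
table.rows.append({"id": 1, "name": "alice"})
v0 = table.commit()  # first commit, version 0 under the stated assumption

table.rows.append({"id": 2, "name": "bob"})
v1 = table.commit()

table.checkout(version=0)  # back to the state of the first commit
print(table.rows)
```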

Performance

In the benchmarks described below, DeltaDB inserted data faster than SQLite. The benchmarks compare the average elapsed time of the same operations in DeltaDB and SQLite.

Benchmark Setup

The benchmarks ran 100 iterations of data insertion into tables of varying sizes, including a table with a width of 1,000 and a height of 10,000. The time taken was measured for each iteration, and the average elapsed time was then calculated for both DeltaDB and SQLite.
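The measurement procedure can be sketched with the standard library alone. The script below is not the benchmark used for the results in this README: the function names, the in-memory database, the integer test data, and the much smaller table dimensions are all assumptions chosen so that the example runs quickly. It covers only the SQLite side and shows how an average elapsed time over several iterations can be computed.

```python
import sqlite3
import time
from statistics import mean


def make_rows(width: int, height: int) -> list[tuple[int, ...]]:
    """Build `height` rows of `width` integers (synthetic test data)."""
    return [tuple(r * width + c for c in range(width)) for r in range(height)]


def time_sqlite_insert(width: int, height: int) -> float:
    """Time one bulk insert into a fresh in-memory SQLite table, in seconds."""
    rows = make_rows(width, height)
    columns = ", ".join(f"c{i} INTEGER" for i in range(width))
    placeholders = ", ".join("?" for _ in range(width))

    conn = sqlite3.connect(":memory:")  # assumption: in-memory database
    try:
        conn.execute(f"CREATE TABLE bench ({columns})")
        start = time.perf_counter()
        conn.executemany(f"INSERT INTO bench VALUES ({placeholders})", rows)
        conn.commit()
        return time.perf_counter() - start
    finally:
        conn.close()


def average_elapsed(iterations: int, width: int, height: int) -> float:
    """Average the elapsed time over several independent iterations."""
    return mean(time_sqlite_insert(width, height) for _ in range(iterations))


# small dimensions for a quick run; the README benchmark used 100 iterations
avg = average_elapsed(iterations=5, width=10, height=1_000)
print(f"average elapsed: {avg:.6f} s")
```

Timing only the insert and commit, not the data generation or table creation, keeps the measurement focused on the operation being compared.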

Results

  • DeltaDB demonstrated a significant performance advantage, especially as the data size increased. For a table with a width of 1,000 and a height of 10,000, DeltaDB completed the operations in an average time of 1.03 seconds over 100 runs.
  • In contrast, SQLite took an average of 8.06 seconds for the same operations and data size, so DeltaDB needed approximately 87.22% less time (roughly 7.8 times faster).
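The percentage quoted above follows directly from the two reported averages. The short calculation below shows both ways of expressing the gap: as a reduction in elapsed time and as a speedup factor. The variable names are illustrative.

```python
# average elapsed times reported above, in seconds
deltadb_avg = 1.03
sqlite_avg = 8.06

# share of SQLite's time that DeltaDB saves
time_reduction = (sqlite_avg - deltadb_avg) / sqlite_avg * 100

# how many times faster DeltaDB is
speedup = sqlite_avg / deltadb_avg

print(f"time reduction: {time_reduction:.2f}%")  # time reduction: 87.22%
print(f"speedup factor: {speedup:.2f}x")         # speedup factor: 7.83x
```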

These results indicate that DeltaDB handles large inserts efficiently, which makes it a strong choice for data-intensive applications.
