DeltaDB

DeltaDB is a lightweight, fast, and scalable database built on polars and deltalake. It streamlines common data operations (upsert, delete, commit, and version control) while drawing on the performance of both libraries.

Installation

To install DeltaDB, run the following command:

pip install deltadb

Key Features

  • Upsert: Efficiently insert or update records in your tables.
  • Delete: Remove specific records or entire tables.
  • Commit: Version control your data with commit functionality.
  • Query: Execute SQL queries and retrieve results as dictionaries or dataframes.
  • Schema Handling: Automatically manage schema changes during data operations.
  • Versioning: Revert tables to previous versions when needed.

Getting Started

Connecting to a Database

Establish a connection to your database:

from deltadb import delta

# connect to a database at the specified path
db = delta.connect(path="test.delta")

Upserting Data

Insert or update data within a table:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data={"id": 1, "name": "alice"}
)

# query the data
result = db.sql("select * from test_table")
print(result)  # output: [{'id': 1, 'name': 'alice'}]

# commit the changes
db.commit("test_table")

Upserting Multiple Records

Upsert several records in one call. Records may introduce new columns (such as "job" below), and the schema is managed automatically:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data=[
        {"id": 1, "name": "ali"},
        {"id": 2, "name": "bob", "job": "chef"},
        {"id": 3, "name": "sam"},
    ]
)
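
To make the upsert semantics concrete, here is a minimal pure-Python model of an upsert keyed on a primary key. It is only an illustration and not DeltaDB's implementation: the helper name `upsert_rows` is hypothetical, and merging fields into an existing row and filling missing columns with `None` are assumptions about how schema differences might be reconciled.

```python
def upsert_rows(existing, incoming, primary_key):
    """Toy upsert: update rows whose key already exists, insert the rest.

    Assumption: columns missing from a row are filled with None so that
    every row ends up with the same set of columns.
    """
    # Index the current rows by primary key (dicts keep insertion order).
    merged = {row[primary_key]: dict(row) for row in existing}

    # Update matching rows in place, or add new ones.
    for row in incoming:
        merged.setdefault(row[primary_key], {}).update(row)

    # Collect the union of all column names, in first-seen order.
    columns = []
    for row in merged.values():
        for name in row:
            if name not in columns:
                columns.append(name)

    # Normalise every row to the full column set.
    return [{name: row.get(name) for name in columns} for row in merged.values()]


table = [{"id": 1, "name": "alice"}]
table = upsert_rows(
    table,
    [
        {"id": 1, "name": "ali"},
        {"id": 2, "name": "bob", "job": "chef"},
        {"id": 3, "name": "sam"},
    ],
    primary_key="id",
)
print(table)
```

In this model the row with id 1 is updated rather than duplicated, and the new "job" column appears on every row.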

Querying Data as a DataFrame

Execute SQL queries and return the results as a polars DataFrame for advanced data manipulation:

df_result = db.sql("select * from test_table", dtype="polars")
print(df_result)

Committing with Schema Differences

Force a commit even when there are schema differences between the current and new data:

db.commit("test_table", force=True)

Deleting Records

Remove specific records from a table using an SQL filter or a lambda function, or delete the entire table:

# delete records using an sql filter
db.delete(table="test_table", filter="name='charles'")

# delete records using a lambda function
db.delete(table="test_table", filter=lambda row: row["name"] == "charles")

# delete the entire table
db.delete("test_table")
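
The two filter styles describe the same thing: a condition that selects the rows to remove. The sketch below shows that equivalence using only the Python standard library, with `sqlite3` standing in for the SQL side. It does not call DeltaDB, and `delete_where` is a hypothetical helper introduced for illustration.

```python
import sqlite3

rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "charles"},
    {"id": 3, "name": "sam"},
]


def delete_where(rows, predicate):
    """Return the rows that survive, i.e. those the predicate does not match."""
    return [row for row in rows if not predicate(row)]


# Lambda-style filter: the predicate is evaluated row by row.
kept_by_lambda = delete_where(rows, lambda row: row["name"] == "charles")

# SQL-style filter: the same condition written as a WHERE clause.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO test_table VALUES (:id, :name)", rows)
conn.execute("DELETE FROM test_table WHERE name='charles'")
kept_by_sql = [
    {"id": row_id, "name": name}
    for row_id, name in conn.execute("SELECT id, name FROM test_table ORDER BY id")
]
conn.close()

print(kept_by_lambda == kept_by_sql)
```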

Checking Out Previous Table Versions

Revert a table to a previous version:

db.checkout(table="test_table", version=0)
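
As a mental model for how commit and checkout relate, the following toy class keeps one snapshot per commit and numbers them from 0. It is a conceptual sketch only: the class `VersionedTable` is hypothetical, and storing full in-memory copies is an assumption made for brevity, not a description of how DeltaDB or deltalake store versions.

```python
import copy


class VersionedTable:
    """Toy table whose commits are numbered snapshots, starting at 0."""

    def __init__(self):
        self.rows = []      # working state, not yet committed
        self.versions = []  # committed snapshots; list index == version number

    def commit(self):
        """Snapshot the working state and return its version number."""
        self.versions.append(copy.deepcopy(self.rows))
        return len(self.versions) - 1

    def checkout(self, version):
        """Replace the working state with a copy of an earlier snapshot."""
        self.rows = copy.deepcopy(self.versions[version])


table = VersionedTable()
table.rows.append({"id": 1, "name": "alice"})
v0 = table.commit()

table.rows.append({"id": 2, "name": "bob"})
v1 = table.commit()

# Go back to the first committed state.
table.checkout(version=0)
print(v0, v1, table.rows)
```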

Performance

In the benchmarks described here, DeltaDB was faster than SQLite, a traditional embedded database. The benchmarks compare the average elapsed time for the same operations in DeltaDB and SQLite.

Benchmark Setup

The benchmarks ran 100 iterations of data insertion to simulate realistic data loads. The results reported here are for a table with a width of 1,000 and a height of 10,000. Each iteration was timed, and the average elapsed time was calculated for both DeltaDB and SQLite.
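
The original benchmark script is not included here, so the following is only a sketch of the timing approach described above: run an operation repeatedly and average the elapsed time. The helper names (`average_elapsed`, `make_rows`, `sqlite_insert`) are hypothetical, only the SQLite side is shown, and the table is deliberately much smaller than the benchmarked one so the example runs quickly.

```python
import sqlite3
import time
from statistics import mean


def average_elapsed(operation, iterations=100):
    """Call `operation` repeatedly and return the mean wall-clock seconds."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return mean(timings)


def make_rows(width, height):
    """Build `height` rows, each a tuple of `width` integers."""
    return [tuple(range(r, r + width)) for r in range(height)]


def sqlite_insert(rows, width):
    """Insert all rows into a fresh in-memory SQLite table; return the row count."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"c{i} INTEGER" for i in range(width))
    placeholders = ", ".join("?" for _ in range(width))
    conn.execute(f"CREATE TABLE bench ({columns})")
    conn.executemany(f"INSERT INTO bench VALUES ({placeholders})", rows)
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM bench").fetchone()[0]
    conn.close()
    return count


# Small sizes chosen for illustration, not the sizes used in the benchmark.
WIDTH, HEIGHT = 10, 1_000
rows = make_rows(WIDTH, HEIGHT)
avg_seconds = average_elapsed(lambda: sqlite_insert(rows, WIDTH), iterations=5)
print(f"SQLite average insert time: {avg_seconds:.4f} s")
```

The same `average_elapsed` helper could wrap a DeltaDB upsert to produce a comparable figure.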

Results

  • DeltaDB demonstrated a significant performance advantage, especially as the data size increased. For a table with a width of 1,000 and a height of 10,000, DeltaDB averaged 1.03 seconds per run over 100 runs.
  • In contrast, SQLite averaged 8.06 seconds for the same operations and data size, so DeltaDB took approximately 87.22% less time (about 7.8 times faster).

These results suggest that DeltaDB handles large insert workloads efficiently, which makes it a strong candidate for data-intensive applications.
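
The relationship between the two averages can be checked with a few lines of arithmetic. The snippet uses only the reported figures (1.03 s and 8.06 s); the variable names are illustrative.

```python
deltadb_seconds = 1.03  # reported DeltaDB average
sqlite_seconds = 8.06   # reported SQLite average

# Share of SQLite's elapsed time that DeltaDB saves.
time_reduction_pct = (sqlite_seconds - deltadb_seconds) / sqlite_seconds * 100

# How many times faster DeltaDB is, as a ratio of elapsed times.
speedup_factor = sqlite_seconds / deltadb_seconds

print(f"{time_reduction_pct:.2f}% less time")  # 87.22% less time
print(f"{speedup_factor:.2f}x faster")         # 7.83x faster
```

The 87.22% figure is therefore a reduction in elapsed time, which corresponds to a speedup of roughly 7.8 times.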
