DeltaDB

DeltaDB is a lightweight, fast, and scalable database built on polars and deltalake. It streamlines common data operations (upsert, delete, commit, and table versioning) while relying on polars and deltalake for performance.

Installation

To install DeltaDB, run the following command:

pip install deltadb

Key Features

  • Upsert: Efficiently insert or update records in your tables.
  • Delete: Remove specific records or entire tables.
  • Commit: Version control your data with commit functionality.
  • Query: Execute SQL queries and retrieve results as dictionaries or dataframes.
  • Schema Handling: Automatically manage schema changes during data operations.
  • Versioning: Revert tables to previous versions when needed.

Getting Started

Connecting to a Database

Establish a connection to your database:

from deltadb import delta

# connect to a database at the specified path
db = delta.connect(path="test.delta")

Upserting Data

Insert or update data within a table:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data={"id": 1, "name": "alice"}
)

# query the data
result = db.sql("select * from test_table")
print(result)  # output: [{'id': 1, 'name': 'alice'}]

# commit the changes
db.commit("test_table")

Upserting Multiple Records

Upsert multiple records simultaneously, with automatic schema management:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data=[
        {"id": 1, "name": "ali"},
        {"id": 2, "name": "bob", "job": "chef"},
        {"id": 3, "name": "sam"},
    ]
)
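
The effect of an upsert keyed on a primary key can be modeled in plain Python. The sketch below is a conceptual illustration only, not DeltaDB's implementation: the helper name `upsert_rows` is hypothetical, and filling missing fields with `None` when a new column such as `job` appears is an assumption about what "automatic schema management" means here.

```python
def upsert_rows(table, rows, primary_key):
    """Insert or update rows by primary key and return the merged rows."""
    # Index the existing rows by their primary key value.
    by_key = {row[primary_key]: dict(row) for row in table}
    for row in rows:
        # Update the existing row if the key is present, otherwise insert it.
        by_key.setdefault(row[primary_key], {}).update(row)

    # Build the union of all column names, in first-seen order.
    columns = []
    for row in by_key.values():
        for name in row:
            if name not in columns:
                columns.append(name)

    # Assumption: fields a row does not provide are filled with None.
    return [{name: row.get(name) for name in columns} for row in by_key.values()]


existing = [{"id": 1, "name": "alice"}]
incoming = [
    {"id": 1, "name": "ali"},
    {"id": 2, "name": "bob", "job": "chef"},
    {"id": 3, "name": "sam"},
]
merged = upsert_rows(existing, incoming, "id")
for row in merged:
    print(row)
# {'id': 1, 'name': 'ali', 'job': None}
# {'id': 2, 'name': 'bob', 'job': 'chef'}
# {'id': 3, 'name': 'sam', 'job': None}
```

In this model the record with `id` 1 is updated in place, the other two are inserted, and the `job` column is added for every row.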

Querying Data as a DataFrame

Execute SQL queries and return the results as a polars DataFrame for advanced data manipulation:

df_result = db.sql("select * from test_table", dtype="polars")
print(df_result)

Committing with Schema Differences

Force a commit even when there are schema differences between the current and new data:

db.commit("test_table", force=True)

Deleting Records

Remove specific records from a table with an SQL filter or a lambda function, or delete the entire table:

# delete records using an sql filter
db.delete(table="test_table", filter="name='charles'")

# delete records using a lambda function
db.delete(table="test_table", filter=lambda row: row["name"] == "charles")

# delete the entire table
db.delete("test_table")

Checking Out Previous Table Versions

Revert to a previous version of a table with ease:

db.checkout(table="test_table", version=0)

Performance

DeltaDB performs well compared to traditional databases like SQLite. A series of benchmarks was conducted to compare the average elapsed time for the same operations in DeltaDB and SQLite.

Benchmark Setup

The benchmarks ran 100 iterations of data insertion into tables of varying sizes to simulate realistic data loads; the configuration reported here is a table with a width of 1,000 and a height of 10,000. The time taken by each iteration was measured, and the average elapsed time was calculated for both DeltaDB and SQLite.
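
A timing harness of this kind can be sketched with the standard library alone. This is not the benchmark script used for the numbers in this section: the helper names (`average_elapsed`, `make_rows`, `sqlite_insert`), the in-memory SQLite database, the reading of "width" as the column count, and the small table size are all assumptions chosen to keep the example quick to run.

```python
import sqlite3
import time


def average_elapsed(operation, iterations):
    """Call operation() `iterations` times and return the mean wall-clock seconds."""
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        total += time.perf_counter() - start
    return total / iterations


def make_rows(width, height):
    """Build `height` records, each with `width` integer columns c0, c1, ..."""
    return [{f"c{j}": i * width + j for j in range(width)} for i in range(height)]


def sqlite_insert(rows):
    """Insert all rows into a fresh in-memory SQLite table and return the row count."""
    columns = list(rows[0])
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"create table t ({', '.join(columns)})")
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            f"insert into t values ({placeholders})",
            [tuple(row[c] for c in columns) for row in rows],
        )
        conn.commit()
        return conn.execute("select count(*) from t").fetchone()[0]
    finally:
        conn.close()


# Small sizes for illustration; the reported run used a width of 1,000,
# a height of 10,000, and 100 iterations.
rows = make_rows(width=10, height=100)
sqlite_avg = average_elapsed(lambda: sqlite_insert(rows), iterations=5)
print(f"sqlite: {sqlite_avg:.6f} s per run")
```

The DeltaDB side can be timed by passing a callable that wraps the `db.upsert` call from Getting Started to the same `average_elapsed` helper, so both systems are measured in the same way.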

Results

  • DeltaDB showed a significant performance advantage, especially as the data size increased. For a table with a width of 1,000 and a height of 10,000, DeltaDB completed the operations in an average of 1.03 seconds over 100 runs.
  • SQLite took an average of 8.06 seconds for the same operations and data size, so DeltaDB needed about 87.22% less time (roughly 7.8 times faster).
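
The percentage quoted above is the reduction in elapsed time relative to SQLite. The short calculation below reproduces it from the two reported averages and also derives the equivalent speedup ratio; the variable names are illustrative.

```python
deltadb_seconds = 1.03  # average reported for DeltaDB
sqlite_seconds = 8.06   # average reported for SQLite

# Share of SQLite's elapsed time that DeltaDB saves.
time_reduction = (sqlite_seconds - deltadb_seconds) / sqlite_seconds * 100

# How many times faster DeltaDB is than SQLite.
speedup = sqlite_seconds / deltadb_seconds

print(f"time reduction: {time_reduction:.2f}%")  # time reduction: 87.22%
print(f"speedup: {speedup:.2f}x")                # speedup: 7.83x
```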

These results suggest that DeltaDB handles bulk inserts into large tables efficiently, which makes it a strong choice for data-intensive applications.
