DeltaDB

DeltaDB is a lightweight, fast, and scalable database built on polars and deltalake. It streamlines data operations with upsert, delete, commit, and version control, and draws its performance from those two libraries.

Installation

To install DeltaDB, run the following command:

pip install deltadb

Key Features

  • Upsert: Efficiently insert or update records in your tables.
  • Delete: Remove specific records or entire tables.
  • Commit: Version control your data with commit functionality.
  • Query: Execute SQL queries and retrieve results as dictionaries or dataframes.
  • Schema Handling: Automatically manage schema changes during data operations.
  • Versioning: Revert tables to previous versions when needed.

Getting Started

Connecting to a Database

Establish a connection to your database:

from deltadb import delta

# connect to a database at the specified path
db = delta.connect(path="test.delta")

Upserting Data

Insert or update data within a table:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data={"id": 1, "name": "alice"}
)

# query the data
result = db.sql("select * from test_table")
print(result)  # output: [{'id': 1, 'name': 'alice'}]

# commit the changes
db.commit("test_table")

Upserting Multiple Records

Upsert multiple records in a single call. Records need not share the same fields (the second record below adds a job field); schema differences are managed automatically:

db.upsert(
    table="test_table", 
    primary_key="id", 
    data=[
        {"id": 1, "name": "ali"},
        {"id": 2, "name": "bob", "job": "chef"},
        {"id": 3, "name": "sam"},
    ]
)
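
To make the upsert semantics concrete, here is a minimal pure-Python sketch of what upserting by primary key means for the records above. It is an illustration only, not DeltaDB's implementation: the helper `upsert_rows` is hypothetical, and filling absent fields with `None` is an assumption about how differing schemas might be reconciled.

```python
def upsert_rows(existing, incoming, primary_key):
    """Hypothetical helper: merge incoming rows into existing rows by key."""
    # index the current rows by their primary key
    merged = {row[primary_key]: dict(row) for row in existing}
    for row in incoming:
        # update the matching row, or insert a new one if the key is unseen
        merged.setdefault(row[primary_key], {}).update(row)
    # collect every column seen, in first-seen order
    columns = []
    for row in merged.values():
        for col in row:
            if col not in columns:
                columns.append(col)
    # assumption: fields missing from a row are filled with None
    return [{col: row.get(col) for col in columns} for row in merged.values()]


table = [{"id": 1, "name": "alice"}]
table = upsert_rows(
    table,
    [
        {"id": 1, "name": "ali"},
        {"id": 2, "name": "bob", "job": "chef"},
        {"id": 3, "name": "sam"},
    ],
    primary_key="id",
)
print(table)
```

In this sketch the row with id 1 is updated rather than duplicated, and the new job column appears on every row.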

Querying Data as a DataFrame

Execute SQL queries and return the results as a polars DataFrame for advanced data manipulation:

df_result = db.sql("select * from test_table", dtype="polars")
print(df_result)

Committing with Schema Differences

Force a commit even when there are schema differences between the current and new data:

db.commit("test_table", force=True)

Deleting Records

Remove specific records from a table using an SQL filter or a lambda function, or delete the entire table:

# delete records using an sql filter
db.delete(table="test_table", filter="name='charles'")

# delete records using a lambda function
db.delete(table="test_table", filter=lambda row: row["name"] == "charles")

# delete the entire table
db.delete("test_table")
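
The lambda form can be read as a row predicate: rows for which it returns True are removed. The following pure-Python sketch shows that reading on made-up sample rows. It does not call DeltaDB, and the function name `matches_filter` is hypothetical.

```python
# sample rows invented for illustration only
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "charles"},
]


def matches_filter(row):
    """Same condition as the lambda filter above."""
    return row["name"] == "charles"


# deleting keeps only the rows the predicate does not match
remaining = [row for row in rows if not matches_filter(row)]
print(remaining)  # [{'id': 1, 'name': 'alice'}]
```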

Checking Out Previous Table Versions

Revert to a previous version of a table with ease:

db.checkout(table="test_table", version=0)

Performance

In benchmarks comparing the average elapsed time for the same operations, DeltaDB outperformed SQLite.

Benchmark Setup

The benchmark ran 100 iterations of data insertion into a table with a width of 1,000 and a height of 10,000, to simulate a realistic data load. The time taken by each iteration was measured, and the average elapsed time was calculated for both DeltaDB and SQLite.
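
The original benchmark script is not shown here, so the following is only a sketch of the procedure described above: time an operation over several iterations and average the results. The names `average_elapsed` and `insert_into_sqlite` are hypothetical, the table dimensions are deliberately much smaller than in the benchmark, and only the SQLite side is shown, using Python's standard library.

```python
import sqlite3
import time


def average_elapsed(operation, iterations):
    """Return the mean wall-clock time of calling operation() `iterations` times."""
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        total += time.perf_counter() - start
    return total / iterations


# illustrative dimensions, far smaller than the 1,000 x 10,000 benchmark
WIDTH, HEIGHT = 10, 100
columns = [f"c{i}" for i in range(WIDTH)]
rows = [tuple(range(WIDTH)) for _ in range(HEIGHT)]


def insert_into_sqlite():
    # fresh in-memory database per run so iterations do not affect each other
    conn = sqlite3.connect(":memory:")
    conn.execute(f"create table t ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"insert into t values ({placeholders})", rows)
    conn.commit()
    conn.close()


sqlite_avg = average_elapsed(insert_into_sqlite, iterations=5)
print(f"sqlite average: {sqlite_avg:.6f} s")
```

Passing a function that performs the equivalent `db.upsert` and `db.commit` calls to the same helper would give the DeltaDB side of the comparison. Note that this sketch includes connection setup in the timed region, which the original benchmark may or may not have done.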

Results

  • DeltaDB demonstrated a significant performance advantage, especially as the data size increased. For a table with a width of 1,000 and a height of 10,000, DeltaDB completed the operations in an average of 1.03 seconds over 100 runs.
  • In contrast, SQLite took an average of 8.06 seconds for the same operations and data size. DeltaDB therefore needed approximately 87.22% less time, which makes it roughly 7.8 times faster.
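
The 87.22% figure follows from the two averages as a reduction in elapsed time relative to SQLite. A short calculation, using only the numbers reported above, shows both that reduction and the equivalent speedup factor:

```python
deltadb_avg = 1.03  # seconds, from the results above
sqlite_avg = 8.06   # seconds, from the results above

# share of SQLite's elapsed time that DeltaDB saves
time_reduction = (sqlite_avg - deltadb_avg) / sqlite_avg * 100
# how many times faster DeltaDB is
speedup = sqlite_avg / deltadb_avg

print(f"{time_reduction:.2f}% less time")  # 87.22% less time
print(f"{speedup:.1f}x speedup")           # 7.8x speedup
```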

These results suggest that DeltaDB handles large insert workloads efficiently, making it a strong candidate for data-intensive applications.
