DeltaDB
DeltaDB is a lightweight, fast, and scalable database built on polars and deltalake. It streamlines data operations with upsert, delete, commit, and version control features while drawing on the performance of both libraries.
Installation
To install DeltaDB, run the following command:
pip install deltadb
Key Features
- Upsert: Efficiently insert or update records in your tables.
- Delete: Remove specific records or entire tables.
- Commit: Version control your data with commit functionality.
- Query: Execute SQL queries and retrieve results as dictionaries or dataframes.
- Schema Handling: Automatically manage schema changes during data operations.
- Versioning: Revert tables to previous versions when needed.
Getting Started
Connecting to a Database
Establish a connection to your database:
from deltadb import delta
# connect to a database at the specified path
db = delta.connect(path="test.delta")
Upserting Data
Insert or update data within a table:
db.upsert(
    table="test_table",
    primary_key="id",
    data={"id": 1, "name": "alice"}
)
# query the data
result = db.sql("select * from test_table")
print(result) # output: [{'id': 1, 'name': 'alice'}]
# commit the changes
db.commit("test_table")
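To make the insert-or-update behaviour concrete, here is a minimal pure-Python sketch of upsert semantics on a list of dictionaries. It is an illustration only, not DeltaDB's implementation: the `upsert` helper below is hypothetical, and merging new fields into an existing row (rather than replacing the row) is an assumption.

```python
def upsert(rows, primary_key, data):
    """Insert or update records in a list of dict rows (hypothetical helper)."""
    # Accept a single record or a list of records, as the examples above do.
    records = [data] if isinstance(data, dict) else list(data)
    # Map each existing primary-key value to its position for fast lookup.
    index = {row[primary_key]: i for i, row in enumerate(rows)}
    for record in records:
        key = record[primary_key]
        if key in index:
            # Assumption: an update merges new fields into the existing row.
            rows[index[key]] = {**rows[index[key]], **record}
        else:
            index[key] = len(rows)
            rows.append(dict(record))
    return rows


table = []
upsert(table, "id", {"id": 1, "name": "alice"})
upsert(table, "id", [{"id": 1, "name": "ali"}, {"id": 2, "name": "bob"}])
print(table)  # [{'id': 1, 'name': 'ali'}, {'id': 2, 'name': 'bob'}]
```

The row with `id` 1 is updated in place, while the row with `id` 2 is appended because its key was not yet present.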
Upserting Multiple Records
Upsert multiple records simultaneously, with automatic schema management:
db.upsert(
    table="test_table",
    primary_key="id",
    data=[
        {"id": 1, "name": "ali"},
        {"id": 2, "name": "bob", "job": "chef"},
        {"id": 3, "name": "sam"},
    ]
)
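In the records above, only one record carries a `job` field. One plausible way to picture automatic schema management is to take the union of all keys and fill the gaps with `None`. The sketch below assumes that behaviour; the `unify_schema` helper is hypothetical and is not part of DeltaDB.

```python
def unify_schema(records):
    """Give every record the same columns, filling missing values with None."""
    columns = []
    for record in records:
        for name in record:
            if name not in columns:
                # Preserve the order in which columns first appear.
                columns.append(name)
    return [{name: record.get(name) for name in columns} for record in records]


unified = unify_schema([
    {"id": 1, "name": "ali"},
    {"id": 2, "name": "bob", "job": "chef"},
    {"id": 3, "name": "sam"},
])
for row in unified:
    print(row)
# {'id': 1, 'name': 'ali', 'job': None}
# {'id': 2, 'name': 'bob', 'job': 'chef'}
# {'id': 3, 'name': 'sam', 'job': None}
```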
Querying Data as a DataFrame
Execute SQL queries and return the results as a polars DataFrame for advanced data manipulation:
df_result = db.sql("select * from test_table", dtype="polars")
print(df_result)
Committing with Schema Differences
Force a commit even when there are schema differences between the current and new data:
db.commit("test_table", force=True)
Deleting Records
Remove specific records from a table using SQL or lambda functions, or delete the entire table:
# delete records using an sql filter
db.delete(table="test_table", filter="name='charles'")
# delete records using a lambda function
db.delete(table="test_table", filter=lambda row: row["name"] == "charles")
# delete the entire table
db.delete("test_table")
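The lambda form of the filter can be read as a predicate applied to each row: rows for which it returns true are removed. A minimal sketch of that idea on plain dictionaries follows. The `delete_rows` helper is hypothetical and only illustrates the semantics, not how DeltaDB evaluates filters internally.

```python
def delete_rows(rows, predicate):
    """Return the rows for which the predicate is false (hypothetical helper)."""
    return [row for row in rows if not predicate(row)]


rows = [
    {"id": 1, "name": "ali"},
    {"id": 2, "name": "charles"},
    {"id": 3, "name": "sam"},
]
remaining = delete_rows(rows, lambda row: row["name"] == "charles")
print(remaining)  # [{'id': 1, 'name': 'ali'}, {'id': 3, 'name': 'sam'}]
```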
Checking Out Previous Table Versions
Revert to a previous version of a table with ease:
db.checkout(table="test_table", version=0)
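As a mental model for commit and checkout, the sketch below keeps a snapshot of the table per commit and restores one on checkout. It is a simplified illustration under stated assumptions: the `VersionedTable` class is hypothetical, version numbers are assumed to start at 0 and grow by one per commit, and a real engine would not store full copies of the data.

```python
import copy


class VersionedTable:
    """Toy table that stores one full snapshot per commit (illustration only)."""

    def __init__(self):
        self.rows = []
        self.versions = []

    def commit(self):
        # Save a deep copy so later edits do not alter the stored version.
        self.versions.append(copy.deepcopy(self.rows))
        return len(self.versions) - 1

    def checkout(self, version):
        # Restore the working rows from the chosen snapshot.
        self.rows = copy.deepcopy(self.versions[version])


table = VersionedTable()
table.rows.append({"id": 1, "name": "alice"})
first = table.commit()   # version 0
table.rows[0]["name"] = "ali"
second = table.commit()  # version 1
table.checkout(version=0)
print(table.rows)  # [{'id': 1, 'name': 'alice'}]
```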
Performance
DeltaDB performs well against traditional databases such as SQLite. A series of benchmarks compared the average elapsed time for the same operations in DeltaDB and SQLite.
Benchmark Setup
The benchmarks ran 100 iterations of data insertion into tables of varying sizes to simulate realistic data loads; the figures reported here are for a table with a width of 1,000 and a height of 10,000. Each iteration was timed, and the average elapsed time was calculated for both DeltaDB and SQLite.
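The original benchmark script is not shown, so the following is only a sketch of how such a measurement could be set up for the SQLite side, using the standard-library `sqlite3` and `time` modules. The function names, the in-memory database, the integer test data, and the small default sizes are all assumptions chosen so the example runs quickly.

```python
import sqlite3
import time
from statistics import mean


def time_sqlite_insert(width, height):
    """Time one bulk insert of a width x height table into in-memory SQLite."""
    columns = [f"c{i}" for i in range(width)]
    rows = [tuple(range(width)) for _ in range(height)]
    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f"{name} integer" for name in columns)
    conn.execute(f"create table t ({column_defs})")
    placeholders = ", ".join("?" for _ in columns)
    start = time.perf_counter()
    conn.executemany(f"insert into t values ({placeholders})", rows)
    conn.commit()
    elapsed = time.perf_counter() - start
    count = conn.execute("select count(*) from t").fetchone()[0]
    conn.close()
    return elapsed, count


def average_elapsed(iterations, width, height):
    """Average the insert time over several iterations."""
    timings = []
    for _ in range(iterations):
        elapsed, count = time_sqlite_insert(width, height)
        assert count == height  # sanity check: every row was inserted
        timings.append(elapsed)
    return mean(timings)


# Small sizes keep the sketch fast; the benchmark above used 100 iterations.
avg = average_elapsed(iterations=5, width=10, height=100)
print(f"average insert time: {avg:.6f} s")
```

Timing only the insert and commit, and not the data generation or table creation, keeps setup cost out of the measurement. A comparable DeltaDB timing would wrap the equivalent `db.upsert` call in the same way.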
Results
- DeltaDB demonstrated a significant performance advantage, especially as the data size increased. For a table with a width of 1,000 and a height of 10,000, DeltaDB completed the operations in an average time of 1.03 seconds over 100 runs.
- In contrast, SQLite took an average of 8.06 seconds for the same operations and data size. DeltaDB therefore needed approximately 87.22% less time, which is about 7.8 times faster.
These results suggest that DeltaDB handles large inserts efficiently, positioning it as a strong choice for data-intensive applications.
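The 87.22% figure can be reproduced from the two reported averages. It is the relative reduction in elapsed time, which is a different number from the speedup ratio. The variable names below are illustrative.

```python
deltadb_seconds = 1.03  # reported average for DeltaDB
sqlite_seconds = 8.06   # reported average for SQLite

# Share of SQLite's elapsed time that DeltaDB saves.
time_reduction = (sqlite_seconds - deltadb_seconds) / sqlite_seconds * 100
# How many times faster DeltaDB is, as a ratio of elapsed times.
speedup = sqlite_seconds / deltadb_seconds

print(f"{time_reduction:.2f}% less time")  # 87.22% less time
print(f"{speedup:.2f}x speedup")           # 7.83x speedup
```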