
NoSQL database following the DynamoDB concept, but as a file-based embedded database.

Project description

Consider the project as a proof of concept! Definitely not production ready!

Dynafile

Embedded pure Python NoSQL database following DynamoDB concepts.

pip install dynafile

# with string filter support using filtration

pip install "dynafile[filter]"

# bleeding edge

pip install git+https://github.com/eruvanos/dynafile.git
pip install filtration

Overview

Dynafile stores items within partitions, and each partition is stored as a separate file. Each partition contains a SortedDict from sortedcontainers, which keeps its items ordered by the sort key attribute.
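To illustrate the idea of partitions ordered by sort key, here is a minimal in-memory sketch. It is not Dynafile's implementation: the `Partition` class and its methods are hypothetical, and the standard library's `bisect` stands in for SortedDict so the example runs without extra dependencies.

```python
import bisect
from collections import defaultdict


class Partition:
    """Hypothetical stand-in: items of one partition key, ordered by sort key."""

    def __init__(self):
        self._keys = []   # sort keys in ascending order
        self._items = {}  # sort key -> item

    def put(self, sk, item):
        # Insert new sort keys at their sorted position; overwrite existing ones.
        if sk not in self._items:
            bisect.insort(self._keys, sk)
        self._items[sk] = item

    def query(self, starts_with=""):
        # Jump to the first key >= prefix, then walk while the prefix matches.
        start = bisect.bisect_left(self._keys, starts_with)
        for sk in self._keys[start:]:
            if not sk.startswith(starts_with):
                break
            yield self._items[sk]


partitions = defaultdict(Partition)  # partition key -> Partition

for item in [
    {"PK": "user#1", "SK": "user#1", "name": "Bob"},
    {"PK": "user#1", "SK": "role#1", "TYPE": "sender"},
    {"PK": "user#2", "SK": "user#2", "name": "Alice"},
]:
    partitions[item["PK"]].put(item["SK"], item)

sort_keys = [i["SK"] for i in partitions["user#1"].query()]
role_keys = [i["SK"] for i in partitions["user#1"].query(starts_with="role#")]
```

Because the keys are kept sorted, a prefix query only touches the matching slice of the partition instead of scanning every item.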

Dynafile does not implement the interface or functionality of DynamoDB, but provides familiar API patterns.

Differences:

  • Embedded, file based
  • No pagination

Features

  • persistence
  • put item
  • get item
  • delete item
  • scan - without parameters
  • query - starts_with
  • query - index direction
  • query - filter
  • scan - filter
  • batch writer
  • atomic file write
  • event stream hooks (put, delete)
  • TTL
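The "atomic file write" feature listed above is commonly achieved by writing to a temporary file and renaming it over the target. The sketch below shows that general pattern; it is an assumption about the technique, not a copy of Dynafile's code, and `atomic_write` is a hypothetical helper name.

```python
import os
import pickle
import tempfile


def atomic_write(path, data):
    """Pickle data to path so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            pickle.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in one step, overwriting any old version.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "data.pickle")
    atomic_write(target, {"role#1": {"TYPE": "sender"}})
    atomic_write(target, {"user#1": {"name": "Bob"}})
    with open(target, "rb") as fh:
        stored = pickle.load(fh)
    leftovers = [name for name in os.listdir(root) if name.endswith(".tmp")]
```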

Roadmap

  • GSI - global secondary index
  • update item
  • batch get
  • thread safety
  • LSI - local secondary index
  • split partitions
  • parallel scans - predefined scan segments
  • transactions
  • optimise disk load time (cache partitions in memory, invalidate on file change)
  • conditional put item
  • improve file consistency (options: acidfile)

API

from dynafile import *

# init DB interface
db = Dynafile(path=".", pk_attribute="PK", sk_attribute="SK")

# put items
db.put_item(item={"PK": "user#1", "SK": "user#1", "name": "Bob"})
db.put_item(item={"PK": "user#1", "SK": "role#1", "TYPE": "sender"})
db.put_item(item={"PK": "user#2", "SK": "user#2", "name": "Alice"})

# more performant batch operation: call the writer, not db, inside the block
with db.batch_writer() as writer:
    writer.put_item(item={"PK": "user#3", "SK": "user#3", "name": "Steve"})
    writer.delete_item(key={"PK": "user#3", "SK": "user#3"})

# retrieve items
item = db.get_item(key={
    "PK": "user#1",
    "SK": "user#1"
})

# query item collection by pk
items = list(db.query(pk="user#1"))

# scan full table
items = list(db.scan())

# add event stream listener to retrieve item modification
def print_listener(event: Event):
    print(event.action)
    print(event.old)
    print(event.new)


db.add_stream_listener(print_listener)

Filter

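Both query and scan support filters. A filter can be any callable, such as a lambda expression, that takes an item and returns whether to keep it.

Alternatively, a filter can be given as a string expression, which is parsed by filtration. The supported operators are: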

  • Equal ("==")
  • Not equal ("!=")
  • Less than ("<")
  • Less than or equal ("<=")
  • Greater than (">")
  • Greater than or equal (">=")
  • Contains ("in")
    • RHS must be a list or a Subnet
  • Regular expression ("=~")
    • RHS must be a regex token

Examples:

  • SK =~ /^a/ - SK starts with a
  • SK == 1 - SK is equal to 1
  • nested.a == 1 - accesses nested structure item.nested.a
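For comparison, the expression examples above can be written as plain Python callables. This sketch applies them to an ordinary list rather than to a Dynafile table, since the exact keyword for passing a filter to query or scan is not shown here; the function names and sample items are made up for illustration.

```python
import re

# Sample items; the attribute values are invented for this example.
items = [
    {"PK": "user#1", "SK": "alpha", "nested": {"a": 1}},
    {"PK": "user#1", "SK": "beta", "nested": {"a": 2}},
    {"PK": "user#1", "SK": 1},
]


def sk_starts_with_a(item):
    # Callable counterpart of: SK =~ /^a/ (guards against non-string sort keys)
    return isinstance(item["SK"], str) and re.search(r"^a", item["SK"]) is not None


def sk_equals_one(item):
    # Callable counterpart of: SK == 1
    return item["SK"] == 1


def nested_a_equals_one(item):
    # Callable counterpart of: nested.a == 1
    return item.get("nested", {}).get("a") == 1


starts_with_a = [i["SK"] for i in items if sk_starts_with_a(i)]
equals_one = [i["SK"] for i in items if sk_equals_one(i)]
nested_one = [i["SK"] for i in items if nested_a_equals_one(i)]
```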

TTL - Time To Live

TTL lets items expire at read time: an item whose TTL attribute lies in the past is no longer returned by get, query, or scan.

import time
from dynafile import *

db = Dynafile(path=".", pk_attribute="PK", sk_attribute="SK", ttl_attribute="ttl")

item = {"PK": "1", "SK": "2", "ttl": time.time() - 1000} # expired ttl
db.put_item(item=item)

list(db.scan()) # -> []
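Read-time expiry can be pictured as a filter applied whenever items are read. The following is a rough standalone sketch, not Dynafile's code: `is_expired` and `read_visible` are hypothetical helpers, and two behaviours are assumptions, namely that items without a TTL attribute never expire and that an item expires once its TTL is less than or equal to the current time.

```python
import time


def is_expired(item, ttl_attribute="ttl", now=None):
    """Return True if the item's TTL timestamp has passed."""
    now = time.time() if now is None else now
    ttl = item.get(ttl_attribute)
    # Assumption: a missing TTL attribute means the item never expires.
    return ttl is not None and ttl <= now


def read_visible(items, ttl_attribute="ttl", now=None):
    """Drop expired items at read time; nothing is deleted from storage here."""
    return [i for i in items if not is_expired(i, ttl_attribute, now)]


now = time.time()
stored = [
    {"PK": "1", "SK": "1", "ttl": now - 1000},  # expired
    {"PK": "1", "SK": "2", "ttl": now + 1000},  # still valid
    {"PK": "1", "SK": "3"},                     # no ttl attribute
]
visible = [i["SK"] for i in read_visible(stored, now=now)]
```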

Architecture

architecture.puml

File Structure

--- ROOT ---
./db/

--- MAIN DB ---

|- meta.json - meta information
|- _partitions/
    |- <hash>/
        |- data.pickle - Contains partition data by sort key (SortedDict)
        |- lsi-attr1.pickle - Contains partition data by lsi attr (SortedDict)

--- GSI ---
|- _gsi-<gsi-name>/
    |- _partitions/
        |- <hash>/
            |- data.pickle - Contains partition data by sort key (SortedDict)
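The layout above can be read as a mapping from a partition key to a file path. The sketch below builds such paths under stated assumptions: the helper `partition_file` is hypothetical, and SHA-256 is only a placeholder because the hash function Dynafile actually uses for `<hash>` is not specified here.

```python
import hashlib
from pathlib import Path


def partition_file(root, pk, gsi_name=None):
    """Build the data.pickle path for a partition key (illustrative only)."""
    # Assumption: any stable hash of the partition key would do; SHA-256 is a placeholder.
    digest = hashlib.sha256(str(pk).encode("utf-8")).hexdigest()
    base = Path(root)
    if gsi_name is not None:
        # GSI data lives in its own _gsi-<gsi-name> subtree.
        base = base / f"_gsi-{gsi_name}"
    return base / "_partitions" / digest / "data.pickle"


main_path = partition_file("./db", "user#1")
gsi_path = partition_file("./db", "user#1", gsi_name="by-name")
```

Items sharing a partition key always resolve to the same file, which is why a query by pk only needs to load one partition.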


