a system for storing data structures in lmdb

Gink in General

Gink aims to be a "protocol first" database system defined by the protocol for synchronizing instances rather than by a specific implementation. Defining the database in terms of the interchange format allows independent implementations to interact seamlessly in a well-defined manner.

Take a look at the full docs here.

I created the Python implementation of Gink to be a testbed for new ideas and to provide the simplest expression of all the concepts in Gink. Well-written Python code can essentially serve as executable pseudocode. Code written for this implementation has been biased in favor of readability and extensibility rather than raw performance. For example, (most of) the code doesn't use async functions or multi-threading.

Installation

Assuming you have Python and pip installed:

pip3 install gink
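
To confirm from Python that the package is installed, one option is the standard library's importlib.metadata. The helper name installed_version below is made up for this sketch and is not part of Gink.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


gink_version = installed_version("gink")
if gink_version is None:
    print("gink is not installed in this environment")
else:
    print("gink version:", gink_version)
```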

Examples

This page does not include examples for all data structures. Take a look at our Python documentation for all examples and full docs.

Data Structures

There are some operations that are available to all Containers, which are shown at the end of this page.

For all operations, a store and a database are needed:

from gink import *

store = LmdbStore('example.db')
database = Database(store=store)

Box

A Box is the simplest data structure available in Gink. It holds only one value at a time; you can set its value or get its value.

box = Box(database=database)

box.set({"foo": "bar", "key2": 15})
result = box.get() # Returns the python dictionary just added

if not box.isEmpty():
    print(box.size()) # This will only return 0 or 1 (1 in this case).

Directory

The Directory aims to mimic the functionality of a Python dictionary. If you know how to use a dictionary, you should already know how to use the directory!

Create a new directory:

directory = Directory(database=database)

Set key: value pairs:

directory["key1"] = "value1"

# Saves a timestamp after "key1" and before "key2"
time = database.get_now() # more on this in All Containers examples

# Achieves the same thing as the previous set, just different syntax.
directory.set("key2", {"test": "document"})

Getting the value of a key:

result = directory["key1"] # Returns "value1"
result2 = directory.get("key2") # Returns {"test": "document"}
result3 = directory.get("key3") # Returns None

Get all keys and items:

# Returns a generator of ["key1", "key2"]
# Note: the order may not be the same.
keys = directory.keys()

# Returns the items as a generator of (key, value) tuples in the directory
# at the specified timestamp - in this case, [("key1", "value1")]
items = directory.items(as_of=time)

# returns a list of all values
values = directory.values()

Deleting keys:

# Returns "value1" and removes the key: value pair from the directory.
value = directory.pop("key1")

# delete the key and return the Muid of this change
del_muid = directory.delete("key2")

Setting multiple keys and values at the same time:

directory.update({"newkey": "newvalue", "another key": "another value"})
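
For readers mapping one API onto the other, the Directory calls above line up roughly with the following operations on a built-in Python dict. This sketch uses no Gink objects at all. A plain dict keeps no history, so it has no counterpart to as_of, and deleting a key returns nothing rather than a Muid.

```python
# Plain Python dict operations that correspond to the Directory calls above.
# No Gink objects are involved here.
plain = {}

plain["key1"] = "value1"
plain["key2"] = {"test": "document"}

result = plain["key1"]        # "value1"
result3 = plain.get("key3")   # None, because "key3" was never set

keys = list(plain.keys())     # ["key1", "key2"] (dicts keep insertion order)

value = plain.pop("key1")     # returns "value1" and removes the pair
del plain["key2"]             # removes the pair; a dict returns nothing here

plain.update({"newkey": "newvalue", "another key": "another value"})
```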

Sequence

The Sequence is equivalent to a Python list. Again, these operations should look pretty familiar! In a Gink Sequence, the contents are ordered by timestamps.

Create a Sequence and append some values:

sequence = Sequence()

sequence.append("Hello, World!")
sequence.append(42)
sequence.append("a")
sequence.append("b")

Search for value and return index if found:

found = sequence.index("Hello, World!")
# Returns 0

Pop values:

popped = sequence.pop(1)
# Returns 42

# Pops and returns the value at index 0, which is "Hello, World!".
# The dest argument allows you to place the item
# back into the sequence at a different timestamp.
# In this case, -1 indicates the timestamp of the last change,
# so this sequence is now ordered as ["a", "b", "Hello, World!"].
popped = sequence.pop(0, dest=-1)

Insert to specific index:

# Inserts "x" at index 1, between "a" and "b", in this example.
# comment is an optional parameter that will be included in the
# bundle for this change (most operations accept a comment).
sequence.insert(1, "x", comment="insert x")

Return the sequence as a Python list:

as_list = list(sequence)
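
The idea that a Sequence's contents are ordered by timestamps can be pictured with a small toy model. The sketch below is not Gink's implementation: ToySequence, its integer clock, and the move_to_end flag (a simplified stand-in for dest=-1) are all hypothetical names made up for illustration.

```python
import itertools


class ToySequence:
    """Toy model: entries are kept sorted by an integer timestamp."""

    def __init__(self):
        self._clock = itertools.count(1)  # stand-in for real timestamps
        self._entries = []                # list of (timestamp, value), sorted

    def append(self, value):
        # A new entry gets a timestamp later than every existing one.
        self._entries.append((next(self._clock), value))

    def pop(self, index, move_to_end=False):
        # Remove the entry at index; optionally re-add it with a fresh
        # timestamp, which places it after everything else.
        _, value = self._entries.pop(index)
        if move_to_end:
            self.append(value)
        return value

    def values(self):
        return [value for _, value in self._entries]


toy = ToySequence()
for item in ["Hello, World!", 42, "a", "b"]:
    toy.append(item)

popped_42 = toy.pop(1)                        # 42
popped_hello = toy.pop(0, move_to_end=True)   # "Hello, World!"
order = toy.values()                          # ["a", "b", "Hello, World!"]
```

Because position is derived from the timestamp, moving an item is the same as giving it a new timestamp, which is why the popped value ends up last.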

All Containers

The Container is the parent class for all Gink data structures. Here are some examples of the powerful operations you can do with any container:

From Contents

To make it easier to insert data into an object upon initialization, Gink allows you to specify a contents argument to the constructor of the object. Different data structures may take different types as contents, but the idea remains the same for all Gink objects.

directory = Directory(database=database, contents={
    "key1": "value1", "key2": 42, "key3": [1, 2, 3, 4]})

key_set = KeySet(database=database, contents=["key1", "key2", 3])

# Vertex creation for pair map population
vertex1 = Vertex()
vertex2 = Vertex()

# Pair Map contents only takes a dictionary. Read the docs for the
# accepted data types for other data structures.
pair_map = PairMap(contents={(vertex1, vertex2): "value"})

Back in time

You will frequently see as_of in the Gink documentation. as_of refers to the time to look back to, and there are multiple ways to specify it. If you are curious about how timestamps are resolved, take a look at Database.resolve_timestamp().

One easy way is to pass a negative integer indicating how many changes back you want to look.

box = Box(contents="first_value")
box.set("second_value")

# Passing -1 into the as_of argument looks back at the previous value
# Returns "first_value"
previous = box.get(as_of=-1)

Another common way to use timestamps is to "save" a time between changes as a variable.

box = Box(contents="first_value")
time_after_first = database.get_now()
box.set("second_value")

# Passing saved timestamp into as_of
# Returns "first_value"
previous = box.get(as_of=time_after_first)
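
Both styles of as_of lookup can be illustrated with a toy model that keeps every value alongside the time it was written. This is only a sketch under stated assumptions, not how Gink resolves timestamps: ToyBox and its now() method are hypothetical, the clock is a simple counter, and the rule that negative numbers count changes while positive numbers are timestamps is a simplification.

```python
class ToyBox:
    """Toy model of a value with history; not Gink's implementation."""

    def __init__(self, contents):
        self._clock = 0
        self._history = []   # list of (timestamp, value), oldest first
        self.set(contents)

    def now(self):
        # Hypothetical stand-in for database.get_now(): advance and return a tick.
        self._clock += 1
        return self._clock

    def set(self, value):
        self._history.append((self.now(), value))

    def get(self, as_of=None):
        if as_of is None:
            return self._history[-1][1]
        if as_of < 0:
            # Negative: step back that many changes from the latest one.
            position = len(self._history) - 1 + as_of
            return self._history[position][1] if position >= 0 else None
        # Positive: latest value written at or before the given timestamp.
        visible = [value for stamp, value in self._history if stamp <= as_of]
        return visible[-1] if visible else None


toy_box = ToyBox("first_value")
time_after_first = toy_box.now()
toy_box.set("second_value")

current = toy_box.get()                                  # "second_value"
previous_by_count = toy_box.get(as_of=-1)                # "first_value"
previous_by_time = toy_box.get(as_of=time_after_first)   # "first_value"
```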

Reset

Resetting a container is a fundamental operation used to revert the container back to a previous time. Above we looked at using timestamps to get previous values, but resetting to specific times may prove more useful. This example uses a directory, but this functionality works the same for all containers.

directory = Directory()

directory["foo"] = "bar"
directory["bar"] = "foo"
time_between = database.get_now()
directory[7] = {"user": 1003203, "email": "test@test.com"}

has_7 = 7 in directory # returns True
directory.reset(to_time=time_between)
has_7 = 7 in directory # now returns False
has_bar = "bar" in directory # still returns True

Clearing

Clearing a container does exactly what you would expect it to do. The Container.clear() method removes all entries from the container and returns the Muid of the clearance. The clearance is processed as a new database change, which means you can still look back at previous timestamps or reset the database back to before the clearance occurred.

directory = Directory()

directory["foo"] = "bar"
directory["bar"] = "foo"
directory[7] = {"user": 1003203, "email": "test@test.com"}
# Storing the muid of the clearance to use later
clearance_muid = directory.clear()

# Directory is now empty
has_foo = "foo" in directory # Returns False
has_bar = "bar" in directory # Returns False
has_7 = 7 in directory # Returns False

# Using the muid's timestamp to look back before the clearance
# Returns "bar"
previous = directory.get("foo", as_of=clearance_muid.timestamp)

Database Operations

Bundling and comments

Think of a bundle as a commit in Git. A bundle is just a collection of changes with an optional comment/message. If you don't specify a bundler object, most Gink operations immediately commit the change in its own bundle, so you don't have to create a new bundler each time. However, if you do want to specify which changes go into a specific bundle, here is an example:

directory = Directory()
bundler = Bundler(comment="example setting values in directory")

directory.set("key1", 1, bundler=bundler)
directory.set("key2", "value2", bundler=bundler)
directory.update({"key3": 3, "key4": 4}, bundler=bundler)

# This seals the bundler and commits its changes to the database.
# At this point, no more changes may be added.
database.commit(bundler)
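
The collect, seal, and commit life cycle described above can be sketched with two toy classes. ToyBundler and ToyDatabase are hypothetical and do not reflect Gink's internals. In particular, this sketch assumes that pending changes stay invisible until the commit, which the text above does not state.

```python
class ToyBundler:
    """Toy model of a bundle: a comment plus a list of pending changes."""

    def __init__(self, comment=None):
        self.comment = comment
        self.changes = []
        self.sealed = False

    def add(self, key, value):
        if self.sealed:
            raise ValueError("bundle is sealed; no more changes may be added")
        self.changes.append((key, value))


class ToyDatabase:
    """Toy model of a database that applies whole bundles at once."""

    def __init__(self):
        self.data = {}
        self.commits = []   # comments of committed bundles, in order

    def commit(self, bundler):
        # Seal first, then apply every change in the bundle together.
        bundler.sealed = True
        for key, value in bundler.changes:
            self.data[key] = value
        self.commits.append(bundler.comment)


toy_db = ToyDatabase()
toy_bundler = ToyBundler(comment="example setting values")
toy_bundler.add("key1", 1)
toy_bundler.add("key2", "value2")

visible_before_commit = "key1" in toy_db.data   # False in this toy model
toy_db.commit(toy_bundler)

try:
    toy_bundler.add("key3", 3)
    rejected = False
except ValueError:
    rejected = True   # the sealed bundle refuses further changes
```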

Reset

Similar to how Container.reset() works, the Database class has its own reset functionality that resets all containers to the specified time. A "reset" is simply one large bundle of changes that updates the database entries to what they were at the previous timestamp; this allows you to easily look back before the reset.

database = Database(store=store)
root = Directory.get_global_instance(database=database)
queue = Sequence.get_global_instance(database=database)
misc = Directory()

misc["yes"] = False
root["foo"] = "bar"
queue.append("value1")

# With no to_time argument, reset defaults to EPOCH,
# which is the time of database creation (empty).
database.reset()

# All containers will have a length of 0
# since the database is now empty.
size = len(root)

# to_time=-1 reverts the database to the
# previous change
database.reset(to_time=-1)

# This will now have a len of 1,
# and one element of "value1"
size = len(queue)
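
The description of a reset as one bundle of new changes, rather than a rewrite of history, can be pictured with a toy append-only log. Everything in this sketch is hypothetical (ToyStore, its counter clock, the _DELETED marker); it shows the general technique under those assumptions, not Gink's storage format.

```python
class ToyStore:
    """Toy append-only log of (timestamp, key, value) changes."""

    _DELETED = object()   # marker recorded when a key is removed

    def __init__(self):
        self._clock = 0
        self._log = []

    def now(self):
        self._clock += 1
        return self._clock

    def set(self, key, value):
        self._log.append((self.now(), key, value))

    def snapshot(self, as_of=None):
        # Replay the log up to as_of to rebuild the visible state.
        state = {}
        for stamp, key, value in self._log:
            if as_of is not None and stamp > as_of:
                break
            if value is self._DELETED:
                state.pop(key, None)
            else:
                state[key] = value
        return state

    def reset(self, to_time):
        # Append the changes needed to make the current state match the
        # state at to_time. Nothing already in the log is rewritten.
        target = self.snapshot(as_of=to_time)
        current = self.snapshot()
        stamp = self.now()   # one timestamp for the whole "bundle"
        for key in current:
            if key not in target:
                self._log.append((stamp, key, self._DELETED))
        for key, value in target.items():
            if current.get(key, self._DELETED) != value:
                self._log.append((stamp, key, value))


toy_store = ToyStore()
toy_store.set("foo", "bar")
toy_store.set("bar", "foo")
time_between = toy_store.now()
toy_store.set(7, {"user": 1003203})
time_before_reset = toy_store.now()

toy_store.reset(to_time=time_between)

after_reset = toy_store.snapshot()
looked_back = toy_store.snapshot(as_of=time_before_reset)
```

After the reset, the current snapshot no longer contains key 7, while a snapshot taken as of time_before_reset still does, because the reset only appended to the log.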
