Skip to main content

git-backed time-indexed blob storage

Project description

gitblobts

gitblobts is an experimental Python package for git-backed time-indexed blob storage. Even so, a lock-in of the stored files with git is avoided. If encryption is not enabled, a lock-in of the file contents with this application is also avoided.

Its goal is to ensure availability of data both locally and remotely. It stores each blob as a file in a preexisting local and remote git repository. Each filename contains an encoded nanosecond timestamp and format version number.

Given the pull and push actions of git, collaborative use of the same remote repo is supported. To prevent merge conflicts, there is a one-to-many mapping of timestamp to filenames. This is accomplished by including sufficient random bytes in the filename to ensure uniqueness.

Subsequent retrieval of blobs is by a time range. At this time there is no implemented method to remove or overwrite a blob; this is by design. From the perspective of the package, once a blob is written, it is considered read-only. An attempt to add a blob with the same timestamp as a preexisting blob will result in a new blob.

An effort has been made to keep third-party package requirements to a minimum.

Links

Installation

Using Python 3.7+, install the package from PyPI: pip install -U gitblobts.

Usage examples

Storage

from typing import Optional
import datetime, gitblobts, json, time, urllib.request

optional_compression_module_name: Optional[str] = [None, 'bz2', 'gzip', 'lzma'][2]
optional_user_saved_encryption_key: Optional[bytes] = [None, gitblobts.generate_key()][1]
store = gitblobts.Store('/path_to/preexisting_git_repo',
                        compression=optional_compression_module_name, key=optional_user_saved_encryption_key)

store.addblob('a byte encoded string'.encode())
store.addblob(b'some bytes' * 1000, timestamp=time.time())
store.addblob(blob=json.dumps([0, 1., 2.2, 3]).encode(),
              timestamp=datetime.datetime.now(datetime.timezone.utc).timestamp())
store.addblob(blob=urllib.request.urlopen('https://i.imgur.com/3GmPd7O.png').read())

store.addblobs(blobs=[b'first blob', b'another blob'])
store.addblobs(blobs=[b'A', b'B'], timestamps=[time.time(), time.time()])

Retrieval

from typing import List
from gitblobts import Blob, Store
import time

store = Store('/path_to/preexisting_git_repo', compression='gzip', key=b'JVGmuw3wRntCc7dcQHJ5q1noUs62ydR0Nw8HpyllKn8=')

blobs: List[Blob] = list(store.getblobs(pull=False))
blobs_bytes: List[bytes] = [b.blob for b in blobs]
timestamps: List[float] = [b.timestamp for b in blobs]

blobs2_ascending: List[Blob] = list(store.getblobs(start_time='midnight yesterday', end_time='now'))
blobs2_descending: List[Blob] = list(store.getblobs(start_time='now', end_time='midnight yesterday', pull=True))
blobs3_ascending: List[Blob] = list(store.getblobs(start_time=time.time() - 86400, end_time=time.time(), pull=True))
blobs3_descending: List[Blob] = list(store.getblobs(start_time=time.time(), end_time=time.time() - 86400))

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gitblobts-0.0.7.tar.gz (12.5 kB view hashes)

Uploaded Source

Built Distribution

gitblobts-0.0.7-py3-none-any.whl (24.4 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page