Skip to main content

Python bindings for SQLite's LSM key/value engine

Project description

lsm

Fast Python bindings for SQLite's LSM key/value store. The LSM storage engine was initially written as part of the experimental SQLite4 rewrite (now abandoned). More recently, the LSM source code was moved into the SQLite3 source tree and has seen some improvements and fixes. This project uses the LSM code from the SQLite3 source tree.

Features:

  • Embedded zero-conf database.
  • Keys support in-order traversal using cursors.
  • Transactional (including nested transactions).
  • Single writer/multiple reader MVCC based transactional concurrency model.
  • On-disk database stored in a single file.
  • Data is durable in the face of application or power failure.
  • Thread-safe.
  • Releases GIL for read and write operations (each connection has own mutex)
  • Page compression (lz4 or zstd)
  • Zero dependency static library
  • Python 3.x.

Limitations:

The source for Python lsm is hosted on GitHub.

If you encounter any bugs in the library, please open an issue, including a description of the bug and any related traceback.

Quick-start

Below is a sample interactive console session designed to show some of the basic features and functionality of the lsm Python library.

To begin, instantiate a LSM object, specifying a path to a database file.

>>> from lsm import LSM
>>> db = LSM('test.ldb')
>>> db.open()
>>> print(db)
<LSM at "/tmp/test.ldb" as 0x10951e450>

More pythonic variant is using context manager:

>>> from lsm import LSM
>>> with LSM("/tmp/test.ldb") as db:
...     print(db)
<LSM at "/tmp/test.ldb" as 0x10951e450>

Binary/string mode

You should select mode for opening the database with binary: bool = True argument.

For example when you want to store strings just pass binary=False:

>>> from lsm import LSM
>>> with LSM("/tmp/test.ldb", binary=False) as db:
...    db['foo'] = 'bar'   # must be str for keys and values
...    print(db['foo'])
bar

Otherwise, you must pass keys and values ad bytes (default behaviour):

>>> from lsm import LSM
>>> with LSM("/tmp/test.ldb") as db:
...    db[b'foo'] = b'bar'   # must be bytes for keys and values
...    print(db[b'foo'])
b'bar'

Key/Value Features

lsm is a key/value store, and has a dictionary-like API:

>>> from lsm import LSM
>>> with LSM("/tmp/test.ldb", binary=False) as db:
...    db['foo'] = 'bar'
...    print(db['foo'])
bar

Database apply changes as soon as possible:

>>> from lsm import LSM
>>> db = LSM("/tmp/test.ldb", binary=False)
>>> db.open()
True
>>> for i in range(4):
...     db[f'k{i}'] = str(i)
...
>>> 'k3' in db
True
>>> 'k4' in db
False
>>> del db['k3']
>>> db['k3']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: "Key 'k3' was not found"

By default when you attempt to look up a key, lsm will search for an exact match. You can also search for the closest key, if the specific key you are searching for does not exist:

>>> from lsm import, LSM, SEEK_LE, SEEK_GE
>>> db = LSM("/tmp/test.ldb", binary=False)
>>> db.open()
True
>>> db['k1xx', SEEK_LE]  # Here we will match "k1".
'1'
>>> db['k1xx', SEEK_GE]  # Here we will match "k2".
'2'

LSM supports other common dictionary methods such as:

  • keys()
  • values()
  • items()
  • update()

Slices and Iteration

The database can be iterated through directly, or sliced. When you are slicing the database the start and end keys need not exist -- lsm will find the closest key (details can be found in the LSM.fetch_range() documentation).

>>> [item for item in db.items()]
[('foo', 'bar'), ('k0', '0'), ('k1', '1'), ('k2', '2')]

>>> db['k0':'k99']
<lsm_slice object at 0x10d4f3500>

>>> list(db['k0':'k99'])
[('k0', '0'), ('k1', '1'), ('k2', '2')]

You can use open-ended slices. If the lower- or upper-bound is outside the range of keys an empty list is returned.

>>> list(db['k0':])
[('k0', '0'), ('k1', '1'), ('k2', '2')]

>>> list(db[:'k1'])
[('foo', 'bar'), ('k0', '0'), ('k1', '1')]

>>> list(db[:'aaa'])
[]

To retrieve keys in reverse order or stepping over more then one item, simply use a third slice argument as usual. Negative step value means reverse order, but first and second arguments must be ordinary ordered.

>>> list(db['k0':'k99':2])
[('k0', '0'), ('k2', '2')]

>>> list(db['k0'::-1])
[('k2', '2'), ('k1', '1'), ('k0', '0')]

>>> list(db['k0'::-2])
[('k2', '2'), ('k0', '0')]


>>> list(db['k0'::3])
[('k0', '0')]

You can also delete slices of keys, but note that the delete will not include the keys themselves:

>>> del db['k0':'k99']

>>> list(db)  # Note that 'k0' still exists.
[('foo', 'bar'), ('k0', '0')]

Cursors

While slicing may cover most use-cases, for finer-grained control you can use cursors for traversing records.

>>> with db.cursor() as cursor:
...     for key, value in cursor:
...         print(key, '=>', value)
...
foo => bar
k0 => 0

>>> db.update({'k1': '1', 'k2': '2', 'k3': '3', 'foo': 'bar'})

>>> with db.cursor() as cursor:
...     cursor.first()
...     print(cursor.key())
...     cursor.last()
...     print(cursor.key())
...     cursor.previous()
...     print(cursor.key())
...
foo
k3
k2

>>> with db.cursor() as cursor:
...     cursor.seek('k0', SEEK_GE)
...     print(list(cursor.fetch_until('k99')))
...
[('k0', '0'), ('k1', '1'), ('k2', '2'), ('k3', '3')]

It is very important to close a cursor when you are through using it. For this reason, it is recommended you use the LSM.cursor() context-manager, which ensures the cursor is closed properly.

Transactions

lsm supports nested transactions. The simplest way to use transactions is with the LSM.transaction() method, which doubles as a context-manager or decorator.

>>> with db.transaction() as txn:
...     db['k1'] = '1-mod'
...     with db.transaction() as txn2:
...         db['k2'] = '2-mod'
...         txn2.rollback()
...
True
>>> print(db['k1'], db['k2'])
1-mod 2

You can commit or roll-back transactions part-way through a wrapped block:

>>> with db.transaction() as txn:
...    db['k1'] = 'outer txn'
...    txn.commit()  # The write is preserved.
...
...    db['k1'] = 'outer txn-2'
...    with db.transaction() as txn2:
...        db['k1'] = 'inner-txn'  # This is commited after the block ends.
...    print(db['k1']  # Prints "inner-txn".)
...    txn.rollback()  # Rolls back both the changes from txn2 and the preceding write.
...    print(db['k1'])
...
1              <- Return value from call to commit().
inner-txn      <- Printed after end of txn2.
True           <- Return value of call to rollback().
outer txn      <- Printed after rollback.

If you like, you can also explicitly call LSM.begin(), LSM.commit(), and LSM.rollback().

>>> db.begin()
>>> db['foo'] = 'baze'
>>> print(db['foo'])
baze
>>> db.rollback()
True
>>> print(db['foo'])
bar

Thanks to

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lsm-0.4.2.tar.gz (798.8 kB view hashes)

Uploaded Source

Built Distributions

lsm-0.4.2-cp310-cp310-win_amd64.whl (1.2 MB view hashes)

Uploaded CPython 3.10 Windows x86-64

lsm-0.4.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.12+ x86-64 manylinux: glibc 2.5+ x86-64

lsm-0.4.2-cp310-cp310-macosx_10_14_x86_64.whl (1.5 MB view hashes)

Uploaded CPython 3.10 macOS 10.14+ x86-64

lsm-0.4.2-cp39-cp39-win_amd64.whl (1.2 MB view hashes)

Uploaded CPython 3.9 Windows x86-64

lsm-0.4.2-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64 manylinux: glibc 2.5+ x86-64

lsm-0.4.2-cp39-cp39-macosx_10_14_x86_64.whl (1.5 MB view hashes)

Uploaded CPython 3.9 macOS 10.14+ x86-64

lsm-0.4.2-cp38-cp38-win_amd64.whl (1.2 MB view hashes)

Uploaded CPython 3.8 Windows x86-64

lsm-0.4.2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64 manylinux: glibc 2.5+ x86-64

lsm-0.4.2-cp38-cp38-macosx_10_14_x86_64.whl (1.5 MB view hashes)

Uploaded CPython 3.8 macOS 10.14+ x86-64

lsm-0.4.2-cp37-cp37m-win_amd64.whl (1.2 MB view hashes)

Uploaded CPython 3.7m Windows x86-64

lsm-0.4.2-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64 manylinux: glibc 2.5+ x86-64

lsm-0.4.2-cp37-cp37m-macosx_10_14_x86_64.whl (1.5 MB view hashes)

Uploaded CPython 3.7m macOS 10.14+ x86-64

lsm-0.4.2-cp36-cp36m-win_amd64.whl (1.2 MB view hashes)

Uploaded CPython 3.6m Windows x86-64

lsm-0.4.2-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded CPython 3.6m manylinux: glibc 2.12+ x86-64 manylinux: glibc 2.5+ x86-64

lsm-0.4.2-cp36-cp36m-macosx_10_14_x86_64.whl (1.5 MB view hashes)

Uploaded CPython 3.6m macOS 10.14+ x86-64

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page