Skip to main content
Join the official 2019 Python Developers SurveyStart the survey!

Python bindings for the sophia database.

Project description

Sophia Library Latest Version https://img.shields.io/pypi/wheel/sonya.svg https://img.shields.io/pypi/pyversions/sonya.svg https://img.shields.io/pypi/l/sonya.svg

Sonya

sonya, fast Python bindings for Sophia embedded database, v2.2.

About sonya:

  • Written in Cython for speed and low-overhead
  • Clean, memorable APIs
  • Extensive support for Sophia’s features
  • Python 2 and Python 3 support
  • No 3rd-party dependencies besides Cython (for python>3)

About Sophia:

Sophia Library
  • Document store
  • ACID transactions
  • MVCC, optimistic, non-blocking concurrency with multiple readers and writers.
  • Multiple databases per environment
  • Multiple- and single-statement transactions across databases
  • Prefix searches
  • Automatic garbage collection
  • Hot backup
  • Compression
  • Multi-threaded compaction
  • mmap support, direct I/O support
  • APIs for variety of statistics on storage engine internals
  • BSD licensed

Some ideas of where Sophia might be a good fit:

  • Running on application servers, low-latency / high-throughput
  • Time-series
  • Analytics / Events / Logging
  • Full-text search
  • Secondary-index for external data-store

Limitations:

  • Not tested on Windows.

If you encounter any bugs in the library, please open an issue, including a description of the bug and any related traceback.

Installation

The sophia sources are bundled with the sonya source code, so the only thing you need to install is Cython. You can install from GitHub or from PyPi.

Pip instructions:

$ pip install Cython   # Optional
$ pip install sonya

Or to install the latest code from master:

$ pip install Cython   # Required
$ pip install git+https://github.com/mosquito/sonya#egg=sonya

Git instructions:

$ pip install Cython
$ git clone https://github.com/mosquito/sonya
$ cd sonya
$ python setup.py build
$ python setup.py install

To run the tests:

$ pip install pytest
$ pytest tests

Overview

Sonya addition to normal dictionary operations, you can read slices of data that are returned efficiently using cursors. Similarly, bulk writes using update() use an efficient, atomic batch operation.

Despite the simple APIs, Sophia has quite a few advanced features. There is too much to cover everything in this document, so be sure to check out the official Sophia storage engine documentation.

The next section will show how to perform common actions with sonya.

Using Sonya

Let’s begin by import sonya and creating an environment. The environment can host multiple databases, each of which may have a different schema. In this example our database will store python objects as the key and value. Finally we’ll open the environment so we can start storing and retrieving data.

from sonya import Environment, fields, Schema


class DictSchema(Schema):
    key = fields.PickleField(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-database', DictSchema(), compression='zstd')
env.open()

document = db.document(key='foo', value=[1, 2, 3, 'bar'])

# Insert a document
db.set(document)

print(db.get(key='foo'))
# {'value': [1, 2, 3, 'bar'], 'key': 'foo'}

CRUD operations

Sonya

from sonya import Environment, fields, Schema


class DictSchema(Schema):
    key = fields.PickleField(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-database', DictSchema(), compression='zstd')
env.open()

document = db.document(key='foo', value=[1, 2, 3, 'bar'])

# Create a document
db.set(document)

# Read document
document = db.get(key='foo')

# Update the document
document = db.document(key='foo', value=None)
db.set(document)

# Delete the document
document = db.document(key='foo', value=None)
db.delete(key='foo')

# Iterate through the database
for document in db.cursor():
   print(document)

# Delete multiple documents
# fastest method for remove multiple documents from database
db.delete_many(order='>=')

Fetching ranges (Cursors)

Because Sophia is an ordered data-store, performing ordered range scans is efficient. To retrieve a range of key-value pairs with Sonya dictionary lookup with a slice instead.

For finer-grained control over iteration, or to do prefix-matching, Sonya provides a cursor interface.

The cursor() method accepts special keyword parameter order and all key fields:

  • order (default=`>=`) – semantics for matching the start key and ordering results.
from sonya import Environment, fields, Schema


class IntSchema(Schema):
    key = fields.UInt32Field(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-integer-db', IntSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=i, value=None))

# Iterate through the database
for document in db.cursor(order='>=', key=9995):
    print(document)

# {'key': 9995, 'value': None}
# {'key': 9996, 'value': None}
# {'key': 9997, 'value': None}
# {'key': 9998, 'value': None}
# {'key': 9999, 'value': None}

For prefix search use a part of the key and order:

from sonya import Environment, fields, Schema


class StringSchema(Schema):
    key = fields.StringField(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-string-db', IntSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=str(i), value=None))

# Iterate through the database
for document in db.cursor(order='>=', key='999'):
    print(document)

# {'value': None, 'key': '999'}
# {'value': None, 'key': '9990'}
# {'value': None, 'key': '9991'}
# {'value': None, 'key': '9992'}
# {'value': None, 'key': '9993'}
# {'value': None, 'key': '9994'}
# {'value': None, 'key': '9995'}
# {'value': None, 'key': '9996'}
# {'value': None, 'key': '9997'}
# {'value': None, 'key': '9998'}
# {'value': None, 'key': '9999'}

Deleting multiple documents

Sonya provides delete_many method. This method is fastest option when you want to remove multiple documents from the database. The method has cursor-like interface. The whole operation will be processed in the one transaction.

The method returns number of affected rows.

from sonya import Environment, fields, Schema


class IntSchema(Schema):
    key = fields.UInt32Field(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-integer-db', IntSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=i, value=None))

# returns the number of affected rows
db.delete_many(order='>=', key=9995):

Document count

The Database objects has a __len__ method. Please avoid to use it for any big database, it iterates and count the documents each time (faster then using len(list(db.cursor())) but still has O(n) complexity).

from sonya import Environment, fields, Schema


class IntSchema(Schema):
    key = fields.UInt32Field(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-integer-db', IntSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=i, value=None))

print(len(db))
# 10000

Transactions

Sophia supports ACID transactions. Even better, a single transaction can cover operations to multiple databases in a given environment.

Example usage:

class Users(Schema):
    name = fields.StringField(index=0)
    surname = fields.StringField(index=1)
    age = fields.UInt8Field()


with users.transaction() as tx:
    tx.set(users.document(name='Jane', surname='Doe', age=19))
    tx.set(users.document(name='John', surname='Doe', age=18))

    # Raises LookupError
    db.get(name='John', surname='Doe')

Multiple transactions are allowed to be open at the same time, but if there are conflicting changes, an exception will be thrown when attempting to commit the offending transaction.

Configuring and Administering Sophia

Sophia can be configured using special properties on the Sophia and Database objects. Refer to the configuration document for the details on the available options, including whether they are read-only, and the expected data-type.

For example, to query Sophia’s status, you can use the status property, which is a readonly setting returning a string:

>>> print(env['sophia.status'])
"online"

Other properties can be changed by assigning a new value to the property. For example, to read and then increase the number of threads used by the scheduler:

>>> env['scheduler.threads'] = env['scheduler.threads'] + 2
>>> env.open()
>>> print(env['scheduler.threads'])
8
>>> print(dict(env))
{'db.test-string-db.stat.cursor_latency': '0 0 0.0', ...}

Refer to the documentation for complete lists of settings. Dotted-paths are translated into underscore-separated attributes.

Project details


Release history Release notifications

This version

0.6.6

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for sonya, version 0.6.6
Filename, size File type Python version Upload date Hashes
Filename, size sonya-0.6.6-cp27-cp27m-manylinux1_x86_64.whl (1.2 MB) File type Wheel Python version cp27 Upload date Hashes View hashes
Filename, size sonya-0.6.6-cp34-cp34m-macosx_10_6_intel.whl (1.5 MB) File type Wheel Python version cp34 Upload date Hashes View hashes
Filename, size sonya-0.6.6-cp34-cp34m-manylinux1_x86_64.whl (1.3 MB) File type Wheel Python version cp34 Upload date Hashes View hashes
Filename, size sonya-0.6.6-cp35-cp35m-macosx_10_6_intel.whl (1.5 MB) File type Wheel Python version cp35 Upload date Hashes View hashes
Filename, size sonya-0.6.6-cp35-cp35m-manylinux1_x86_64.whl (1.3 MB) File type Wheel Python version cp35 Upload date Hashes View hashes
Filename, size sonya-0.6.6-cp36-cp36m-linux_armv7l.whl (1.2 MB) File type Wheel Python version cp36 Upload date Hashes View hashes
Filename, size sonya-0.6.6-cp36-cp36m-macosx_10_6_intel.whl (862.0 kB) File type Wheel Python version cp36 Upload date Hashes View hashes
Filename, size sonya-0.6.6-cp36-cp36m-manylinux1_x86_64.whl (1.3 MB) File type Wheel Python version cp36 Upload date Hashes View hashes
Filename, size sonya-0.6.6.tar.gz (290.2 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page