NoSQL Abstraction Library

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Topic
- Database :: Front-Ends
- System :: Distributed Computing
Typing
- Typed

Basic CRUD and query support for NoSQL databases, allowing for portable cloud native applications

AWS DynamoDB
Azure Cosmos NoSQL
Google Firestore

This library is not intended to create databases or tables; use Terraform, ARM, CloudFormation etc. for that.

Why not just use the name 'nosql' or 'pynosql'? Because they already exist on PyPI :-)

NoSQL Abstraction Library
- Installation
- Usage
- Configuration
- Plugins and Hooks
- Testing
- CLI
- Future Enhancements / Ideas

Installation

pip install 'abnosql[dynamodb]'
pip install 'abnosql[cosmos]'
pip install 'abnosql[firestore]'

For optional client side field level envelope encryption

pip install 'abnosql[aws-kms]'
pip install 'abnosql[azure-kms]'

By default, abnosql does not include database dependencies. This keeps packages small when bundling abnosql into AWS Lambda or Azure Functions, for example.

Usage

from abnosql import table
import os

os.environ['ABNOSQL_DB'] = 'dynamodb'
os.environ['ABNOSQL_KEY_ATTRS'] = 'hk,rk'

item = {
    'hk': '1',
    'rk': 'a',
    'num': 5,
    'obj': {
        'foo': 'bar',
        'num': 5,
        'list': [1, 2, 3],
    },
    'list': [1, 2, 3],
    'str': 'str'
}

tb = table('mytable')

# create/replace
tb.put_item(item)

# update - using ABNOSQL_KEY_ATTRS
updated_item = tb.put_item(
    {'hk': '1', 'rk': 'a', 'str': 'STR'},
    update=True
)
assert updated_item['str'] == 'STR'

# bulk
tb.put_items([item])

# note partition/hash key should be first kwarg
assert tb.get_item(hk='1', rk='a') == item

assert tb.query({'hk': '1'})['items'] == [item]

# scan
assert tb.query()['items'] == [item]

# be careful not to use cloud specific statements!
assert tb.query_sql(
    'SELECT * FROM mytable WHERE mytable.hk = @hk AND mytable.num > @num',
    {'@hk': '1', '@num': 4}
)['items'] == [item]

tb.delete_item({'hk': '1', 'rk': 'a'})

API Docs

See API Docs

Querying

query() performs DynamoDB Query using KeyConditionExpression (if key supplied) and exact match on FilterExpression if filters are supplied. For Cosmos, SQL is generated. This is the safest/most cloud agnostic way to query and probably OK for most use cases.

query_sql() performs a DynamoDB ExecuteStatement, passing in the supplied PartiQL statement. Cosmos uses the NoSQL SELECT syntax.

During mocked tests, SQLGlot is used to execute the statement, so results may differ...

Take care with query_sql() not to use SQL features that are specific to one provider, as that breaks the abstraction that abnosql is meant to provide.

The Firestore plugin uses SQLGlot to parse simple SQL statements (eg only AND is supported)

Indexes

Beyond partition and range keys defined on the table, indexes currently have limited support within abnosql

The DynamoDB implementation of query() allows a secondary index to be specified via the optional index kwarg.
Cosmos has Range, Spatial and Composite indexes, however abnosql does not yet do anything with the index kwarg in its Cosmos query() implementation.

Updates

put_item() and put_items() support an update boolean kwarg which, if supplied, performs an update_item() on DynamoDB and a patch_item() on Cosmos. For this to work, you must specify the key attribute names, either via the ABNOSQL_KEY_ATTRS env var as a comma separated list (useful when multiple tables share a common partition/range key scheme), or as the key_attrs config item when instantiating the table, eg:

tb = table('mytable', {'key_attrs': ['hk', 'rk']})

If you don't need to do any updates and only need to do create/replace, then these key attribute names do not need to be supplied
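
One reason the key attribute names are needed: an update has to address the existing item by its key and send only the remaining attributes as changes (DynamoDB's update_item, for example, takes the key separately). A minimal sketch of that split follows. split_update is a hypothetical helper written for this illustration and is not part of abnosql:

```python
from typing import Any, Dict, List, Tuple


def split_update(
    item: Dict[str, Any], key_attrs: List[str]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split an item into (key, changes) using the configured key attribute names."""
    missing = [attr for attr in key_attrs if attr not in item]
    if missing:
        raise ValueError(f'item is missing key attributes: {missing}')
    key = {attr: item[attr] for attr in key_attrs}
    changes = {k: v for k, v in item.items() if k not in key_attrs}
    return key, changes


key, changes = split_update({'hk': '1', 'rk': 'a', 'str': 'STR'}, ['hk', 'rk'])
print(key)      # {'hk': '1', 'rk': 'a'}
print(changes)  # {'str': 'STR'}
```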

All items being updated must already exist, otherwise an exception is raised.

Firestore does not return the updated item, so if this is required set the put_get config variable to True

Existence Checking

If check_exists config attribute is True, then CRUD operations will raise exceptions as follows:

get_item() raises NotFoundException if item doesn't exist
put_item() raises ExistsException if item already exists
put_item(update=True) raises NotFoundException if item doesn't exist to update
delete_item() raises NotFoundException if item doesn't exist

This adds some latency, as abnosql must first check whether the item exists

This can also be enabled by setting environment variable ABNOSQL_CHECK_EXISTS=TRUE

If you need to override this behaviour for a put_item() create operation once it is enabled, you can pass abnosql_check_exists=False in the item (it is popped out, so not persisted). This allows the create operation to overwrite the existing item without raising ExistsException
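
The rules above can be modelled with a small in-memory stand-in. This is only a sketch of the described semantics with check_exists enabled, not abnosql's implementation: CheckedTable is hypothetical, the two exception classes are local stand-ins that merely share names with the abnosql ones, and the hk/rk key names are taken from the usage example.

```python
from typing import Any, Dict, Tuple


class NotFoundException(Exception):
    """Local stand-in for the abnosql exception of the same name."""


class ExistsException(Exception):
    """Local stand-in for the abnosql exception of the same name."""


class CheckedTable:
    """In-memory table that always behaves as if check_exists were True."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[Any, Any], Dict[str, Any]] = {}

    def put_item(self, item: Dict[str, Any], update: bool = False) -> Dict[str, Any]:
        item = dict(item)
        # The per-item override is popped so that it is never stored
        check = item.pop('abnosql_check_exists', True)
        key = (item['hk'], item['rk'])
        exists = key in self.rows
        if update and not exists:
            raise NotFoundException(key)
        if not update and exists and check:
            raise ExistsException(key)
        self.rows[key] = {**self.rows[key], **item} if update else item
        return self.rows[key]

    def get_item(self, hk: Any, rk: Any) -> Dict[str, Any]:
        if (hk, rk) not in self.rows:
            raise NotFoundException((hk, rk))
        return self.rows[(hk, rk)]

    def delete_item(self, key: Dict[str, Any]) -> None:
        if (key['hk'], key['rk']) not in self.rows:
            raise NotFoundException(key)
        del self.rows[(key['hk'], key['rk'])]


tb = CheckedTable()
tb.put_item({'hk': '1', 'rk': 'a', 'num': 5})
try:
    tb.put_item({'hk': '1', 'rk': 'a', 'num': 6})
except ExistsException:
    print('already exists')
# The override lets a create overwrite the existing item
tb.put_item({'hk': '1', 'rk': 'a', 'num': 6, 'abnosql_check_exists': False})
```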

Schema Validation

The table config can define a jsonschema to validate items upon create or update operations (via put_item())

A combination of the following config attributes is supported

schema : jsonschema dict or yaml string, applied to both create and update
create_schema : jsonschema dict/yaml only on create
update_schema : jsonschema dict/yaml only on update
schema_errmsg : override default error message on both create and update
create_schema_errmsg : override default error message on create
update_schema_errmsg : override default error message on update

You can get details of validation errors through e.to_problem() or e.detail, where e is the raised exception

NOTE: key_attrs required when updating (see Updates)
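
As an illustration, a table config combining a schema with a custom error message might look like the sketch below. The config attribute names come from the list above; the schema contents and the error message text are made up for this example, and the table() call is left commented out because it needs abnosql and a configured database.

```python
# JSON Schema describing the example item; hk and rk are the key attributes
item_schema = {
    'type': 'object',
    'properties': {
        'hk': {'type': 'string'},
        'rk': {'type': 'string'},
        'num': {'type': 'integer', 'minimum': 0},
    },
    'required': ['hk', 'rk'],
}

config = {
    'key_attrs': ['hk', 'rk'],  # required when updating (see Updates)
    'schema': item_schema,      # applied to both create and update
    'schema_errmsg': 'item does not match the mytable schema',
}

# from abnosql import table
# tb = table('mytable', config)
```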

Partition Keys

A few methods such as get_item(), delete_item() and query() need to know the partition/hash keys as defined on the table. To avoid having to configure this or look it up from the provider, the convention is that the first kwarg or dictionary item is the partition key and, if supplied, the second is the range/sort key.
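
This convention works because Python preserves the order of keyword arguments and of dictionary items. A minimal sketch of reading keys that way, where split_key is a hypothetical helper for illustration and not an abnosql function:

```python
from typing import Any, Optional, Tuple

KeyPair = Tuple[str, Any]


def split_key(**key: Any) -> Tuple[KeyPair, Optional[KeyPair]]:
    """Return (partition, range) pairs using the first/second kwarg convention."""
    pairs = list(key.items())  # kwargs keep their call order
    if not pairs:
        raise ValueError('at least the partition key is required')
    partition = pairs[0]
    range_key = pairs[1] if len(pairs) > 1 else None
    return partition, range_key


print(split_key(hk='1', rk='a'))  # (('hk', '1'), ('rk', 'a'))
print(split_key(hk='1'))          # (('hk', '1'), None)
```

Swapping the argument order (rk first) would therefore treat rk as the partition key, which is why the usage example notes that the partition/hash key should be the first kwarg.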

Pagination

query() and query_sql() accept optional limit and next kwargs and return next in the response. Use these to paginate.

This works for AWS DynamoDB and Firestore, however Azure Cosmos has a limitation with continuation tokens for cross-partition queries (see Python SDK documentation). For Cosmos, abnosql appends OFFSET and LIMIT to the SQL statement if not already present, and returns next. limit defaults to 100. See the tests for examples
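
A pagination loop over limit and next could be sketched as follows. iter_items and fake_query are hypothetical names for this illustration; fake_query stands in for tb.query so the sketch runs without a database, and the assumption that next is absent or empty on the last page should be checked against the tests.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_items(query: Callable[..., Dict[str, Any]], limit: int = 100) -> Iterator[Any]:
    """Yield every item by following `next` until the response stops returning one."""
    next_token: Optional[str] = None
    while True:
        resp = query(limit=limit, next=next_token)
        yield from resp['items']
        next_token = resp.get('next')
        if not next_token:  # assumed to be missing/empty on the last page
            break


# Stand-in for tb.query: pages through five integers using an offset token
def fake_query(limit: int = 100, next: Optional[str] = None) -> Dict[str, Any]:
    data = list(range(5))
    start = int(next or 0)
    end = start + limit
    return {
        'items': data[start:end],
        'next': str(end) if end < len(data) else None,
    }


all_items = list(iter_items(fake_query, limit=2))
print(all_items)  # [0, 1, 2, 3, 4]
```

With a real table, the same helper could wrap a call such as lambda **kw: tb.query({'hk': '1'}, **kw).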

Audit

put_item() and put_items() take an optional audit_user kwarg. If supplied, abnosql will add the following to the item:

createdBy - value of audit_user, added if does not exist in item supplied to put_item()
createdDate - UTC ISO timestamp string, added if does not exist
modifiedBy - value of audit_user always added
modifiedDate - UTC ISO timestamp string, always added

You can also specify audit_user as a config attribute to table. If you prefer snake_case over camelCase, you can set the env var ABNOSQL_CAMELCASE = FALSE

NOTE: created* will only be added if update is not True in a put_item() operation
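
The audit rules above amount to roughly the following logic. This is a sketch of the described behaviour rather than abnosql's code: add_audit is a hypothetical helper, and the exact timestamp format abnosql writes is an assumption here (Python's isoformat() in UTC).

```python
from datetime import datetime, timezone
from typing import Any, Dict


def add_audit(item: Dict[str, Any], audit_user: str, update: bool = False) -> Dict[str, Any]:
    """Return a copy of item with audit attributes applied."""
    now = datetime.now(timezone.utc).isoformat()
    audited = dict(item)
    if not update:
        # created* are only added on create, and never overwrite supplied values
        audited.setdefault('createdBy', audit_user)
        audited.setdefault('createdDate', now)
    # modified* are always set
    audited['modifiedBy'] = audit_user
    audited['modifiedDate'] = now
    return audited


created = add_audit({'hk': '1', 'rk': 'a'}, 'alice')
updated = add_audit({'hk': '1', 'rk': 'a', 'str': 'STR'}, 'bob', update=True)
```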

Change Feed / Stream Support

AWS DynamoDB Streams allow Lambda functions to be triggered upon create, update and delete table operations. The event sent to the lambda (see aws docs) contains eventName and eventSourceARN, where:

eventName - name of event, eg INSERT, MODIFY or REMOVE (see here)
eventSourceARN - ARN of the table name

This allows a single stream processor lambda to process events from multiple tables (eg for writing into ElasticSearch)

Like DynamoDB, Azure CosmosDB supports change feeds, however the event sent to the function (currently) omits the event source (table name), and delete events are only available in a preview change feed mode, which must be explicitly enabled.

Because both the eventName and eventSource are ideally needed (irrespective of preview mode or not), abnosql library automatically adds the changeMetadata to an item during create, update and delete, eg:

item = {
    "hk": "1",
    "rk": "a",
    "changeMetadata": {
        "eventName": "INSERT",
        "eventSource": "sometable"
    }
}

Because no REMOVE event is sent at all without the preview change feed mode above, abnosql must first update the item and then delete it. This is also needed for the eventSource (table name) to be captured in the event, so until Cosmos supports both attributes, an update is needed before a delete. By default, a 5 second synchronous sleep is added between the update and the delete to allow CosmosDB to send the update event (0 seconds results in no update event). This can be controlled with the ABNOSQL_COSMOS_CHANGE_META_SLEEPSECS env var (defaults to 5 seconds), and disabled by setting it to 0

This behaviour is enabled by default, but can be disabled by setting the ABNOSQL_COSMOS_CHANGE_META env var to FALSE or cosmos_change_meta=False in the table config. The ABNOSQL_CAMELCASE = FALSE env var can also be used to change the attribute names to snake_case if needed

To write an Azure Function / AWS Lambda that is able to process both DynamoDB and Cosmos events, look for changeMetadata first and use it if present; otherwise look for eventName and eventSourceARN in the event payload, assuming it's DynamoDB
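
That lookup order could be sketched like this. change_info is a hypothetical helper, it assumes the default camelCase attribute names, and it assumes each record is either a Cosmos item carrying changeMetadata or a single DynamoDB Streams record:

```python
from typing import Any, Dict, Tuple


def change_info(record: Dict[str, Any]) -> Tuple[str, str]:
    """Return (event_name, table_name) for a Cosmos item or a DynamoDB stream record."""
    meta = record.get('changeMetadata')
    if meta:  # added by abnosql to Cosmos items
        return meta['eventName'], meta['eventSource']
    # Otherwise assume DynamoDB Streams; the stream ARN has the form
    # arn:aws:dynamodb:<region>:<account>:table/<name>/stream/<timestamp>
    table_name = record['eventSourceARN'].split('/')[1]
    return record['eventName'], table_name


cosmos_item = {
    'hk': '1',
    'rk': 'a',
    'changeMetadata': {'eventName': 'INSERT', 'eventSource': 'sometable'},
}
dynamodb_record = {
    'eventName': 'MODIFY',
    'eventSourceARN': (
        'arn:aws:dynamodb:us-east-1:123456789012:'
        'table/mytable/stream/2024-01-01T00:00:00.000'
    ),
}
print(change_info(cosmos_item))      # ('INSERT', 'sometable')
print(change_info(dynamodb_record))  # ('MODIFY', 'mytable')
```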

Google Firestore should support triggering functions similar to DynamoDB Streams, so changeMetadata is not required

Client Side Encryption

If configured in table config with the kms attribute, abnosql will perform client side encryption using AWS KMS or Azure Key Vault

Each attribute value defined in the config is encrypted with a 256-bit AES-GCM data key generated for each attribute value:

aws uses the AWS Encryption SDK for Python
azure uses python cryptography to generate an AES-GCM data key and encrypt the attribute value, then uses an RSA CMK in Azure Key Vault to wrap/unwrap (envelope encryption) the AES-GCM data key. The module uses the azure-keyvault-keys python SDK for wrap/unwrap functionality of the generated data key (Azure doesn't support generating a data key as AWS does)

Both providers use a 256-bit AES-GCM generated data key with AAD/encryption context (the Azure provider uses a 96-bit nonce). AES-GCM is an authenticated symmetric encryption scheme used by both AWS and Azure (and Hashicorp Vault)

Configuration

It is recommended to use environment variables where possible to avoid provider specific application code

If the ABNOSQL_DB env var is not set, abnosql will attempt to apply defaults based on available environment variables:

AWS_DEFAULT_REGION - sets database to dynamodb (see aws docs)
FUNCTIONS_WORKER_RUNTIME - sets database to cosmos (see azure docs)
K_SERVICE - sets database to firestore (though this could also get confused if running on knative)
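
The fallback can be pictured with the sketch below. guess_database is a hypothetical helper, not abnosql's code, and checking the variables in the order listed above is an assumption:

```python
import os
from typing import Mapping, Optional


def guess_database(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Pick a database name from the environment, preferring ABNOSQL_DB."""
    if env.get('ABNOSQL_DB'):
        return env['ABNOSQL_DB']  # returned unchanged, may be a connection string
    for var, database in (
        ('AWS_DEFAULT_REGION', 'dynamodb'),
        ('FUNCTIONS_WORKER_RUNTIME', 'cosmos'),
        ('K_SERVICE', 'firestore'),
    ):
        if var in env:
            return database
    return None


print(guess_database({'AWS_DEFAULT_REGION': 'us-east-1'}))  # dynamodb
print(guess_database({'K_SERVICE': 'my-service'}))          # firestore
```

Setting ABNOSQL_DB explicitly avoids the ambiguity mentioned for K_SERVICE.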

AWS DynamoDB

Set the following environment variable and use the usual AWS environment variables that boto3 uses

ABNOSQL_DB = "dynamodb"

Or set the boto3 session in the config

from abnosql import table
import boto3

tb = table(
    'mytable',
    config={'session': boto3.Session()},
    database='dynamodb'
)

Azure Cosmos NoSQL

Set the following environment variables:

ABNOSQL_DB = "cosmos"
ABNOSQL_COSMOS_ACCOUNT = your database account
ABNOSQL_COSMOS_ENDPOINT = derived from ABNOSQL_COSMOS_ACCOUNT if not set
ABNOSQL_COSMOS_CREDENTIAL = your cosmos credential; use Azure Key Vault References if using Azure Functions. Leave unset to use DefaultAzureCredential / managed identity.
ABNOSQL_COSMOS_DATABASE = cosmos database

OR - use the connection string format:

ABNOSQL_DB = "cosmos://account@credential:database" or "cosmos://account@:database" to use managed identity (credential could also be "DefaultAzureCredential")

Alternatively, define in config (though ideally you want to use env vars to avoid application / environment specific code).

from abnosql import table

tb = table(
    'mytable',
    config={'account': 'foo', 'database': 'bar'},
    database='cosmos'
)

Google Firestore

Set the following environment variables:

ABNOSQL_DB = "firestore"
ABNOSQL_FIRESTORE_PROJECT or GOOGLE_CLOUD_PROJECT = google cloud project
ABNOSQL_FIRESTORE_DATABASE = Firestore database
ABNOSQL_FIRESTORE_CREDENTIALS = oauth, optional - if using the gcloud CLI, it's also picked up from ~/.config/gcloud/application_default_credentials.json if found

OR - use the connection string format:

ABNOSQL_DB = "firestore://project@credential:database"

Alternatively, define in config (though ideally you want to use env vars to avoid application / environment specific code).

from abnosql import table

tb = table(
    'mytable',
    config={'project': 'foo', 'database': 'bar'},
    database='firestore'
)

Plugins and Hooks

abnosql uses pluggy and registers in the abnosql.table namespace

The following hooks are available

set_config - set config
get_item_post - called after get_item(), can return modified data
put_item_pre
put_item_post
put_items_post
delete_item_post

See the TableSpecs and example test_hooks()

Testing

AWS DynamoDB

Use moto package and abnosql.mocks.mock_dynamodbx

mock_dynamodbx is used for query_sql and is only needed if/until moto provides full PartiQL support

Example:

from abnosql.mocks import mock_dynamodbx 
from moto import mock_dynamodb

@mock_dynamodb
@mock_dynamodbx  # needed for query_sql only
def test_something():
    ...

More examples in tests/test_dynamodb.py

Azure Cosmos NoSQL

Use responses package and abnosql.mocks.mock_cosmos

Example:

from abnosql.mocks import mock_cosmos
import responses

@mock_cosmos
@responses.activate
def test_something():
    ...

More examples in tests/test_cosmos.py

Google Firestore

Use python-mock-firestore and pass MockFirestore() to table config as client attribute

Example:

from abnosql import table
from mockfirestore import MockFirestore


def test_something():
    tb = table('mytable', {'client': MockFirestore()})
    item = tb.get_item(foo='bar')

CLI

A small abnosql CLI is installed, providing a few of the commands above

Usage: abnosql [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  delete-item
  get-item
  put-item
  put-items
  query
  query-sql

To install dependencies

pip install 'abnosql[cli]'

Example querying table in Azure Cosmos, with cosmos.json config file containing endpoint, credential and database

$ abnosql query-sql mytable 'SELECT * FROM mytable' -d cosmos -c cosmos.json
partkey      id      num  obj                                          list       str
-----------  ----  -----  -------------------------------------------  ---------  -----
p1           p1.1      5  {'foo': 'bar', 'num': 5, 'list': [1, 2, 3]}  [1, 2, 3]  str
p2           p2.1      5  {'foo': 'bar', 'num': 5, 'list': [1, 2, 3]}  [1, 2, 3]  str
p2           p2.2      5  {'foo': 'bar', 'num': 5, 'list': [1, 2, 3]}  [1, 2, 3]  str

Future Enhancements / Ideas

client side encryption
test pagination & exception handling
Google Firestore support, ideally in the core library (though could be added outside via use of the plugin system). Would need something like FireSQL implemented for python, maybe via sqlglot
Google Vault KMS support
Hashicorp Vault KMS support
Simple caching (maybe) using globals (used for AWS Lambda / Azure Functions)
PostgreSQL support using JSONB column (see here for example). Would be nice to avoid an ORM and having to define a model for each table...
blob storage backend? could use something similar to NoDB but maybe combined with smart_open and DuckDB's Hive Partitioning
Redis..
Hook implementations to write to ElasticSearch / OpenSearch for better searching. Useful when not able to use AWS Stream Processors, Azure Change Feed, or Elasticstore. Why? Because not all databases support stream processing, and if they do you don't want the hassle of using CDC

Release history

0.0.26 - May 25, 2024
0.0.25 - May 21, 2024
0.0.24 - May 16, 2024
0.0.23 - Apr 26, 2024 (this version)
0.0.21 - Feb 16, 2024
0.0.20 - Oct 9, 2023
0.0.19 - Oct 9, 2023
0.0.18 - Oct 9, 2023
0.0.17 - Oct 5, 2023
0.0.16 - Sep 29, 2023
0.0.15 - Sep 28, 2023
0.0.14 - Sep 27, 2023
0.0.13 - Sep 25, 2023
0.0.12 - Sep 22, 2023
0.0.11 - Sep 12, 2023
0.0.10 - Sep 8, 2023
0.0.9 - Aug 31, 2023
0.0.8 - Aug 24, 2023
0.0.7 - Aug 8, 2023
0.0.6 - Jul 31, 2023
0.0.5 - Jul 20, 2023
0.0.4 - Jul 20, 2023
0.0.3 - Jul 18, 2023
0.0.2 - Jul 7, 2023
0.0.1 - Jul 6, 2023

Download files

Source Distribution

abnosql-0.0.23.tar.gz (41.2 kB)

Uploaded Apr 26, 2024 Source

Built Distribution

abnosql-0.0.23-py3-none-any.whl (42.6 kB)

Uploaded Apr 26, 2024 Python 3

File details

Details for the file abnosql-0.0.23.tar.gz.

File metadata

Download URL: abnosql-0.0.23.tar.gz
Upload date: Apr 26, 2024
Size: 41.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.10.14

File hashes

Hashes for abnosql-0.0.23.tar.gz
Algorithm	Hash digest
SHA256	`1a8848dee9a1ef71672c6ae42088589e30aeda116da0638cb13dccf965080af4`
MD5	`5a20502913881b4347fc7e1f79033f16`
BLAKE2b-256	`761835b40f46ba518811a6636a55f4b84cd72fb47627faffbd8e71d02dbad4a7`

File details

Details for the file abnosql-0.0.23-py3-none-any.whl.

File metadata

Download URL: abnosql-0.0.23-py3-none-any.whl
Upload date: Apr 26, 2024
Size: 42.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.10.14

File hashes

Hashes for abnosql-0.0.23-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bda137ae70d155e58ecbdea30dab2bfde60c2235a0e8f6e636f1913739de8fce`
MD5	`185f8e27055e81ad782c6320e7185f93`
BLAKE2b-256	`cc52619a6a571b28522c269ddb2dde3ba42b075686756c14328b4b6acfc363c1`
