Skip to main content

No project description provided

Project description

PyCozo

Python client and Jupyter helper for CozoDB.

This document describes how to set up CozoDB in Python. To learn how to use CozoDB (CozoScript), read the docs.

Install

pip install "pycozo-async[embedded,requests,pandas]"

To be useful, you must specify either the embedded option, which enables using CozoDB in the embedded mode, or the requests option, which enables using CozoDB through the HTTP API. The pandas option installs pandas as a dependency and allows optional auto-conversion of output relations to Pandas dataframes. You should specify pandas if you use the Jupyter helper.

Python client

First you need to import the client to use it:

from pycozo_async.client import Client

Opening a database

In-memory database:

client = Client()

SQLite-backed (lightweight persistent storage):

client = Client('sqlite', 'file.db')

RocksDB-backed (highly concurrent persistent storage):

client = Client('rocksdb', 'file.db')

Connecting to a standalone server:

client = Client('http', options={'host': 'http://127.0.0.1:9070'})

If the address is not a loopback address, you also need to provide the auth string:

client = Client('http', options={'host': ..., 'auth': ...})

The auth string is in the file created when you run the standalone server.

After you are done with a client, you need to explicitly close it:

client.close()

If you don't do this, the database resources may linger for an undetermined length of time inside your process, even if you del the client variable. It is OK to close a client multiple times.

Query

res = await client.run(SCRIPT)

If you need to bind variables:

res = await client.run('?[] <- [[$name]]', {'name': 'Python'})

If pandas is available, a dataframe containing the results is returned. If you want to disable this behaviour even when you have pandas installed, pass dataframe=False in the constructor of Client, in which case a python dict containing the relation data in res['rows'] and the relation header in res['header'] is returned.

When a query is unsuccessful, an exception is raised containing the details. If you want a nicely formatted message:

try:
    res = await client.run('BAD!')
except Exception as e:
    print(repr(e))

Client is thread-safe, but you cannot spawn multiple processes opening the same embedded database (connecting to the same standalone server is of course OK).

In the embedded mode, Client will release the GIL when executing queries so that multiple queries in different threads can proceed concurrently.

The embedded database exchanges data with the Python runtime directly, without going through JSON. Hence you can pass Python bytes directly in named parameters, and bytes returned by the database does not need any decoding.

Convenience methods

Client has convenience methods for common operations:

await client.put('test_rel', {'a': 1, 'b': 2, 'c': 3})
await client.put('test_rel', [{'a': 3, 'b': 4, 'c': 2}, {'a': 5, 'b': 6, 'c': 7}])
await client.put('test_rel', pandas.DataFrame({'a': [7, 8, 9], 'b': [9, 10, 11], 'c': [12, 13, 14]}))
# for update, only specify the keys and the values you want to update
await client.update('test_rel', {'a': 7, 'b': 8})
# for rm, only the keys are needed
await client.rm('test_rel', [{'a': 9}, {'a': 11}])

Other operations

Client has other methods on it: export_relations, import_relations, backup, restore and import_from_backup. See the doc for more details.

Multi-statement transaction

You can intersperse CozoDB statements within a single transaction with Python computations by using a multi-statement transaction.

tx = client.multi_transact(True)  # Pass False or nothing for read-only transaction

tx.run(':create a {a}')
tx.run('?[a] <- [[1]] :put a {a}')
try:
    tx.run(':create a {a}')
except:
    pass

tx.run('?[a] <- [[2]] :put a {a}')
tx.run('?[a] <- [[3]] :put a {a}')
tx.commit()  # `tx.abort()` abandons the changes so far 
# and deletes resources associated with the transaction.

r = client.run('?[a] := *a[a]')
assert r['rows'] == [[1], [2], [3]]

You must run either tx.commit() or tx.abort() at the end, otherwise you will have a resource leak.

To automatically clean up transactions, the return value of client.multi-transact can be used as a context-manager which automatically aborts the transaction at the end of the context if it has not already been committed.

with client.multi_transact(True) as tx:
    tx.run(':create a {a})
    tx.run('?[a] <- [[1]] :put a {a}')
    tx.commit()

Mutation callbacks

You can register functions to run whenever mutations are made against stored relations. As an example:

# callbacks must be callable and accept three arguments
def cb(op_name, new_rows, old_rows):
    # op_name is 'Put' or 'Rm'
    # new_rows is a list of lists containing the new rows (i.e., requested puts or deletes)
    # old_rows is a list of lists containing the changed rows (i.e., the old rows in the case of puts, 
    # or the rows actually deleted in the case of deletes)
    pass


# this registers the callback to run when the stored relation `test_rel` changes
cb_id = await client.register_callback('test_rel', cb)

# your application logic here

# use the returned id for unregistration
# client.unregister_callback(cb_id)

User-defined fixed rules

You can define your own fixed rules in Python to be used inside CozoDB queries. As an example:

# custom rule implementation, must accept two arguments
def rule_impl(inputs, options):
    # inputs is a list of lists of lists, representing the input relations to the rule
    # option is a dict with string keys, representing the options passed in when the rule is called

    # You should return a list of tuples (or lists) to represent the return relation of the rule.
    # Here the returned relation has arity one.
    # If you cannot perform the computation due to any reason (wrong parameters, etc.),
    # simply raise an exception.
    return [('Nicely',), ('Done!',)]


# Actually registering the rule, the second argument is the arity, must match the actual arity
# of the relation returned by the implementation.
client.register_fixed_rule('Custom', 1, rule_impl)

r = await client.run("""
    rel[u, v, w] <- [[1,2,3],[4,5,6]]
    ?[] <~ Custom(rel[], x: 1, y: null)
""")
assert r['rows'] == [['Done!'], ['Nicely']]

# Custom rules can be unregistered
client.unregister_fixed_rule('Custom')

Jupyter helper

There are two versions of the helper loaded through magic commands that allows you to query CozoDB directly. The first version is activated by

%load_ext pycozo_async.ipyext_direct

and allows all subsequent cells to be interpreted as CozoScript, unless the first line of the cell starts with %. If A cell has the first line %%py, then all following lines are interpreted as python.

The second is activated by

%load_ext pycozo_async.ipyext

This version is less intrusive in that you need to prefix a cell by the line %%cozo in order for subsequent content to be interpreted as CozoScript.

To execute queries, you also need to connect to a database. If you have the embedded option enabled and you do nothing, you connect to a default in-memory database. To override:

%cozo_open <ENGINE>, <PATH>

where <ENGINE> can now be 'http', 'sqlite', 'rocksdb' or 'mem'.

To connect to a standalone server, use

%cozo_open 'http', '', {'host': 'http://127.0.0.1:9070', 'auth': '<AUTH_STRING>'}

where <AUTH_STRING> is optional if <ADDRESS> is a loopback address. For how to determine the <AUTH_STRING>, see here.

There are other magic commands you can use:

  • %cozo_run_file <PATH_TO_FILE> runs a local file as CozoScript.
  • %cozo_run_string <VARIABLE> runs variable containing string as CozoScript.
  • %cozo_set <KEY> <VALUE> sets a parameter with the name <KEY> to the expression <VALUE>. The updated parameters will be used by subsequent queries.
  • %cozo_set_params <PARAM_MAP> replace all parameters by the given expression, which must evaluate to a dictionary with string keys.
  • %cozo_clear clears all set parameters.
  • %cozo_params returns the parameters currently set.

Programmatically constructing queries

You can use builders in pycozo_async.builder to construct queries programmatically. This is both safer and more convenient than concatenating strings. See here for how to use it.

Building

This library is pure Python, but the embedded option depends on cozo-embedded native package described here.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pycozo_async-0.7.6.tar.gz (21.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pycozo_async-0.7.6-py3-none-any.whl (22.7 kB view details)

Uploaded Python 3

File details

Details for the file pycozo_async-0.7.6.tar.gz.

File metadata

  • Download URL: pycozo_async-0.7.6.tar.gz
  • Upload date:
  • Size: 21.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.5 Linux/5.15.167.4-microsoft-standard-WSL2

File hashes

Hashes for pycozo_async-0.7.6.tar.gz
Algorithm Hash digest
SHA256 20c66aade516abe0ed273c5955807a4ebd753e17fecad14f4f05277213f08d28
MD5 af93e7b0d28d7f5f7f42f49317ed933a
BLAKE2b-256 1f0a9a23a6d8a7d140bb4fc3c081f7e5f7b87f33138269371c81980461531fda

See more details on using hashes here.

File details

Details for the file pycozo_async-0.7.6-py3-none-any.whl.

File metadata

  • Download URL: pycozo_async-0.7.6-py3-none-any.whl
  • Upload date:
  • Size: 22.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.5 Linux/5.15.167.4-microsoft-standard-WSL2

File hashes

Hashes for pycozo_async-0.7.6-py3-none-any.whl
Algorithm Hash digest
SHA256 f4c798ccf1315bff16c705792ff566e4e32e0e002dab5eb98d9e7c959015167a
MD5 b00d2d94dfdb407cef71d9295031819b
BLAKE2b-256 7006b4942e4dcf419bef844d4c19472b2b1c685ac7559196420349900089a964

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page