PyCozo

Python client and Jupyter helper for CozoDB.

This document describes how to set up CozoDB in Python. To learn how to use CozoDB (CozoScript), read the docs.

Install

pip install "pycozo-async[embedded,requests,pandas]"

For the package to be useful, you must specify either the embedded option, which enables using CozoDB in the embedded mode, or the requests option, which enables using CozoDB through the HTTP API. The pandas option installs pandas as a dependency and allows optional auto-conversion of output relations to pandas dataframes. You should specify pandas if you use the Jupyter helper.
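
To check which options are usable in the current environment, a minimal sketch along these lines may help. The import names are assumptions: the embedded option is assumed to provide a module called cozo_embedded, while requests and pandas use their usual module names. The helper available_extras is hypothetical and not part of PyCozo.

```python
from importlib.util import find_spec

# Assumed import names for each install option (not confirmed by this README).
EXTRA_MODULES = {
    "embedded": "cozo_embedded",
    "requests": "requests",
    "pandas": "pandas",
}


def available_extras():
    """Map each install option to True if its module can be located."""
    return {
        extra: find_spec(module) is not None
        for extra, module in EXTRA_MODULES.items()
    }
```

Calling available_extras() returns a dict of booleans, which you can inspect before deciding whether to open an embedded database or connect over HTTP.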

Python client

First you need to import the client to use it:

from pycozo_async.client import Client

Opening a database

In-memory database:

client = Client()

SQLite-backed (lightweight persistent storage):

client = Client('sqlite', 'file.db')

RocksDB-backed (highly concurrent persistent storage):

client = Client('rocksdb', 'file.db')

Connecting to a standalone server:

client = Client('http', options={'host': 'http://127.0.0.1:9070'})

If the address is not a loopback address, you also need to provide the auth string:

client = Client('http', options={'host': ..., 'auth': ...})

The auth string is in the file created when you run the standalone server.
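
The loopback rule above can be checked before constructing the client. The following is only a sketch: http_options and is_loopback are hypothetical helpers, and treating the name localhost as loopback is an assumption that may not match how the server itself decides.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback(host_url):
    """Best-effort check whether the URL points at a loopback address."""
    hostname = urlsplit(host_url).hostname
    if hostname is None:
        return False
    if hostname == "localhost":  # assumption: treated as loopback
        return True
    try:
        return ipaddress.ip_address(hostname).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name): assume it is not loopback.
        return False


def http_options(host_url, auth=None):
    """Build the options dict for Client('http', options=...)."""
    if auth is None and not is_loopback(host_url):
        raise ValueError("an auth string is required for non-loopback hosts")
    options = {"host": host_url}
    if auth is not None:
        options["auth"] = auth
    return options
```

The result can then be passed as Client('http', options=http_options(...)).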

After you are done with a client, you need to explicitly close it:

client.close()

If you don't do this, the database resources may linger for an undetermined length of time inside your process, even if you del the client variable. It is OK to close a client multiple times.
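
One way to make sure close() is always called is to wrap the client in contextlib.closing. The sketch below assumes only what this document shows: a synchronous close() method and an awaitable run() method. The helper run_and_close is hypothetical.

```python
import asyncio
from contextlib import closing


async def run_and_close(make_client, script, params=None):
    """Open a client, run one script, and close the client even on error."""
    with closing(make_client()) as client:
        if params is None:
            return await client.run(script)
        return await client.run(script, params)


# Example (requires the embedded option):
#   from pycozo_async.client import Client
#   res = asyncio.run(run_and_close(Client, '?[] <- [[1, 2, 3]]'))
```

Here make_client is any zero-argument callable that returns a client, for example the Client class itself or a lambda that passes engine and path.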

Query

res = await client.run(SCRIPT)

If you need to bind variables:

res = await client.run('?[] <- [[$name]]', {'name': 'Python'})

If pandas is available, a dataframe containing the results is returned. To disable this behaviour even when pandas is installed, pass dataframe=False to the constructor of Client. In that case a Python dict is returned, with the relation data in res['rows'] and the relation header in res['header'].
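
When working with the dict form, it is often handy to pair each row with the header. A minimal sketch, assuming only the res['header'] and res['rows'] keys described above (rows_to_dicts is a hypothetical helper):

```python
def rows_to_dicts(res):
    """Turn {'header': [...], 'rows': [[...], ...]} into a list of dicts."""
    header = res["header"]
    return [dict(zip(header, row)) for row in res["rows"]]


example = {"header": ["a", "b"], "rows": [[1, 2], [3, 4]]}
records = rows_to_dicts(example)
# records == [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
```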

When a query is unsuccessful, an exception is raised containing the details. If you want a nicely formatted message:

try:
    res = await client.run('BAD!')
except Exception as e:
    print(repr(e))

Client is thread-safe, but you cannot spawn multiple processes opening the same embedded database (connecting to the same standalone server is of course OK).

In the embedded mode, Client will release the GIL when executing queries so that multiple queries in different threads can proceed concurrently.
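
Since run is awaited in the examples above, several queries can be submitted together with asyncio.gather. This is a sketch only: run_all is a hypothetical helper, and whether the queries actually overlap in time depends on how client.run is implemented.

```python
import asyncio


async def run_all(client, scripts):
    """Submit several scripts together; results come back in input order."""
    return await asyncio.gather(*(client.run(script) for script in scripts))
```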

The embedded database exchanges data with the Python runtime directly, without going through JSON. Hence you can pass Python bytes directly in named parameters, and bytes returned by the database do not need any decoding.

Convenience methods

Client has convenience methods for common operations:

import pandas  # only needed for the DataFrame example below

await client.put('test_rel', {'a': 1, 'b': 2, 'c': 3})
await client.put('test_rel', [{'a': 3, 'b': 4, 'c': 2}, {'a': 5, 'b': 6, 'c': 7}])
await client.put('test_rel', pandas.DataFrame({'a': [7, 8, 9], 'b': [9, 10, 11], 'c': [12, 13, 14]}))
# for update, only specify the keys and the values you want to update
await client.update('test_rel', {'a': 7, 'b': 8})
# for rm, only the keys are needed
await client.rm('test_rel', [{'a': 9}, {'a': 11}])

Other operations

Client has other methods on it: export_relations, import_relations, backup, restore and import_from_backup. See the doc for more details.

Multi-statement transaction

A multi-statement transaction lets you interleave CozoDB statements with Python computations inside a single transaction.

tx = client.multi_transact(True)  # Pass False or nothing for read-only transaction

tx.run(':create a {a}')
tx.run('?[a] <- [[1]] :put a {a}')
try:
    # expected to fail: the relation `a` already exists
    tx.run(':create a {a}')
except Exception:
    pass

tx.run('?[a] <- [[2]] :put a {a}')
tx.run('?[a] <- [[3]] :put a {a}')
tx.commit()  # `tx.abort()` abandons the changes so far
# and deletes resources associated with the transaction.

r = await client.run('?[a] := *a[a]')
assert r['rows'] == [[1], [2], [3]]

You must run either tx.commit() or tx.abort() at the end, otherwise you will have a resource leak.

To clean up transactions automatically, the return value of client.multi_transact can be used as a context manager, which aborts the transaction at the end of the context if it has not already been committed.

with client.multi_transact(True) as tx:
    tx.run(':create a {a}')
    tx.run('?[a] <- [[1]] :put a {a}')
    tx.commit()
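
If you prefer an explicit commit-or-abort pattern over the context manager, a small wrapper can enforce the rule that every transaction ends in commit() or abort(). The sketch below uses only the multi_transact, run, commit and abort calls shown above, called synchronously as in those examples. The helper run_in_transaction is hypothetical.

```python
def run_in_transaction(client, scripts, write=True):
    """Run scripts in one transaction: commit on success, abort on error."""
    tx = client.multi_transact(write)
    try:
        results = [tx.run(script) for script in scripts]
        tx.commit()
        return results
    except Exception:
        # Abandon all changes and release the transaction's resources.
        tx.abort()
        raise
```

Unlike the try/except in the earlier example, this wrapper treats any failing statement as fatal for the whole transaction and re-raises the error to the caller.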

Mutation callbacks

You can register functions to run whenever mutations are made against stored relations. As an example:

# callbacks must be callable and accept three arguments
def cb(op_name, new_rows, old_rows):
    # op_name is 'Put' or 'Rm'
    # new_rows is a list of lists containing the new rows (i.e., requested puts or deletes)
    # old_rows is a list of lists containing the changed rows (i.e., the old rows in the case of puts, 
    # or the rows actually deleted in the case of deletes)
    pass


# this registers the callback to run when the stored relation `test_rel` changes
cb_id = await client.register_callback('test_rel', cb)

# your application logic here

# use the returned id for unregistration
# client.unregister_callback(cb_id)
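
As a more concrete illustration, a callback can simply record every change it sees. The sketch below follows the three-argument signature described above; make_recorder is a hypothetical helper, and the callback can be exercised directly without a database.

```python
def make_recorder():
    """Return a callback and the list that it appends change events to."""
    events = []

    def cb(op_name, new_rows, old_rows):
        # Copy the row lists so later mutation by the caller is not observed.
        events.append({
            "op": op_name,
            "new": list(new_rows),
            "old": list(old_rows),
        })

    return cb, events


cb, events = make_recorder()
cb("Put", [[1, 2, 3]], [])  # simulate one notification
# events == [{'op': 'Put', 'new': [[1, 2, 3]], 'old': []}]
```

The returned cb can be passed to client.register_callback in the same way as the empty callback in the example above.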

User-defined fixed rules

You can define your own fixed rules in Python to be used inside CozoDB queries. As an example:

# custom rule implementation, must accept two arguments
def rule_impl(inputs, options):
    # inputs is a list of lists of lists, representing the input relations to the rule
    # options is a dict with string keys, representing the options passed in when the rule is called

    # You should return a list of tuples (or lists) to represent the return relation of the rule.
    # Here the returned relation has arity one.
    # If you cannot perform the computation due to any reason (wrong parameters, etc.),
    # simply raise an exception.
    return [('Nicely',), ('Done!',)]


# Register the rule. The second argument is the arity, which must match the actual arity
# of the relation returned by the implementation.
client.register_fixed_rule('Custom', 1, rule_impl)

r = await client.run("""
    rel[u, v, w] <- [[1,2,3],[4,5,6]]
    ?[] <~ Custom(rel[], x: 1, y: null)
""")
assert r['rows'] == [['Done!'], ['Nicely']]

# Custom rules can be unregistered
client.unregister_fixed_rule('Custom')
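
A rule implementation that actually uses its inputs and options might look like the sketch below. The rule name RowSums and the scale option are hypothetical, chosen only for illustration. The function follows the (inputs, options) signature described above and returns a relation of arity one.

```python
def row_sums(inputs, options):
    """Return one single-column row per input row: scale * sum(row)."""
    if len(inputs) != 1:
        raise ValueError("RowSums expects exactly one input relation")
    scale = options.get("scale", 1)
    return [(scale * sum(row),) for row in inputs[0]]


# Direct call with the same shape of data the database would pass in:
result = row_sums([[[1, 2, 3], [4, 5, 6]]], {"scale": 2})
# result == [(12,), (30,)]

# Registration would follow the pattern shown above:
#   client.register_fixed_rule('RowSums', 1, row_sums)
#   ... ?[] <~ RowSums(rel[], scale: 2)
```

Because the implementation is plain Python, it can be unit-tested without a database before it is registered.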

Jupyter helper

There are two versions of the helper, loaded through magic commands, that allow you to query CozoDB directly. The first version is activated by

%load_ext pycozo_async.ipyext_direct

and allows all subsequent cells to be interpreted as CozoScript, unless the first line of the cell starts with %. If a cell has %%py as its first line, all following lines are interpreted as Python.

The second is activated by

%load_ext pycozo_async.ipyext

This version is less intrusive in that you need to prefix a cell by the line %%cozo in order for subsequent content to be interpreted as CozoScript.

To execute queries, you also need to connect to a database. If you have the embedded option enabled and you do nothing, you connect to a default in-memory database. To override:

%cozo_open <ENGINE>, <PATH>

where <ENGINE> can be 'http', 'sqlite', 'rocksdb' or 'mem'.

To connect to a standalone server, use

%cozo_open 'http', '', {'host': 'http://127.0.0.1:9070', 'auth': '<AUTH_STRING>'}

where <AUTH_STRING> is optional if the host is a loopback address. For how to determine the <AUTH_STRING>, see here.

There are other magic commands you can use:

  • %cozo_run_file <PATH_TO_FILE> runs a local file as CozoScript.
  • %cozo_run_string <VARIABLE> runs a variable containing a string as CozoScript.
  • %cozo_set <KEY> <VALUE> sets a parameter with the name <KEY> to the expression <VALUE>. The updated parameters will be used by subsequent queries.
  • %cozo_set_params <PARAM_MAP> replaces all parameters with the given expression, which must evaluate to a dictionary with string keys.
  • %cozo_clear clears all set parameters.
  • %cozo_params returns the parameters currently set.

Programmatically constructing queries

You can use builders in pycozo_async.builder to construct queries programmatically. This is both safer and more convenient than concatenating strings. See here for how to use it.

Building

This library is pure Python, but the embedded option depends on the cozo-embedded native package described here.
