Skip to main content

No project description provided

Project description

riffq

riffq is a toolkit in python (built in Rust) for building PostgreSQL wire-compatible databases.
It allows you to serve data from Python over the PostgreSQL protocol — turning your Python-based logic or in-memory data into a queryable, network-exposed system. We also have a catalog emulation system in rust with datafusion.


What It Does

  • Implements the PostgreSQL wire protocol in Rust for performance and concurrency
  • Sends raw SQL queries (Simple or Extended protocol) to Python for interpretation
  • Implements postgres catalog compatibility layer, see pg_catalog_rs

Since you are in python, you can

  • Allows you to connect to remote data sources (e.g., analytics DB, CRM) and expose them as a unified PostgreSQL database
  • Enables serving Pandas DataFrames over the network as virtual SQL tables
  • Can delegate SQL execution to DuckDB, Polars, or any other Python engine
  • Acts as a programmable federated query engine or custom data service

Example Use Cases

  • Serve a Pandas DataFrame as a PostgreSQL table to BI tools
  • Build a custom federated engine from multiple APIs or databases
  • Implement your own data lake query frontend
  • Expose dynamic ML feature stores for training or real-time inference
  • Provide fine-grained, code-controlled access to internal metrics or logs

Example

import logging
import duckdb
import pyarrow as pa
import riffq
logging.basicConfig(level=logging.DEBUG)

class Connection(riffq.BaseConnection):
    def _handle_query(self, sql, callback, **kwargs):
        cur = duckdb_con.cursor()
        try:
            reader = cur.execute(sql).fetch_record_batch()
            self.send_reader(reader, callback)
        except Exception as exc:
            logging.exception("error on executing query")
            batch = self.arrow_batch(
                [pa.array(["ERROR"]), pa.array([str(exc)])],
                ["error", "message"],
            )
            self.send_reader(batch, callback)

    def handle_query(self, sql, callback=callable, **kwargs):
        self.executor.submit(self._handle_query, sql, callback, **kwargs)

def main():
    global duckdb_con
    duckdb_con = duckdb.connect()
    duckdb_con.execute(
        """
        CREATE VIEW klines AS 
        SELECT * 
        FROM 'data/klines.parquet'
        """
    )
    server = riffq.RiffqServer("127.0.0.1:5433", connection_cls=Connection)
    server.set_tls("certs/server.crt", "certs/server.key")
    server.start(tls=True)

if __name__ == "__main__":
    main()

The Rust side calls this Python handler when a SQL query comes in via the PostgreSQL protocol.


Architecture

  • Rust layer handles:
    • PostgreSQL protocol (via pgwire)
    • Connection management
    • Query routing
    • Metadata compatibility (pg_catalog emulation)
  • Python layer handles:
    • SQL execution (via any engine: DuckDB, Polars, etc.)
    • Data transformation
    • Custom logic and dynamic schema definitions

Zero Copy

We try to achieve zero-copy by using arrow/pycapsule. So data from duckdb comes to python as a pycapsule pointer, which goes to thread in python which goes to the callback in rust still as a pycapsule pointer. We then stream to network with postgresql using pgwire.

https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html


Getting Started

Install the package

pip install riffq --pre

Extend BaseConnection class

You can extend the base class for

  • handle_query
  • handle_auth - To check for username/password
  • handle_connect - To check ip/port restriction
  • handle_disconnect - WIP
class Connection(riffq.BaseConnection):
    def _handle_query(self, sql, callback, **kwargs):
        cur = duckdb_con.cursor()
        try:
            reader = cur.execute(sql).fetch_record_batch()
            self.send_reader(reader, callback)
        except Exception as exc:
            logging.exception("error on executing query")
            batch = self.arrow_batch(
                [pa.array(["ERROR"]), pa.array([str(exc)])],
                ["error", "message"],
            )
            self.send_reader(batch, callback)

    def handle_query(self, sql, callback=callable, **kwargs):
        self.executor.submit(self._handle_query, sql, callback, **kwargs)

Start the server

    global duckdb_con
    duckdb_con = duckdb.connect()
    duckdb_con.execute(
        """
        CREATE VIEW klines AS 
        SELECT * 
        FROM 'data/klines.parquet'
        """
    )
    server = riffq.RiffqServer("127.0.0.1:5433", connection_cls=Connection)
    server.set_tls("certs/server.crt", "certs/server.key")
    server.start()

You can check server implementations on test_concurrency/ and example/ directory

Then connect using any PostgreSQL client:

psql -h localhost -p 5433

Enabling TLS

Generate a temporary certificate and key:

openssl req -newkey rsa:2048 -nodes -keyout server.key -x509 -days 1 -out server.crt -subj "/CN=localhost"

Enabling Catalog Emulation

Postgresql clients sends queries to pg_catalog schema to find out databases, schemas, tables, columns.

We have this (datafusion_pg_catalog)[https://github.com/ybrs/pg_catalog] for this purpose.

You can register your own database and your tables.

For example

    server = riffq.RiffqServer(f"127.0.0.1:{port}", connection_cls=Connection)
    server.set_tls("certs/server.crt", "certs/server.key")

    server._server.register_database("duckdb")

    tbls = duckdb_con.execute(
        "SELECT table_schema, table_name FROM information_schema.tables "
        "WHERE table_schema NOT IN ('pg_catalog','information_schema')"
    ).fetchall()

    for schema_name, table_name in tbls:
        server._server.register_schema("duckdb", schema_name)
        cols_info = duckdb_con.execute(
            "SELECT column_name, data_type, is_nullable FROM information_schema.columns "
            "WHERE table_schema=? AND table_name=?",
            (schema_name, table_name),
        ).fetchall()
        columns = []
        for col_name, data_type, is_nullable in cols_info:
            columns.append(
                {
                    col_name: {
                        "type": map_type(data_type),
                        "nullable": is_nullable.upper() == "YES",
                    }
                }
            )
        server._server.register_table("duckdb", schema_name, table_name, columns)

    server.start(catalog_emulation=True)

Status

  • ✅ Wire protocol support (simple + extended)
  • ✅ Query dispatching to Python
  • ✅ Thread based non-blocking query execution (long queries don't block)
  • ✅ DuckDB, Pandas, Polars compatibility
  • ✅ Limited SQL parsing on Rust side (forwarded to Python)
  • ✅ Optional TLS encryption
  • ✅ Integration with optional catalog emulation layer
  • 🟡 More examples
  • 🟡 Better logging, monitoring, observability

Installation

We currently have a pre release on pypi. You can install it with --pre tag.

pip install riffq --pre

Running Locally

Install the development requirements and run the test suite:

git clone git@github.com:ybrs/riffq.git
cd riffq
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

maturin build --profile=fast -i python3
pip install target/wheels/*.whl
# or maturin developer
make all-tests

The tests require the Rust extension to build successfully; any build failure will cause the suite to fail.


License

MIT or Apache 2.0 — your choice.


Contributing

Contributions are welcome! Especially for:

  • For the emulation layer, I am currently testing with

    So testing with other clients, especially BI tools are very welcomed.

  • Better Python DX

  • Example apps (data lake, feature store, etc.)


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

riffq-0.1.6.tar.gz (79.2 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

riffq-0.1.6-cp313-cp313-macosx_11_0_arm64.whl (35.1 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

riffq-0.1.6-cp312-cp312-manylinux_2_38_x86_64.whl (40.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.38+ x86-64

riffq-0.1.6-cp39-cp39-win_amd64.whl (33.9 MB view details)

Uploaded CPython 3.9Windows x86-64

File details

Details for the file riffq-0.1.6.tar.gz.

File metadata

  • Download URL: riffq-0.1.6.tar.gz
  • Upload date:
  • Size: 79.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for riffq-0.1.6.tar.gz
Algorithm Hash digest
SHA256 91cd77a4321c73118505005648f419d56512e4f507456b9a51cfb024ee63d229
MD5 c291b6e2a162e9db07f8c83a0100b41b
BLAKE2b-256 7bbb2cb2df7f1f6d6d63a4b58b563a894b3ac70e371b68eb86584e42d837de5b

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.6.tar.gz:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file riffq-0.1.6-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for riffq-0.1.6-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a8388ada7baf62707604b5c166eadd1c6713f4f40078968fbfcb3b4be01da2f6
MD5 ed7e17064edf56699d4f6e6b420b1af2
BLAKE2b-256 df2714d358eed6578aaaee978e3740a9acba6803f933adf79fb611ae7af4eb8e

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.6-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file riffq-0.1.6-cp312-cp312-manylinux_2_38_x86_64.whl.

File metadata

File hashes

Hashes for riffq-0.1.6-cp312-cp312-manylinux_2_38_x86_64.whl
Algorithm Hash digest
SHA256 a3526dad2f4babac4cf211163d0590d3dd2a26e62721492de637fc58d15cd90a
MD5 b1cc8fcabe03e83685dfa0f81aa8b80d
BLAKE2b-256 74586a89d32846f1240f0262ebb75e5f42e4d322d82ebad0621f26a14540866f

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.6-cp312-cp312-manylinux_2_38_x86_64.whl:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file riffq-0.1.6-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: riffq-0.1.6-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 33.9 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for riffq-0.1.6-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 023cba4f2a249df63caddff845c476783dcbc92f1450d584d558bd2813d69b15
MD5 8a0450305f8626738ba2c89996261232
BLAKE2b-256 2503fbc132b8dc4dbaa9d665234a23ca5f0ed7bf432e3e5e2f27bec25c2e3430

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.6-cp39-cp39-win_amd64.whl:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page