Skip to main content

No project description provided

Project description

riffq

riffq is a toolkit in python (built in Rust) for building PostgreSQL wire-compatible databases.
It allows you to serve data from Python over the PostgreSQL protocol — turning your Python-based logic or in-memory data into a queryable, network-exposed system. We also have a catalog emulation system in rust with datafusion.


What It Does

  • Implements the PostgreSQL wire protocol in Rust for performance and concurrency
  • Sends raw SQL queries (Simple or Extended protocol) to Python for interpretation
  • Implements postgres catalog compatibility layer, see pg_catalog_rs

Since you are in python, you can

  • Allows you to connect to remote data sources (e.g., analytics DB, CRM) and expose them as a unified PostgreSQL database
  • Enables serving Pandas DataFrames over the network as virtual SQL tables
  • Can delegate SQL execution to DuckDB, Polars, or any other Python engine
  • Acts as a programmable federated query engine or custom data service

Example Use Cases

  • Serve a Pandas DataFrame as a PostgreSQL table to BI tools
  • Build a custom federated engine from multiple APIs or databases
  • Implement your own data lake query frontend
  • Expose dynamic ML feature stores for training or real-time inference
  • Provide fine-grained, code-controlled access to internal metrics or logs

Example

import logging
import duckdb
import pyarrow as pa
import riffq
logging.basicConfig(level=logging.DEBUG)

class Connection(riffq.BaseConnection):
    def _handle_query(self, sql, callback, **kwargs):
        cur = duckdb_con.cursor()
        try:
            reader = cur.execute(sql).fetch_record_batch()
            self.send_reader(reader, callback)
        except Exception as exc:
            logging.exception("error on executing query")
            batch = self.arrow_batch(
                [pa.array(["ERROR"]), pa.array([str(exc)])],
                ["error", "message"],
            )
            self.send_reader(batch, callback)

    def handle_query(self, sql, callback=callable, **kwargs):
        self.executor.submit(self._handle_query, sql, callback, **kwargs)

def main():
    global duckdb_con
    duckdb_con = duckdb.connect()
    duckdb_con.execute(
        """
        CREATE VIEW klines AS 
        SELECT * 
        FROM 'data/klines.parquet'
        """
    )
    server = riffq.RiffqServer("127.0.0.1:5433", connection_cls=Connection)
    server.set_tls("certs/server.crt", "certs/server.key")
    server.start()

if __name__ == "__main__":
    main()

The Rust side calls this Python handler when a SQL query comes in via the PostgreSQL protocol.


Architecture

  • Rust layer handles:
    • PostgreSQL protocol (via pgwire)
    • Connection management
    • Query routing
    • Metadata compatibility (pg_catalog emulation)
  • Python layer handles:
    • SQL execution (via any engine: DuckDB, Polars, etc.)
    • Data transformation
    • Custom logic and dynamic schema definitions

Zero Copy

We try to achieve zero-copy by using arrow/pycapsule. So data from duckdb comes to python as a pycapsule pointer, which goes to thread in python which goes to the callback in rust still as a pycapsule pointer. We then stream to network with postgresql using pgwire.

https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html


Getting Started

Install the package

pip install riffq --pre

Extend BaseConnection class

You can extend the base class for

  • handle_query
  • handle_auth - To check for username/password
  • handle_connect - To check ip/port restriction
  • handle_disconnect - WIP
class Connection(riffq.BaseConnection):
    def _handle_query(self, sql, callback, **kwargs):
        cur = duckdb_con.cursor()
        try:
            reader = cur.execute(sql).fetch_record_batch()
            self.send_reader(reader, callback)
        except Exception as exc:
            logging.exception("error on executing query")
            batch = self.arrow_batch(
                [pa.array(["ERROR"]), pa.array([str(exc)])],
                ["error", "message"],
            )
            self.send_reader(batch, callback)

    def handle_query(self, sql, callback=callable, **kwargs):
        self.executor.submit(self._handle_query, sql, callback, **kwargs)

Start the server

    global duckdb_con
    duckdb_con = duckdb.connect()
    duckdb_con.execute(
        """
        CREATE VIEW klines AS 
        SELECT * 
        FROM 'data/klines.parquet'
        """
    )
    server = riffq.RiffqServer("127.0.0.1:5433", connection_cls=Connection)
    server.set_tls("certs/server.crt", "certs/server.key")
    server.start()

You can check server implementations on test_concurrency/ and example/ directory

Then connect using any PostgreSQL client:

psql -h localhost -p 5433

Enabling TLS

Generate a temporary certificate and key:

openssl req -newkey rsa:2048 -nodes -keyout server.key -x509 -days 1 -out server.crt -subj "/CN=localhost"

Enabling Catalog Emulation

Postgresql clients sends queries to pg_catalog schema to find out databases, schemas, tables, columns.

We have this (datafusion_pg_catalog)[https://github.com/ybrs/pg_catalog] for this purpose.

You can register your own database and your tables.

For example

    server = riffq.RiffqServer(f"127.0.0.1:{port}", connection_cls=Connection)
    server.set_tls("certs/server.crt", "certs/server.key")

    server._server.register_database("duckdb")

    tbls = duckdb_con.execute(
        "SELECT table_schema, table_name FROM information_schema.tables "
        "WHERE table_schema NOT IN ('pg_catalog','information_schema')"
    ).fetchall()

    for schema_name, table_name in tbls:
        server._server.register_schema("duckdb", schema_name)
        cols_info = duckdb_con.execute(
            "SELECT column_name, data_type, is_nullable FROM information_schema.columns "
            "WHERE table_schema=? AND table_name=?",
            (schema_name, table_name),
        ).fetchall()
        columns = []
        for col_name, data_type, is_nullable in cols_info:
            columns.append(
                {
                    col_name: {
                        "type": map_type(data_type),
                        "nullable": is_nullable.upper() == "YES",
                    }
                }
            )
        server._server.register_table("duckdb", schema_name, table_name, columns)

    server.start(catalog_emulation=True)

Status

  • ✅ Wire protocol support (simple + extended)
  • ✅ Query dispatching to Python
  • ✅ Thread based non-blocking query execution (long queries don't block)
  • ✅ DuckDB, Pandas, Polars compatibility
  • ✅ Limited SQL parsing on Rust side (forwarded to Python)
  • ✅ Optional TLS encryption
  • ✅ Integration with optional catalog emulation layer
  • 🟡 More examples
  • 🟡 Better logging, monitoring, observability

Installation

We currently have a pre release on pypi. You can install it with --pre tag.

pip install riffq --pre

Running Locally

Install the development requirements and run the test suite:

git clone git@github.com:ybrs/riffq.git
cd riffq
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

maturin build --profile=fast -i python3
pip install target/wheels/*.whl
# or maturin developer
make all-tests

The tests require the Rust extension to build successfully; any build failure will cause the suite to fail.


License

MIT or Apache 2.0 — your choice.


Contributing

Contributions are welcome! Especially for:

  • For the emulation layer, I am currently testing with

    So testing with other clients, especially BI tools are very welcomed.

  • Better Python DX

  • Example apps (data lake, feature store, etc.)


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

riffq-0.1.3.tar.gz (69.5 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

riffq-0.1.3-cp313-cp313-macosx_11_0_arm64.whl (35.1 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

riffq-0.1.3-cp312-cp312-manylinux_2_38_x86_64.whl (39.9 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.38+ x86-64

riffq-0.1.3-cp39-cp39-win_amd64.whl (33.9 MB view details)

Uploaded CPython 3.9Windows x86-64

File details

Details for the file riffq-0.1.3.tar.gz.

File metadata

  • Download URL: riffq-0.1.3.tar.gz
  • Upload date:
  • Size: 69.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for riffq-0.1.3.tar.gz
Algorithm Hash digest
SHA256 f00202337f09e33569ff9b5c8bada1da05ccc2826a42774d5b9a5ff4d8b789c2
MD5 1159bdc5e5760ed465026badf7aafad2
BLAKE2b-256 fff00462106dac03c7d6edac93b580fb0eb0b4de8dd7a1d980015e3dcb6e460b

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.3.tar.gz:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file riffq-0.1.3-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for riffq-0.1.3-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a267b6a2851566b4b5a03dc6a9b78eb8fdd886d677999f30038a2f6d764d62e1
MD5 3c1e4c9663a39d8ad7dfd58c5e6e0fac
BLAKE2b-256 a5686783d865f7ef7d746c194edd11b40696848806ed45195b30d4ba1e2001d6

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.3-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file riffq-0.1.3-cp312-cp312-manylinux_2_38_x86_64.whl.

File metadata

File hashes

Hashes for riffq-0.1.3-cp312-cp312-manylinux_2_38_x86_64.whl
Algorithm Hash digest
SHA256 4830995c767bd8e02a38b28cd4898f3a2e1adb7fb8024e60bb08b49249ded63d
MD5 554b8fa46a17ee42c121eef6f1730bc5
BLAKE2b-256 21002c45e22095bfc8565b31d9fffe8f8b001ca9f0592f21c8436237a4936ab6

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.3-cp312-cp312-manylinux_2_38_x86_64.whl:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file riffq-0.1.3-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: riffq-0.1.3-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 33.9 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for riffq-0.1.3-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 65c0874a2f8e8930e9f64fb58fb79ca7ac98c9abc3c6aed63d18d9661760c1e6
MD5 4c5aba062cf34acecf254314fa888d2e
BLAKE2b-256 b5a3228acc5e5cd1d334f98a2f9353eac51563c8365776226d063ff0beda87da

See more details on using hashes here.

Provenance

The following attestation bundles were made for riffq-0.1.3-cp39-cp39-win_amd64.whl:

Publisher: build-release.yml on ybrs/riffq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page