riffq is a toolkit for building PostgreSQL wire-compatible databases. Think exposing pandas/polars dataframe or duckdb with PostgreSQL wire protocol with pg_catalog compatibility.
Project description
riffq
riffq is a toolkit in python (built in Rust) for building PostgreSQL wire-compatible databases.
It allows you to serve data from Python over the PostgreSQL protocol — turning your Python-based logic or in-memory data into a queryable, network-exposed system. We also have a catalog emulation system in rust with datafusion.
What It Does
- Implements the PostgreSQL wire protocol in Rust for performance and concurrency
- Sends raw SQL queries (Simple or Extended protocol) to Python for interpretation
- Implements postgres catalog compatibility layer, see
pg_catalog_rs
Since you are in python, you can
- Allows you to connect to remote data sources (e.g., analytics DB, CRM) and expose them as a unified PostgreSQL database
- Enables serving Pandas DataFrames over the network as virtual SQL tables
- Can delegate SQL execution to DuckDB, Polars, or any other Python engine
- Acts as a programmable federated query engine or custom data service
Example Use Cases
- Serve a Pandas DataFrame as a PostgreSQL table to BI tools
- Build a custom federated engine from multiple APIs or databases
- Implement your own data lake query frontend
- Expose dynamic ML feature stores for training or real-time inference
- Provide fine-grained, code-controlled access to internal metrics or logs
Example
import logging
import duckdb
import pyarrow as pa
import riffq
logging.basicConfig(level=logging.DEBUG)
class Connection(riffq.BaseConnection):
def handle_auth(self, user, password, host, database=None, callback=callable):
# simple username/password check
callback(user == "user" and password == "secret")
def handle_connect(self, ip, port, callback=callable):
# allow every incoming connection
callback(True)
def handle_disconnect(self, ip, port, callback=callable):
# invoked when client disconnects
callback(True)
def _handle_query(self, sql, callback, **kwargs):
cur = duckdb_con.cursor()
try:
if sql.strip().lower() == "select err":
# custom error returned to client
callback(("ERROR", "42846", "bad type"), is_error=True)
return
reader = cur.execute(sql).fetch_record_batch()
self.send_reader(reader, callback)
except Exception as exc:
logging.exception("error on executing query")
batch = self.arrow_batch(
[pa.array(["ERROR"]), pa.array([str(exc)])],
["error", "message"],
)
self.send_reader(batch, callback)
def handle_query(self, sql, callback=callable, **kwargs):
self.executor.submit(self._handle_query, sql, callback, **kwargs)
def main():
global duckdb_con
duckdb_con = duckdb.connect()
duckdb_con.execute(
"""
CREATE VIEW klines AS
SELECT *
FROM 'data/klines.parquet'
"""
)
server = riffq.RiffqServer("127.0.0.1:5433", connection_cls=Connection)
server.set_tls("certs/server.crt", "certs/server.key")
server.start(tls=True)
if __name__ == "__main__":
main()
The Rust side calls this Python handler when a SQL query comes in via the PostgreSQL protocol.
Architecture
- Rust layer handles:
- PostgreSQL protocol (via
pgwire) - Connection management
- Query routing
- Metadata compatibility (
pg_catalogemulation)
- PostgreSQL protocol (via
- Python layer handles:
- SQL execution (via any engine: DuckDB, Polars, etc.)
- Data transformation
- Custom logic and dynamic schema definitions
Zero Copy
We try to achieve zero-copy by using arrow/pycapsule. So data from duckdb comes to python as a pycapsule pointer, which goes to thread in python which goes to the callback in rust still as a pycapsule pointer. We then stream to network with postgresql using pgwire.
https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html
Getting Started
Install the package
pip install riffq --pre
Extend BaseConnection class
You can extend the base class for
- handle_query
- handle_auth - To check for username/password
- handle_connect - To check ip/port restriction
- handle_disconnect - WIP
class Connection(riffq.BaseConnection):
def handle_auth(self, user, password, host, database=None, callback=callable):
# simple username/password check
callback(user == "user" and password == "secret")
def handle_connect(self, ip, port, callback=callable):
# allow every incoming connection
callback(True)
def handle_disconnect(self, ip, port, callback=callable):
# invoked when client disconnects
callback(True)
def _handle_query(self, sql, callback, **kwargs):
cur = duckdb_con.cursor()
try:
if sql.strip().lower() == "select err":
# custom error returned to client
callback(("ERROR", "42846", "bad type"), is_error=True)
return
reader = cur.execute(sql).fetch_record_batch()
self.send_reader(reader, callback)
except Exception as exc:
logging.exception("error on executing query")
batch = self.arrow_batch(
[pa.array(["ERROR"]), pa.array([str(exc)])],
["error", "message"],
)
self.send_reader(batch, callback)
def handle_query(self, sql, callback=callable, **kwargs):
self.executor.submit(self._handle_query, sql, callback, **kwargs)
Start the server
global duckdb_con
duckdb_con = duckdb.connect()
duckdb_con.execute(
"""
CREATE VIEW klines AS
SELECT *
FROM 'data/klines.parquet'
"""
)
server = riffq.RiffqServer("127.0.0.1:5433", connection_cls=Connection)
server.set_tls("certs/server.crt", "certs/server.key")
server.start()
You can check server implementations on test_concurrency/ and example/ directory
Then connect using any PostgreSQL client:
psql -h localhost -p 5433
Enabling TLS
Generate a temporary certificate and key:
openssl req -newkey rsa:2048 -nodes -keyout server.key -x509 -days 1 -out server.crt -subj "/CN=localhost"
Enabling Catalog Emulation
Postgresql clients sends queries to pg_catalog schema to find out databases, schemas, tables, columns.
We have this (datafusion_pg_catalog)[https://github.com/ybrs/pg_catalog] for this purpose.
You can register your own database and your tables.
For example
server = riffq.RiffqServer(f"127.0.0.1:{port}", connection_cls=Connection)
server.set_tls("certs/server.crt", "certs/server.key")
server._server.register_database("duckdb")
tbls = duckdb_con.execute(
"SELECT table_schema, table_name FROM information_schema.tables "
"WHERE table_schema NOT IN ('pg_catalog','information_schema')"
).fetchall()
for schema_name, table_name in tbls:
server._server.register_schema("duckdb", schema_name)
cols_info = duckdb_con.execute(
"SELECT column_name, data_type, is_nullable FROM information_schema.columns "
"WHERE table_schema=? AND table_name=?",
(schema_name, table_name),
).fetchall()
columns = []
for col_name, data_type, is_nullable in cols_info:
columns.append(
{
col_name: {
"type": map_type(data_type),
"nullable": is_nullable.upper() == "YES",
}
}
)
server._server.register_table("duckdb", schema_name, table_name, columns)
server.start(catalog_emulation=True)
Status
- ✅ Wire protocol support (simple + extended)
- ✅ Query dispatching to Python
- ✅ Thread based non-blocking query execution (long queries don't block)
- ✅ DuckDB, Pandas, Polars compatibility
- ✅ Limited SQL parsing on Rust side (forwarded to Python)
- ✅ Optional TLS encryption
- ✅ Integration with optional catalog emulation layer
- 🟡 More examples
- 🟡 Better logging, monitoring, observability
Installation
We currently have a pre release on pypi. You can install it with --pre tag.
pip install riffq --pre
Running Locally
Install the development requirements and run the test suite:
git clone git@github.com:ybrs/riffq.git
cd riffq
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
maturin build --profile=fast -i python3
pip install target/wheels/*.whl
# or maturin developer
make all-tests
The tests require the Rust extension to build successfully; any build failure will cause the suite to fail.
License
MIT or Apache 2.0 — your choice.
Contributing
Contributions are welcome! Especially for:
-
For the emulation layer, I am currently testing with
- Intellij Datagrid
- Dbeaver
- psql cli
- Vscode Postgresql Extension
So testing with other clients, especially BI tools are very welcomed.
-
Better Python DX
-
Example apps (data lake, feature store, etc.)
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file riffq-0.1.7.tar.gz.
File metadata
- Download URL: riffq-0.1.7.tar.gz
- Upload date:
- Size: 86.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2ba994480199c2030835f8acaf124084df6a15faa78d819b4ca596d7fce3cd60
|
|
| MD5 |
8e712371dbca631c9903a9939ed38b08
|
|
| BLAKE2b-256 |
3283a60cd6ed1dae290d9f8fdbb4cf3a7095b52e29262bd44a99fc5caa7ecb3c
|
Provenance
The following attestation bundles were made for riffq-0.1.7.tar.gz:
Publisher:
build-release.yml on ybrs/riffq
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
riffq-0.1.7.tar.gz -
Subject digest:
2ba994480199c2030835f8acaf124084df6a15faa78d819b4ca596d7fce3cd60 - Sigstore transparency entry: 592280356
- Sigstore integration time:
-
Permalink:
ybrs/riffq@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Branch / Tag:
refs/tags/release-0.1.8 - Owner: https://github.com/ybrs
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
build-release.yml@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Trigger Event:
push
-
Statement type:
File details
Details for the file riffq-0.1.7-cp313-cp313-macosx_11_0_arm64.whl.
File metadata
- Download URL: riffq-0.1.7-cp313-cp313-macosx_11_0_arm64.whl
- Upload date:
- Size: 35.1 MB
- Tags: CPython 3.13, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
65722f4e9443d04f6a4cd98865a00772d7d9aea026471899d431de14620d0279
|
|
| MD5 |
1723528338f23c7bdba0eb130aedf3b1
|
|
| BLAKE2b-256 |
c2bc635f404a9a25ee8a835cc7c8059bbea8e785bed2875f00221cb84b68159d
|
Provenance
The following attestation bundles were made for riffq-0.1.7-cp313-cp313-macosx_11_0_arm64.whl:
Publisher:
build-release.yml on ybrs/riffq
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
riffq-0.1.7-cp313-cp313-macosx_11_0_arm64.whl -
Subject digest:
65722f4e9443d04f6a4cd98865a00772d7d9aea026471899d431de14620d0279 - Sigstore transparency entry: 592280362
- Sigstore integration time:
-
Permalink:
ybrs/riffq@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Branch / Tag:
refs/tags/release-0.1.8 - Owner: https://github.com/ybrs
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
build-release.yml@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Trigger Event:
push
-
Statement type:
File details
Details for the file riffq-0.1.7-cp312-cp312-manylinux_2_38_x86_64.whl.
File metadata
- Download URL: riffq-0.1.7-cp312-cp312-manylinux_2_38_x86_64.whl
- Upload date:
- Size: 39.6 MB
- Tags: CPython 3.12, manylinux: glibc 2.38+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
99c9e96b901d59a749e886f46a5b748fd796a981e0bdd87350a621474fd6fb9b
|
|
| MD5 |
3ae9c77c1b1d61a26a16f3383a53e228
|
|
| BLAKE2b-256 |
5798d7802c9d90a9b053c0c194afed98748297a12a2ba81998bb06192361a96d
|
Provenance
The following attestation bundles were made for riffq-0.1.7-cp312-cp312-manylinux_2_38_x86_64.whl:
Publisher:
build-release.yml on ybrs/riffq
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
riffq-0.1.7-cp312-cp312-manylinux_2_38_x86_64.whl -
Subject digest:
99c9e96b901d59a749e886f46a5b748fd796a981e0bdd87350a621474fd6fb9b - Sigstore transparency entry: 592280363
- Sigstore integration time:
-
Permalink:
ybrs/riffq@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Branch / Tag:
refs/tags/release-0.1.8 - Owner: https://github.com/ybrs
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
build-release.yml@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Trigger Event:
push
-
Statement type:
File details
Details for the file riffq-0.1.7-cp39-cp39-win_amd64.whl.
File metadata
- Download URL: riffq-0.1.7-cp39-cp39-win_amd64.whl
- Upload date:
- Size: 34.1 MB
- Tags: CPython 3.9, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
bab9590a23cd097f41730c5a5fab5642b41919661b361d148b729215a1fed011
|
|
| MD5 |
a37758557e9b6fe0e761141710e2afb3
|
|
| BLAKE2b-256 |
c7c465873022af408cf588d3f93f965ad2d6f5706110040d1fd8fd433e2c9337
|
Provenance
The following attestation bundles were made for riffq-0.1.7-cp39-cp39-win_amd64.whl:
Publisher:
build-release.yml on ybrs/riffq
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
riffq-0.1.7-cp39-cp39-win_amd64.whl -
Subject digest:
bab9590a23cd097f41730c5a5fab5642b41919661b361d148b729215a1fed011 - Sigstore transparency entry: 592280360
- Sigstore integration time:
-
Permalink:
ybrs/riffq@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Branch / Tag:
refs/tags/release-0.1.8 - Owner: https://github.com/ybrs
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
build-release.yml@5296a0a095603cdbbedfa68ada797d8a76574b18 -
Trigger Event:
push
-
Statement type: