QuestDB Connect

This repository contains the official implementation of QuestDB's dialect for SQLAlchemy, as well as an engine specification for Apache Superset, using psycopg2 for database connectivity.

The Python module is available on PyPI: https://pypi.org/project/questdb-connect/

Psycopg2 is a widely used and trusted Python module for connecting to, and working with, QuestDB and other PostgreSQL-compatible databases.
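As a quick illustration, here is a minimal sketch of connecting to QuestDB directly with psycopg2, assuming a local instance with the default admin/quest credentials on PGWire port 8812 (the database name shown is illustrative; QuestDB does not enforce it):

import psycopg2

# Connect over the PostgreSQL wire protocol (QuestDB's default PGWire port is 8812).
conn = psycopg2.connect(
    host='localhost',
    port=8812,
    user='admin',
    password='quest',
    dbname='main',  # QuestDB accepts any database name here
)
cur = conn.cursor()
cur.execute('SELECT 1')
print(cur.fetchone())
cur.close()
conn.close()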

SQLAlchemy is a SQL toolkit and ORM library for Python. It provides a high-level API for communicating with relational databases, including schema creation and modification. The ORM layer abstracts away the complexities of the database, allowing developers to work with Python objects instead of raw SQL statements.

Apache Superset is an open-source business intelligence web application that enables users to visualize and explore data through customizable dashboards and reports. It provides a rich set of data visualizations, including charts, tables, and maps.

Requirements

  • Python 3.9 to 3.11 (Superset itself uses 3.9.x)
  • Psycopg2 ('psycopg2-binary~=2.9.6')
  • SQLAlchemy ('SQLAlchemy>=1.4')

You need to install these packages because questdb-connect depends on them. Note that questdb-connect v1.1 is compatible with both SQLAlchemy v1.4 and v2.0, while questdb-connect v1.0 is compatible with SQLAlchemy v1.4 only.
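For example, to pin questdb-connect v1.1 together with SQLAlchemy 2.0 (an illustrative set of version pins; adjust to your project):

pip install 'questdb-connect>=1.1' 'SQLAlchemy>=2.0' 'psycopg2-binary~=2.9.6'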

Versions 0.0.X

These versions were released for testing purposes only.

Installation

You can install this package using pip:

pip install questdb-connect

SQLAlchemy Sample Usage

Use the QuestDB dialect by specifying it in your SQLAlchemy connection string:

questdb://admin:quest@localhost:8812/main
questdb://admin:quest@host.docker.internal:8812/main

From that point on, use standard SQLAlchemy. An example with the raw SQL API:

import datetime
import time
import uuid
from sqlalchemy import create_engine, text

def main():
    engine = create_engine('questdb://admin:quest@localhost:8812/main')

    with engine.begin() as connection:
        # Create the table
        connection.execute(text("""
            CREATE TABLE IF NOT EXISTS signal (
                source SYMBOL,
                value DOUBLE,
                ts TIMESTAMP,
                uuid UUID
            ) TIMESTAMP(ts) PARTITION BY HOUR WAL;
        """))

        # Insert 2 rows
        connection.execute(text("""
            INSERT INTO signal (source, value, ts, uuid) VALUES
            (:source1, :value1, :ts1, :uuid1),
            (:source2, :value2, :ts2, :uuid2)
        """), {
            'source1': 'coconut', 'value1': 16.88993244, 'ts1': datetime.datetime.utcnow(), 'uuid1': uuid.uuid4(),
            'source2': 'banana', 'value2': 3.14159265, 'ts2': datetime.datetime.utcnow(), 'uuid2': uuid.uuid4()
        })

    # WAL is applied asynchronously, so we need to wait for it to be applied before querying
    time.sleep(1)

    # Start a new transaction
    with engine.begin() as connection:
        # Query the table for rows where value > 10
        result = connection.execute(
            text("SELECT source, value, ts, uuid FROM signal WHERE value > :value"),
            {'value': 10}
        )
        for row in result:
            print(row.source, row.value, row.ts, row.uuid)


if __name__ == '__main__':
    main()
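The time.sleep(1) above is a simplification. A more robust pattern is to poll QuestDB's wal_tables() system function until the writer transaction catches up with the sequencer. The sketch below assumes the writerTxn and sequencerTxn columns as documented for wal_tables(); verify the column names against your QuestDB version:

def wait_for_wal(connection, table_name, timeout=10.0):
    # Poll wal_tables() until all committed WAL transactions have been applied.
    deadline = time.time() + timeout
    while time.time() < deadline:
        row = connection.execute(
            text("SELECT writerTxn, sequencerTxn FROM wal_tables() WHERE name = :name"),
            {'name': table_name},
        ).first()
        if row is not None and row.writerTxn >= row.sequencerTxn:
            return
        time.sleep(0.1)
    raise TimeoutError(f'WAL still pending for table {table_name!r}')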

Alternatively, you can use the ORM API:

import datetime
import uuid
import time
from questdb_connect import Symbol, PartitionBy, UUID, Double, Timestamp, QDBTableEngine
from sqlalchemy import Column, MetaData, create_engine, text
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base(metadata=MetaData())


class Signal(Base):
    # Stored in the QuestDB table 'signal'. The table has WAL enabled, is partitioned by hour, and its designated timestamp is 'ts'.
    __tablename__ = 'signal'
    __table_args__ = (QDBTableEngine(None, 'ts', PartitionBy.HOUR, is_wal=True),)
    source = Column(Symbol)
    value = Column(Double)
    ts = Column(Timestamp)
    uuid = Column(UUID, primary_key=True)

    def __repr__(self):
        return f"Signal(source={self.source}, value={self.value}, ts={self.ts}, uuid={self.uuid})"


def main():
    engine = create_engine('questdb://admin:quest@localhost:8812/main')

    # Create the table
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()

    # Insert 2 rows
    session.add(Signal(
        source='coconut',
        value=16.88993244,
        ts=datetime.datetime.utcnow(),
        uuid=uuid.uuid4()
    ))

    session.add(Signal(
        source='banana',
        value=3.14159265,
        ts=datetime.datetime.utcnow(),
        uuid=uuid.uuid4()
    ))
    session.commit()

    # WAL is applied asynchronously, so we need to wait for it to be applied before querying
    time.sleep(1)

    # Query the table for rows where value > 10
    signals = session.query(Signal).filter(Signal.value > 10).all()
    for signal in signals:
        print(signal.source, signal.value, signal.ts, signal.uuid)


if __name__ == '__main__':
    main()

The ORM (Object-Relational Mapping) API is not recommended for QuestDB due to its fundamental design differences from traditional transactional databases. While ORMs excel at managing relationships and complex object mappings in systems like PostgreSQL or MySQL, QuestDB is specifically optimized for time-series data operations and high-performance ingestion. It intentionally omits certain SQL features that ORMs typically rely on, such as generated columns, foreign keys, and complex joins, in favor of time-series-specific optimizations.

For optimal performance and to fully leverage QuestDB's capabilities, we strongly recommend using the raw SQL API, which allows direct interaction with QuestDB's time-series-focused query engine and provides better control over time-based operations.
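For example, a time-based downsampling query using QuestDB's SAMPLE BY extension is direct with the raw SQL API (a sketch against the signal table created above):

with engine.begin() as connection:
    result = connection.execute(text("""
        SELECT ts, source, avg(value) AS avg_value
        FROM signal
        SAMPLE BY 1h
    """))
    for row in result:
        print(row.ts, row.source, row.avg_value)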

Primary Key Considerations

QuestDB differs from traditional relational databases in its handling of data uniqueness. While most databases enforce primary keys to guarantee unique record identification, QuestDB operates differently due to its time-series optimized architecture.

When using SQLAlchemy with QuestDB:

  • You can define primary keys in your SQLAlchemy models, but QuestDB won't enforce uniqueness for individual columns
  • Duplicate rows with identical primary key values can exist in the database
  • Data integrity must be managed at the application level
  • QuestDB supports deduplication during ingestion to avoid duplicate data; this can be enabled at table creation (see the sketch below)
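A minimal sketch of enabling deduplication at table creation with QuestDB's DEDUP UPSERT KEYS clause (the key columns here are illustrative; the designated timestamp must be among the keys):

connection.execute(text("""
    CREATE TABLE IF NOT EXISTS signal (
        source SYMBOL,
        value DOUBLE,
        ts TIMESTAMP
    ) TIMESTAMP(ts) PARTITION BY HOUR WAL
    DEDUP UPSERT KEYS(ts, source);
"""))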

Recommended Approaches

  1. Composite Keys + QuestDB Deduplication

Composite keys can be used to define uniqueness based on multiple columns. This approach:

  • Can combine timestamp with any number of additional columns
  • Works with QuestDB's deduplication capabilities
  • Useful for scenarios where uniqueness is defined by multiple attributes
  • Common combinations might include:
    • timestamp + device_id + metric_type
    • timestamp + location + sensor_id
    • timestamp + instrument_id + exchange + side

Deduplication is often enabled in QuestDB regardless of the primary key definition, since it is the mechanism that prevents duplicate data during ingestion.

Example:

from questdb_connect import QDBTableEngine, PartitionBy, Double, Timestamp, Symbol
from sqlalchemy import Column
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Measurement(Base):
    __tablename__ = 'measurement'
    __table_args__ = (QDBTableEngine(None, 'timestamp', PartitionBy.HOUR, is_wal=True),)
    timestamp = Column(Timestamp, primary_key=True)
    sensor_id = Column(Symbol, primary_key=True)
    location = Column(Symbol, primary_key=True)
    value = Column(Double)
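To pair this model with QuestDB's deduplication on an existing table, the matching UPSERT keys can be enabled with an ALTER TABLE statement (a sketch; QuestDB requires the designated timestamp among the keys):

with engine.begin() as connection:
    connection.execute(text(
        'ALTER TABLE measurement DEDUP ENABLE UPSERT KEYS(timestamp, sensor_id, location)'
    ))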

Choose your approach based on your data model and whether you need to leverage QuestDB's deduplication capabilities.

  2. UUID-based Identification

UUIDs are ideal for QuestDB applications because they:

  • Are globally unique across distributed systems
  • Can be generated client-side without database coordination
  • Work well with high-throughput data ingestion

Example:

from questdb_connect import Symbol, PartitionBy, UUID, Double, Timestamp, QDBTableEngine
from sqlalchemy import Column
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Signal(Base):
    __tablename__ = 'signal'
    __table_args__ = (QDBTableEngine(None, 'ts', PartitionBy.HOUR, is_wal=True),)
    source = Column(Symbol)
    value = Column(Double)
    ts = Column(Timestamp)
    uuid = Column(UUID, primary_key=True)
    # other columns...

Superset Installation

This repository also contains an engine specification for Apache Superset, which allows you to connect to QuestDB from within the Superset interface.

Follow the official QuestDB Superset guide available on the QuestDB website to install and configure the QuestDB engine in Superset.

Contributing

This package is open source and contributions are welcome. If you find a bug or would like to request a feature, please open an issue on the GitHub repository. Have a look at the instructions for developers if you would like to open a PR.
