Skip to main content

Q-strings: query anything

Project description

Q Strings

Once upon a time, there was a three-line class for running SQL queries as a method on a string.

class Q(str):
    def run(self):
        return duckdb.sql(self)

# usage
Q(my_query).run()

Installation

pip install qstrings  # base only
pip install qstrings[all]  # with ai and cli

Why?

What if the string itself was the * of the show? Q-strings are just strings carrying extra methods and attributes that unlock a few useful tricks.

Composable queries and engines

SQL-oriented python libraries usually pass a string to a SQL engine. If you routinely work with multiple databases, the variations in dialects and in implementation can pile up into something awkward.

Q-strings help in two ways. First, the queries are parsed into an AST (abstract syntax tree):

q = Q("SELECT 42 LIMIT 1")
q.ast  # abstract syntax tree

The parsing is performed with sqlglot that offers a rich collection SQL tools. Whenever you want to query different tables in different databases, it's easy to transpile and swap clauses:

q1 = Q("SELECT 42 FROM one_table")
q2 = q1.ast.from_("another_table").q()
assert q2 == "SELECT 42 FROM another_table"
assert q.transpile(read="duckdb", write="tsql") == Q("SELECT TOP 1 42")

The second trick is the template pattern for easily defining engines. The default query engine is DuckDB, q.run() and q.run(engine="duckdb") are equivalent and will execute a DuckDB query. It is obviously impossible to cover all possible scenarios; hard-coding the engine selection logic if engine == "this": run_that() only gets you so far. To make another engine, subclass Engine and write the run method. The new engine becomes available right away, at runtime, thanks to autoregistry.

from qstrings import Engine, Q

class FunnyDuckDBEngine(Engine):
    """The engine name will be "FunnyDuckDB" (dropping 'Engine', not case sensitive)."""
    def run(q: Q):
        result = duckdb.sql(q)
        funny = "lol, running a funny query"
        return result, 
        
Q("SELECT 42 AS haha").run(engine="FunnyDuckDB")

A more serious example might be this implementation for Athena. It uses awswrangler, handles different profiles, and sets certain options:

class AthenaEngine(Engine):
    def df(self, **kwargs) -> pd.DataFrame:
        import awswrangler as wr
        import boto3

        if not boto3.DEFAULT_SESSION or kwargs.get("profile"):
            boto3.setup_default_session(
                region_name=kwargs.get("region", "us-east-1"),
                profile_name=kwargs.get("profile")
            )
        df = wr.athena.read_sql_query(
            self,
            database=kwargs.get("database", "default"),
            ctas_approach=kwargs.get("ctas", False),
            keep_files=kwargs.get("keep_files", False),
        )
    return df

Q("SHOW CREATE TABLE db.table").run(engine="athena")

LLM queries

Q strings can do more than SQL! The class HFEngine uses OpenAI-compatible API and secret HF_API_KEY.

Q("Pick a number from 1 to 50").run(engine="hf", model="openai/gpt-oss-20b:fireworks-ai")
# the answer is not 42!

Also, check out SQL+LLM query language BlendSQL.

QoL features

1. Format strings with variables

from qstrings import Q

# easily swap tables, filters, column lists
Q("SELECT * FROM {table}", table="prod.whatever").run()

# or use f-strings
value = 42
Q(f"SELECT * FROM prod.whatever WHERE col = {value}").run()

# or both
cols = ",".join(["col1","col2"])
Q(f"SELECT col0,{cols} FROM prod.whatever WHERE col = {value}", value=1).run()

# use SQL templates
Q(file="tests/test_format.sql", num=42)

# variables can also come from env vars (keyword args take precedence)
Q("""
CREATE SECRET my_secret (
    TYPE s3,
    KEY_ID '{KEY_ID}'
    SECRET '{SECRET_KEY}'
)
""").run()

# easy to write unit tests
def test_keys_given_in_env():"
    s = "SELECT {num} AS {Q_name}"
    os.environ["Q_name"] = "answer"
    q = Q(s, num=42)
    assert q == "SELECT 42 AS answer"
    assert q.refs == {"num": 42, "Q_name": "answer"}
    assert q.file is None

2. Quick access to COUNT and LIMIT

When working on a potentially long-running query, you may want to check the count first, or returning a few rows.

from qstrings import Q
Q("SELECT * FROM table").count.run()
Q("SELECT * FROM table").limit(9).run()

CLI

Installing qstrings[all] or qstrings[cli] gives access to executable q, powered by the excellent CLI library cyclopts:

q --help

Usage: run-query [ARGS] [OPTIONS]


Query anything!

╭─ Commands ─────────────────────────────────────────────────╮
│ --help -h  Display this message and exit.                  │
│ --version  Display application version.                    │
╰────────────────────────────────────────────────────────────╯
╭─ Parameters ───────────────────────────────────────────────╮
│ QUERY --query           Query string                       │
│ --file              -f  File template                      │
│ --engine            -e  Query engine [default: duckdb]     │
│ --model             -m  HuggingFace model [default:        │
│                         openai/gpt-oss-20b:fireworks-ai]   │
│                     -o  Output format (defaults to         │
│                         whatever Engine.run() returns)     │
│                         [choices: engine, csv, list, line] │
│                         [default: engine]                  │
│ --LIMIT             -L  Limit rows                         │
│ --COUNT --no-COUNT  -C  Return row count only [default:    │
│                         False]                             │
│ --[KEYWORD]                                                │
╰────────────────────────────────────────────────────────────╯

The crown of creation: pipe results of an LLM query to a SQL query. The prompt provides an affinity data model and asks to find elements with 8 or more isotopes.

export URL="https://raw.githubusercontent.com/liquidcarbon/chembiodata/main/isotopes.csv"

q -f tests/test_prompt1.md -e hf | q -o csv
/* 250825@22:48:46.297|DEBUG|qstrings[0.2.1].Q.run:190|input_tokens=209 */
/* 250825@22:48:46.297|DEBUG|qstrings[0.2.1].Q.run:191|output_tokens=611 */
/* 250825@22:48:46.864|INFO|qstrings[0.2.1].Q.run:151|4 rows x 2 cols in 0.0000 sec */
Symbol,num_isotopes
Sn,10
Xe,9
Te,8
Cd,8

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

qstrings-0.2.4.tar.gz (8.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

qstrings-0.2.4-py3-none-any.whl (8.3 kB view details)

Uploaded Python 3

File details

Details for the file qstrings-0.2.4.tar.gz.

File metadata

  • Download URL: qstrings-0.2.4.tar.gz
  • Upload date:
  • Size: 8.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for qstrings-0.2.4.tar.gz
Algorithm Hash digest
SHA256 ed5c5008366c91f29173059f268569e337640b2b27bb25299876cf7c2c71f654
MD5 0e3c46cbc3f056e416e9b9e12639f2bc
BLAKE2b-256 b73386cbc7cb65b6abfdc1b661663b432bf55784b1285876b820ccdef4093500

See more details on using hashes here.

Provenance

The following attestation bundles were made for qstrings-0.2.4.tar.gz:

Publisher: pypi-release.yml on liquidcarbon/qstrings

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file qstrings-0.2.4-py3-none-any.whl.

File metadata

  • Download URL: qstrings-0.2.4-py3-none-any.whl
  • Upload date:
  • Size: 8.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for qstrings-0.2.4-py3-none-any.whl
Algorithm Hash digest
SHA256 31c46768caa9eb4d8e94747e588fa41439fa0b30ec1ce25f80933c6d222ad8b3
MD5 140a43d929cd5e83e9bf0165471dd83c
BLAKE2b-256 8ab77e4001fb406a23582860c7f0358cfe3ddbf61830a9bd56045cf996cac7b4

See more details on using hashes here.

Provenance

The following attestation bundles were made for qstrings-0.2.4-py3-none-any.whl:

Publisher: pypi-release.yml on liquidcarbon/qstrings

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page