Skip to main content

Programmatically generate SQL queries for different database platforms.

Project description

pikaQ

This library is heavily inspired by PyPika, so a lot of the syntax is based on how PyPika works. Just like PyPika, PikaQ replaces handwritten SQL queries with programmatic construction of queries. The main difference is that pikaQ is designed solve the problem of writing the same code that can be parsed to different SQL dialects. This is done by using a common syntax that is then translated to the specific dialect.

This library provides the core components and implementation for constructing the Select query and the core mechanism for parsing it into different dialect. It does not offer complete coverage of SQL syntax nor the detailed implementation of all the dialects. However, it should be easy to extend the library to support more SQL syntax and other types of queries you want to construct.

Validation of SQL correctness is not an explicit goal of PikaQ. You are encouraged to check inputs you provide to PyPika or appropriately handle errors raised from your SQL database - just as you would have if you were writing SQL yourself.

Install

pip install pikaQ

How to use

For example, if we want to write the same query in Spark SQL and AWS Athena, we might encounter this problem: we have ADD_MONTHS function in Spark SQL, but in AWS Athena (Presto) we don’t have this function.

We can define an ADD_MONTHS function in the following way:

from pikaQ.terms import custom_func
from pikaQ.queries import Query


@custom_func
def add_months(column, value, dialect='spark'):
    if dialect == 'athena':
        return f"DATE_ADD('month', {value}, {column})"
    elif dialect == 'spark':
        return f'ADD_MONTH({column}, {value})'
    else:
        raise ValueError(f'Unsupported dialect: {dialect}')


q = (Query.from_('table')
        .select('col1', add_months('col2', 3).as_('col2'))
)

Then we can generate the query for Spark SQL and AWS Athena:

print(q.get_sql(dialect='athena'))
select col1, DATE_ADD('month', 3, col2) AS col2
from table
print(q.get_sql(dialect='spark'))
select col1, ADD_MONTH(col2, 3) AS col2
from table

A more complex example to show how the syntax works:

from pikaQ.queries import Query, Table, Field, AliasedQuery
from pikaQ.terms import Case, Preceding, CURRENT_ROW
import pikaQ.functions as fn


a = Table('tbl1').as_('a')
b = Table('tbl2').as_('b')
s = AliasedQuery('s')
m = AliasedQuery('m')
v = AliasedQuery('v')

q0 = (Query
      .from_(a)
      .select(
         a.col1,
         a.col2,
         fn.Sum(a.col3).over(b.col2).rows(Preceding(3), CURRENT_ROW).as_('total'), 
         fn.RowNumber().over(b.col2).orderby(b.col4).as_('row_num')
      ).distinct()
      .where((Field('col2')-100>2) & (Field('col3')/9<=1))
      .orderby(b.col2)
)
q1 = (Query
      .from_(b)
      .select(b.col1, b.col2, fn.Avg(b.col3).as_('avg'))
      .groupby(b.col1, b.col2)
      .having(fn.Count(b.col3)>2)
)

q = (Query
     .with_(q0, 's')
     .with_(Query
         .from_(s)
         .select(s.star())
         .where(s.row_num == 1), 'm')
     .with_(q1, 'v')
     .from_('v')
     .join('m')
        .on(
            (v.col1 == m.col1) &
            (v.col2 == m.col2)
            )
     .select(
         v.col1, v.col2, v.avg,
         Case().when(m.total>100, 1).else_(0).as_('flag')
         )
)

print(q.get_sql())
with s as (
select distinct a.col1, a.col2, SUM(a.col3) OVER (PARTITION BY b.col2 ROWS BETWEEN 3 PRECEDING AND CURRENT_ROW) AS total, ROW_NUMBER() OVER (PARTITION BY b.col2 ORDER BY b.col4) AS row_num
from tbl1 as a
where col2 - 100 > 2 and col3 / 9 <= 1
order by b.col2)

, m as (
select s.*
from s
where s.row_num = 1)

, v as (
select b.col1, b.col2, AVG(b.col3)
from tbl2 as b
group by b.col1, b.col2
having COUNT(b.col3) > 2)

select v.col1, v.col2, v.avg, CASE
WHEN m.total > 100 THEN 1
ELSE 0
END AS flag
from v
join m on v.col1 = m.col1 and v.col2 = m.col2

For more syntax examples, please refer to the docs.

Extension

One can use the core components and logic implemented in this library to extend the functionality to support more SQL syntax and other types of queries. For details of how to extend the SelectQuery to support more clauses, please refer to the Query section of the docs.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pikaQ-1.0.2.tar.gz (19.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pikaQ-1.0.2-py3-none-any.whl (18.6 kB view details)

Uploaded Python 3

File details

Details for the file pikaQ-1.0.2.tar.gz.

File metadata

  • Download URL: pikaQ-1.0.2.tar.gz
  • Upload date:
  • Size: 19.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for pikaQ-1.0.2.tar.gz
Algorithm Hash digest
SHA256 3dbcc46594eb91196b0204beb73ed3d247c351b8a4f4f50694284fa920d5629f
MD5 45091adf89a895607fcc092758607b04
BLAKE2b-256 02430b54b941c9c610813adcc38ac9daf1d7ec9dc08d6b053ba9fa620f266ada

See more details on using hashes here.

File details

Details for the file pikaQ-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: pikaQ-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for pikaQ-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0fdaf158e80c254f30a1920da474297c90c6021ffd8a8dcf567be0671a010e70
MD5 297a3a3f9ba51335aad14fbbc1077a87
BLAKE2b-256 2134daa4daf3b4db714a65e6341251bc35f2e2700bd85d041a7306f73096b62d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page