Client for the Trino distributed SQL Engine

Trino Python client

Client for Trino, a distributed SQL engine for interactive and batch big data processing. It provides a low-level client, a DBAPI 2.0 implementation, and a SQLAlchemy adapter. It supports Python >= 3.8 and PyPy.

Development

See DEVELOPMENT for information about code style, development process, and guidelines.

See CONTRIBUTING for contribution requirements.

Usage

The Python Database API (DBAPI)

Installation

$ pip install trino

Quick Start

Use the DBAPI interface to query Trino:

If host is a valid URL, the port and HTTP scheme are determined automatically. For example, https://my-trino-server:9999 sets the http_scheme property to https and the port to 9999.

from trino.dbapi import connect

conn = connect(
    host="<host>",
    port=<port>,
    user="<username>",
    catalog="<catalog>",
    schema="<schema>",
)
cur = conn.cursor()
cur.execute("SELECT * FROM system.runtime.nodes")
rows = cur.fetchall()

This queries the system.runtime.nodes system table, which lists the nodes in the Trino cluster.
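As a rough illustration of the host-URL shortcut mentioned before the Quick Start example, the sketch below splits a URL into the scheme, host name, and port that would otherwise be passed as separate arguments. It uses only urllib.parse from the standard library. The helper name split_host_url is hypothetical, and this is not the client's own parsing code.

```python
from urllib.parse import urlparse


def split_host_url(url):
    """Split a URL into (scheme, hostname, port).

    Hypothetical helper for illustration; not part of the trino package.
    """
    parsed = urlparse(url)
    return parsed.scheme, parsed.hostname, parsed.port


scheme, hostname, port = split_host_url("https://my-trino-server:9999")
print(scheme, hostname, port)
```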

The DBAPI implementation in trino.dbapi also provides methods that retrieve fewer rows, for example Cursor.fetchone() or Cursor.fetchmany(). By default, Cursor.fetchmany() fetches one row; set trino.dbapi.Cursor.arraysize to change the batch size.
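The batching behaviour described above can be sketched with any DBAPI 2.0 cursor. The example below uses the standard library's sqlite3 module purely as a stand-in so that it runs without a Trino server; the helper iter_batches is hypothetical and assumes only the arraysize and fetchmany() behaviour defined by DBAPI 2.0.

```python
import sqlite3


def iter_batches(cursor, batch_size):
    """Yield lists of rows containing at most batch_size rows each."""
    # fetchmany() without an argument fetches cursor.arraysize rows.
    cursor.arraysize = batch_size
    while True:
        rows = cursor.fetchmany()
        if not rows:
            break
        yield rows


# sqlite3 is only a stand-in here; a trino.dbapi cursor would be used the same way.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE nodes (node_id INTEGER)")
cur.executemany("INSERT INTO nodes VALUES (?)", [(i,) for i in range(5)])
cur.execute("SELECT node_id FROM nodes ORDER BY node_id")

batch_sizes = [len(batch) for batch in iter_batches(cur, 2)]
print(batch_sizes)
```

Processing rows batch by batch keeps memory use bounded for large result sets, whereas fetchall() materializes every row at once.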

SQLAlchemy

Prerequisite

  • Trino server >= 351

Compatibility

trino.sqlalchemy is compatible with the latest 1.3.x, 1.4.x and 2.0.x SQLAlchemy versions at the time of release of a particular version of the client.

Installation

$ pip install trino[sqlalchemy]

Usage

To connect to Trino using SQLAlchemy, use a connection string (URL) following this pattern:

trino://<username>:<password>@<host>:<port>/<catalog>/<schema>

NOTE: password and schema are optional
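A minimal sketch of assembling a connection string in the pattern above, with the optional password and schema left out when they are not given. The helper build_trino_url is hypothetical. Percent-encoding the username and password is the usual way to keep reserved characters such as @ or / from breaking a URL.

```python
from urllib.parse import quote


def build_trino_url(user, host, port, catalog, schema=None, password=None):
    """Build trino://<username>:<password>@<host>:<port>/<catalog>/<schema>.

    Hypothetical helper for illustration; password and schema are optional.
    """
    credentials = quote(user, safe="")
    if password is not None:
        # Percent-encode reserved characters so they cannot be mistaken for delimiters.
        credentials += ":" + quote(password, safe="")
    url = f"trino://{credentials}@{host}:{port}/{catalog}"
    if schema is not None:
        url += f"/{schema}"
    return url


print(build_trino_url("user", "localhost", 8080, "system"))
print(build_trino_url("user", "localhost", 8080, "system", "runtime", "p@ss/word"))
```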

Examples:

from sqlalchemy import create_engine
from sqlalchemy.schema import Table, MetaData
from sqlalchemy.sql.expression import select, text

engine = create_engine('trino://user@localhost:8080/system')
connection = engine.connect()

rows = connection.execute(text("SELECT * FROM runtime.nodes")).fetchall()

# or using SQLAlchemy schema
nodes = Table(
    'nodes',
    MetaData(schema='runtime'),
    # autoload_with alone triggers reflection; autoload=True was removed in SQLAlchemy 2.0
    autoload_with=engine
)
rows = connection.execute(select(nodes)).fetchall()

To pass additional connection attributes, use the connect_args parameter of create_engine. Attributes can also be passed in the connection string.

from sqlalchemy import create_engine
from trino.sqlalchemy import URL

engine = create_engine(
    URL(
        host="localhost",
        port=8080,
        catalog="system"
    ),
    connect_args={
        "session_properties": {'query_max_run_time': '1d'},
        "client_tags": ["tag1", "tag2"],
        "roles": {"catalog1": "role1"},
    }
)

# or in connection string
engine = create_engine(
    'trino://user@localhost:8080/system?'
    'session_properties={"query_max_run_time": "1d"}'
    '&client_tags=["tag1", "tag2"]'
    '&roles={"catalog1": "role1"}'
)

# or using the URL factory method
engine = create_engine(URL(
    host="localhost",
    port=8080,
    client_tags=["tag1", "tag2"]
))
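The connection-string form above carries each attribute as JSON text in the query string. The sketch below generates that same string from Python objects with json.dumps. The helper build_query_string is hypothetical, and values that contain reserved URL characters such as & would additionally need percent-encoding, which is not shown here.

```python
import json


def build_query_string(**attributes):
    """Render keyword arguments as name=<JSON> pairs joined by '&'.

    Hypothetical helper for illustration; not part of the trino package.
    """
    return "&".join(
        f"{name}={json.dumps(value)}" for name, value in attributes.items()
    )


url = "trino://user@localhost:8080/system?" + build_query_string(
    session_properties={"query_max_run_time": "1d"},
    client_tags=["tag1", "tag2"],
    roles={"catalog1": "role1"},
)
print(url)
```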

Authentication mechanisms

Basic authentication

The BasicAuthentication class can be used to connect to a Trino cluster configured with the Password file, LDAP or Salesforce authentication type:

  • DBAPI

    from trino.dbapi import connect
    from trino.auth import BasicAuthentication
    
    conn = connect(
        user="<username>",
        auth=BasicAuthentication("<username>", "<password>"),
        http_scheme="https",
        ...
    )
    
  • SQLAlchemy

    from sqlalchemy import create_engine
    
    engine = create_engine("trino://<username>:<password>@<host>:<port>/<catalog>")
    
    # or as connect_args
    from trino.auth import BasicAuthentication
    engine = create_engine(
        "trino://<username>@<host>:<port>/<catalog>",
        connect_args={
            "auth": BasicAuthentication("<username>", "<password>"),
            "http_scheme": "https",
        }
    )
    

JWT authentication

The JWTAuthentication class can be used to connect to a Trino cluster configured with the JWT authentication type:

  • DBAPI

    from trino.dbapi import connect
    from trino.auth import JWTAuthentication
    
    conn = connect(
        user="<username>",
        auth=JWTAuthentication("<jwt_token>"),
        http_scheme="https",
        ...
    )
    
  • SQLAlchemy

    from sqlalchemy import create_engine
    
    engine = create_engine("trino://<username>@<host>:<port>/<catalog>/<schema>?access_token=<jwt_token>")
    
    # or as connect_args
    from trino.auth import JWTAuthentication
    engine = create_engine(
        "trino://<username>@<host>:<port>/<catalog>",
        connect_args={
            "auth": JWTAuthentication("<jwt_token>"),
            "http_scheme": "https",
        }
    )
    

OAuth2 authentication

The OAuth2Authentication class can be used to connect to a Trino cluster configured with the OAuth2 authentication type.

A callback to handle the redirect URL can be provided via the redirect_auth_url_handler parameter of the trino.auth.OAuth2Authentication class. By default, the client tries to launch a web browser (trino.auth.WebBrowserRedirectHandler) to go through the authentication flow and also prints the redirect URL to stdout (trino.auth.ConsoleRedirectHandler). Multiple redirect handlers are combined using the trino.auth.CompositeRedirectHandler class.

The OAuth2 token is cached either per trino.auth.OAuth2Authentication instance and username or, when keyring is installed, in a secure backend (macOS Keychain, Windows Credential Locker, etc.) under a key that includes the host of the Trino connection. Keyring can be installed using pip install 'trino[external-authentication-token-cache]'.

WARNING: If username is not specified, the OAuth2 token cache is shared and stored per host.
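A custom redirect callback might look like the sketch below. It assumes that the client invokes the handler with the redirect URL as its only argument; that calling convention is an assumption here, not something confirmed by this document, and the class name RecordingRedirectHandler is hypothetical. The sketch runs without the trino package by simulating the call.

```python
class RecordingRedirectHandler:
    """Hypothetical callback that records the redirect URL instead of opening a browser."""

    def __init__(self):
        self.urls = []

    def __call__(self, url):
        # Assumption: the handler receives the redirect URL as a single string.
        self.urls.append(url)
        print(f"Open this URL to authenticate: {url}")


handler = RecordingRedirectHandler()
# Simulate the client invoking the callback with a placeholder URL.
handler("https://idp.example.com/authorize")

# With the trino package installed, the handler would be passed as:
#   OAuth2Authentication(redirect_auth_url_handler=handler)
```

A handler like this can be useful in headless environments where launching a web browser is not possible.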

  • DBAPI

    from trino.dbapi import connect
    from trino.auth import OAuth2Authentication
    
    conn = connect(
        user="<username>",
        auth=OAuth2Authentication(),
        http_scheme="https",
        ...
    )
    
  • SQLAlchemy

    from sqlalchemy import create_engine
    from trino.auth import OAuth2Authentication
    
    engine = create_engine(
        "trino://<username>@<host>:<port>/<catalog>",
        connect_args={
            "auth": OAuth2Authentication(),
            "http_scheme": "https",
        }
    )
    

Certificate authentication

The CertificateAuthentication class can be used to connect to a Trino cluster configured with certificate-based authentication. CertificateAuthentication requires paths to a valid client certificate and private key.

  • DBAPI

    from trino.dbapi import connect
    from trino.auth import CertificateAuthentication
    
    conn = connect(
        user="<username>",
        auth=CertificateAuthentication("/path/to/cert.pem", "/path/to/key.pem"),
        http_scheme="https",
        ...
    )
    
  • SQLAlchemy

    from sqlalchemy import create_engine
    from trino.auth import CertificateAuthentication
    
    engine = create_engine("trino://<username>@<host>:<port>/<catalog>/<schema>?cert=<cert>&key=<key>")
    
    # or as connect_args
    engine = create_engine(
        "trino://<username>@<host>:<port>/<catalog>",
        connect_args={
            "auth": CertificateAuthentication("/path/to/cert.pem", "/path/to/key.pem"),
            "http_scheme": "https",
        }
    )
    

Kerberos authentication

Make sure that Kerberos support is installed using pip install trino[kerberos]. The KerberosAuthentication class can be used to connect to a Trino cluster configured with the Kerberos authentication type:

  • DBAPI

    from trino.dbapi import connect
    from trino.auth import KerberosAuthentication
    
    conn = connect(
        user="<username>",
        auth=KerberosAuthentication(...),
        http_scheme="https",
        ...
    )
    
  • SQLAlchemy

    from sqlalchemy import create_engine
    from trino.auth import KerberosAuthentication
    
    engine = create_engine(
        "trino://<username>@<host>:<port>/<catalog>",
        connect_args={
            "auth": KerberosAuthentication(...),
            "http_scheme": "https",
        }
    )
    

GSSAPI authentication

Make sure that GSSAPI support is installed using pip install trino[gssapi]. The GSSAPIAuthentication class can be used to connect to a Trino cluster configured with the Kerberos authentication type.

It follows the same interface as KerberosAuthentication but uses requests-gssapi instead of requests-kerberos under the hood.

  • DBAPI

    from trino.dbapi import connect
    from trino.auth import GSSAPIAuthentication
    
    conn = connect(
        user="<username>",
        auth=GSSAPIAuthentication(...),
        http_scheme="https",
        ...
    )
    
  • SQLAlchemy

    from sqlalchemy import create_engine
    from trino.auth import GSSAPIAuthentication
    
    engine = create_engine(
        "trino://<username>@<host>:<port>/<catalog>",
        connect_args={
            "auth": GSSAPIAuthentication(...),
            "http_scheme": "https",
        }
    )
    

User impersonation

When the user who submits the query is not the same as the user who authenticates to the Trino server (e.g. in Superset), you can set username to be different from principal_id. Note that principal_id is extracted from auth, for example username in BasicAuthentication, sub in a JWT token, or service-name in KerberosAuthentication. Make sure that principal_id has permission to impersonate username.
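A minimal sketch of impersonation with basic authentication, assuming a service account that authenticates and an end user that the query is attributed to. The helper connect_as and its parameter names are hypothetical; the connect and BasicAuthentication calls use only the arguments shown elsewhere in this document. The trino imports sit inside the function so that defining it does not require the package.

```python
def connect_as(username, service_user, service_password, host, port=443):
    """Open a connection that authenticates as service_user but runs queries as username.

    Hypothetical helper for illustration; service_user must be allowed to
    impersonate username on the Trino server.
    """
    # Imported lazily so the helper can be defined without the trino package installed.
    from trino.auth import BasicAuthentication
    from trino.dbapi import connect

    return connect(
        host=host,
        port=port,
        user=username,  # the user the query is submitted as
        auth=BasicAuthentication(service_user, service_password),  # the authenticating principal
        http_scheme="https",
    )
```

Calling connect_as("alice", "superset-service", "<password>", "<host>") would then submit queries as alice while authenticating as superset-service; both names are placeholders.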

Extra credentials

Extra credentials can be sent as:

import trino
conn = trino.dbapi.connect(
    host='localhost',
    port=443,
    user='the-user',
    extra_credential=[('a.username', 'bar'), ('a.password', 'foo')],
)

cur = conn.cursor()
cur.execute('SELECT * FROM system.runtime.nodes')
rows = cur.fetchall()

Roles

Authorization roles to use for catalogs are specified as a dict with key-value pairs for the catalog and role. For example, {"catalog1": "roleA", "catalog2": "roleB"} sets roleA for catalog1 and roleB for catalog2. See the Trino docs.

import trino
conn = trino.dbapi.connect(
    host='localhost',
    port=443,
    user='the-user',
    roles={"catalog1": "roleA", "catalog2": "roleB"},
)

You can also pass a system role without explicitly specifying the "system" catalog:

import trino
conn = trino.dbapi.connect(
    host='localhost',
    port=443,
    user='the-user',
    roles="role1" # equivalent to {"system": "role1"}
)
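The equivalence noted in the comment above can be pictured as a small normalization step. The sketch below illustrates the described behaviour and is not the client's implementation; the helper name normalize_roles is hypothetical.

```python
def normalize_roles(roles):
    """Return roles as a catalog-to-role dict.

    A bare string is treated as the role for the "system" catalog,
    mirroring the equivalence described in the text.
    """
    if isinstance(roles, str):
        return {"system": roles}
    return dict(roles)


print(normalize_roles("role1"))
print(normalize_roles({"catalog1": "roleA", "catalog2": "roleB"}))
```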

Timezone

The time zone for the session can be set explicitly using an IANA time zone name. When not set, the time zone defaults to the client-side local time zone.

import trino
conn = trino.dbapi.connect(
    host='localhost',
    port=443,
    user='username',
    timezone='Europe/Brussels',
)

NOTE: Until version 0.320.0, the behaviour was the same as setting the session time zone to UTC. To preserve that behaviour, pass timezone='UTC' when creating the connection.

SSL

SSL verification

In order to disable SSL verification, set the verify parameter to False.

from trino.dbapi import connect
from trino.auth import BasicAuthentication

conn = connect(
    user="<username>",
    auth=BasicAuthentication("<username>", "<password>"),
    http_scheme="https",
    verify=False
)

Self-signed certificates

To use self-signed certificates, specify the path to the certificate in the verify parameter. More details can be found in the Python requests library documentation.

from trino.dbapi import connect
from trino.auth import BasicAuthentication

conn = connect(
    user="<username>",
    auth=BasicAuthentication("<username>", "<password>"),
    http_scheme="https",
    verify="/path/to/cert.crt"
)

Transactions

By default, the client runs in autocommit mode. To enable transactions, set isolation_level to a value other than IsolationLevel.AUTOCOMMIT:

from trino.dbapi import connect
from trino.transaction import IsolationLevel

with connect(
        isolation_level=IsolationLevel.REPEATABLE_READ,
        ...
) as conn:
    cur = conn.cursor()
    cur.execute('INSERT INTO sometable VALUES (1, 2, 3)')
    cur.fetchall()
    cur.execute('INSERT INTO sometable VALUES (4, 5, 6)')
    cur.fetchall()

The transaction is created when the first SQL statement is executed. When the code exits the with context, trino.dbapi.Connection.commit() is called automatically if the queries succeeded; otherwise, trino.dbapi.Connection.rollback() is called.
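The commit-or-rollback behaviour of the with block can be written out explicitly. The sketch below uses the standard library's sqlite3 module as a stand-in so that it runs without a Trino server; the helper run_in_transaction is hypothetical and relies only on the DBAPI 2.0 commit() and rollback() methods.

```python
import sqlite3


def run_in_transaction(conn, statements):
    """Execute statements, then commit; roll back and re-raise if any of them fails."""
    cur = conn.cursor()
    try:
        for statement in statements:
            cur.execute(statement)
    except Exception:
        conn.rollback()
        raise
    else:
        conn.commit()


# sqlite3 is only a stand-in for a transactional DBAPI connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (a INTEGER, b INTEGER, c INTEGER)")
conn.commit()

run_in_transaction(conn, ["INSERT INTO sometable VALUES (1, 2, 3)"])

try:
    run_in_transaction(conn, [
        "INSERT INTO sometable VALUES (4, 5, 6)",
        "INSERT INTO missing_table VALUES (7, 8, 9)",  # fails, so (4, 5, 6) is rolled back
    ])
except sqlite3.OperationalError:
    pass

row_count = conn.execute("SELECT count(*) FROM sometable").fetchone()[0]
print(row_count)
```

Only the first insert survives, because the failed batch is rolled back as a unit.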

Legacy Primitive types

By default, the client converts query results to the corresponding Python types. For example, if the query returns a DECIMAL column, the result is a Decimal object. To disable this behaviour, set the legacy_primitive_types flag to True.

Limitations of the Python types are described in the Python types documentation. If the query returns a value that cannot be converted to the corresponding Python type, the client raises a trino.exceptions.TrinoDataError exception.

import trino

conn = trino.dbapi.connect(
    legacy_primitive_types=True,
    ...
)

cur = conn.cursor()
# Negative DATE cannot be represented with Python types
# legacy_primitive_types needs to be enabled
cur.execute("SELECT DATE '-2001-08-22'")
rows = cur.fetchall()

assert rows[0][0] == "-2001-08-22"
assert cur.description[0][1] == "date"
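The reason a negative DATE needs legacy_primitive_types can be seen from the standard library alone: datetime.date cannot represent years before 1. The sketch below checks whether an ISO date string fits into datetime.date; the helper fits_python_date is hypothetical and independent of the trino package.

```python
import datetime


def fits_python_date(text):
    """Return True if text parses as a datetime.date in ISO format."""
    try:
        datetime.date.fromisoformat(text)
    except ValueError:
        return False
    return True


# The smallest representable date is in year 1.
print(datetime.date.min)
print(fits_python_date("2001-08-22"))
print(fits_python_date("-2001-08-22"))
```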

Trino to Python type mappings

Trino type    Python type
BOOLEAN       bool
TINYINT       int
SMALLINT      int
INTEGER       int
BIGINT        int
REAL          float
DOUBLE        float
DECIMAL       decimal.Decimal
VARCHAR       str
CHAR          str
VARBINARY     bytes
DATE          datetime.date
TIME          datetime.time
TIMESTAMP     datetime.datetime
ARRAY         list
MAP           dict
ROW           tuple

Trino types other than those listed above are not mapped to Python types. To work with them, enable legacy primitive types.

Need help?

Feel free to create an issue as it makes your request visible to other users and contributors.

If an interactive discussion would be better, or if you just want to hang out and chat about the Trino Python client, you can join us on the #python-client channel on Trino Slack.
