PyStarburst DataFrame API
PyStarburst DataFrame API allows you to query and transform data in Starburst products in a data pipeline without having to download the data locally.
Getting started
Install pystarburst
pip install pystarburst
Connect to a Trino server
The connection parameters are the same as those accepted by the Trino Python client.
from pystarburst import Session
connection_parameters = {
"host": "localhost",
"port": 8080,
"user": "admin",
"catalog": "tpch",
"schema": "tiny"
}
session = Session.builder.configs(connection_parameters).create()
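In a pipeline, connection settings often come from the environment rather than from literals in the script. The sketch below is one way to assemble the same dictionary. The TRINO_* variable names and the helpers connection_parameters_from_env and create_session are hypothetical and not part of PyStarburst; the defaults mirror the values shown above.

```python
import os


def connection_parameters_from_env(env=None):
    """Build the connection dictionary from environment variables.

    The TRINO_* variable names are made up for this sketch; the defaults
    match the literal example above.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("TRINO_HOST", "localhost"),
        "port": int(env.get("TRINO_PORT", "8080")),  # the port must be an integer
        "user": env.get("TRINO_USER", "admin"),
        "catalog": env.get("TRINO_CATALOG", "tpch"),
        "schema": env.get("TRINO_SCHEMA", "tiny"),
    }


def create_session(env=None):
    # Imported lazily so the helper above can be used without pystarburst installed.
    from pystarburst import Session

    return Session.builder.configs(connection_parameters_from_env(env)).create()
```

With no variables set, connection_parameters_from_env() returns the same dictionary as the literal example, so create_session() would connect to localhost:8080.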
Using SQL
from pystarburst import Session
session = Session.builder.configs({ ... }).create()
session.sql("SELECT 1 as a").show()
Querying a table
from pystarburst import Session
session = Session.builder.configs({ ... }).create()
df = session.table("nation")
print(df.schema)
df.show()
Filtering a data frame
from pystarburst import Session
session = Session.builder.configs({ ... }).create()
df = session.table("nation")
df.filter(df.col("regionkey") == 0).show()
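When the same condition is reused with different values, the filter can be wrapped in a small function. This sketch uses only the calls shown above (session.table, df.col, df.filter); the function name nations_in_region is hypothetical.

```python
def nations_in_region(session, regionkey):
    """Return the rows of the nation table whose regionkey equals the given value.

    `session` is assumed to be a PyStarburst Session created as shown above.
    """
    df = session.table("nation")
    # Return the filtered data frame; nothing is printed until show() is called on it.
    return df.filter(df.col("regionkey") == regionkey)
```

For example, nations_in_region(session, 0).show() would print the same rows as the snippet above.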
Joining data frames
from pystarburst import Session
session = Session.builder.configs({ ... }).create()
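nation = session.table("nation")
region = session.table("region")
# Join each nation to its region on the shared regionkey column
nation.join(region, nation.col("regionkey") == region.col("regionkey")).show()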
Aggregation
from pystarburst import Session
from pystarburst.functions import col
session = Session.builder.configs({ ... }).create()
df = session.table("nation")
df.agg((col("regionkey"), "max"), (col("regionkey"), "avg")).show()
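The (column, function) pairs passed to agg can also be built programmatically, for example to apply several aggregate functions to one column. This is a minimal sketch, assuming that df.col returns a column that agg accepts in the same way as the col function imported above; the helper name aggregate_column is hypothetical.

```python
def aggregate_column(df, column, functions=("max", "avg")):
    """Apply each named aggregate function to one column of `df`."""
    # Build one (column, function name) pair per aggregate, as in the call above.
    pairs = [(df.col(column), name) for name in functions]
    return df.agg(*pairs)
```

Under that assumption, aggregate_column(session.table("nation"), "regionkey").show() would be equivalent to the call above.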
Development setup
Poetry is used for dependency management. Install the dependencies with poetry install.
pre-commit is used for code quality checks. Install the hooks with pre-commit install.
The tests either connect to a server that is already running on port 8080 or start a new server from your checked-out starburst-dataframe project in the parent folder ../starburst-dataframe.
Hashes for pystarburst-0.6.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | fb14c04d2cbf8bcbf56f2cf7903405654eb8ebbe076bf8820299c4340999472e
MD5 | 733028450f2705489a74b717b4e64991
BLAKE2b-256 | f56c7839acbee50c8aa542959106cef22dcc074c53761e38ab7530cc62e13565