Apache Livy Client
livyc
Install library
pip install livyc
Import library
from livyc import livyc
Setting the Livy configuration
data_livy = {
    "livy_server_url": "localhost",
    "port": "8998",
    "jars": ["org.postgresql:postgresql:42.3.1"]
}
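The two connection settings above presumably combine into the base URL of the Livy REST endpoint. A minimal sketch, assuming plain HTTP (this README does not state the scheme) and a hypothetical helper named `livy_base_url` that is not part of livyc:

```python
def livy_base_url(config: dict, scheme: str = "http") -> str:
    """Build the Livy REST base URL from a livyc-style config dict.

    The "http" default is an assumption; this helper is illustrative
    only and is not provided by livyc.
    """
    return f"{scheme}://{config['livy_server_url']}:{config['port']}"


data_livy = {
    "livy_server_url": "localhost",
    "port": "8998",
    "jars": ["org.postgresql:postgresql:42.3.1"],
}

base_url = livy_base_url(data_livy)
print(base_url)  # http://localhost:8998
```

Requesting `base_url + "/sessions"` (a standard Livy REST endpoint) with a browser or curl is a quick way to confirm the server is reachable before creating a session.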
Let's try launching a PySpark script on the Apache Livy server
params = {
    "host": "localhost",
    "port": "5432",
    "database": "db",
    "table": "staging",
    "user": "postgres",
    "password": "pg12345"
}
pyspark_script = """
from pyspark.sql.functions import udf, col, explode
from pyspark.sql.types import StructType, StructField, IntegerType, StringType, ArrayType
from pyspark.sql import Row
from pyspark.sql import SparkSession
df = spark.read.format("jdbc") \
    .option("url", "jdbc:postgresql://{host}:{port}/{database}") \
    .option("driver", "org.postgresql.Driver") \
    .option("dbtable", "{table}") \
    .option("user", "{user}") \
    .option("password", "{password}") \
    .load()
n_rows = df.count()
spark.stop()
"""
Creating a livyc object
lvy = livyc.LivyC(data_livy)
Creating a new session on the Apache Livy server
session = lvy.create_session()
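For orientation, Livy's REST API creates a session from a JSON body sent to `POST /sessions`. The sketch below shows what such a body could look like for the configuration above. Whether livyc builds exactly this request is an assumption, as is passing the Maven coordinates through the Spark property `spark.jars.packages`; the helper name `build_session_payload` is hypothetical.

```python
import json


def build_session_payload(jars: list) -> dict:
    """Return a JSON-serializable body for Livy's POST /sessions.

    Assumption: the "jars" entries are Maven coordinates, so they are
    joined into the comma-separated spark.jars.packages property.
    """
    return {
        "kind": "pyspark",  # interactive PySpark session
        "conf": {"spark.jars.packages": ",".join(jars)},
    }


payload = build_session_payload(["org.postgresql:postgresql:42.3.1"])
print(json.dumps(payload, indent=2))
```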
Sending and executing the script on the Apache Livy server
lvy.run_script(session, pyspark_script.format(**params))
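The call above fills the `{host}`, `{port}` and other placeholders with `str.format` before the script is sent. One consequence is that any literal curly brace in the script has to be doubled, otherwise `format` treats it as a placeholder. A small sketch of that behavior; the `settings` line is a made-up example and not part of the script above:

```python
# Template with one placeholder ({table}) and one literal dict,
# whose braces are escaped by doubling them.
template = """
df = spark.read.format("jdbc").option("dbtable", "{table}").load()
settings = {{"mode": "append"}}
"""

rendered = template.format(table="staging")
print(rendered)
```

After formatting, the doubled braces collapse to single ones, so the rendered script contains an ordinary Python dict literal.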
Accessing the variable "n_rows" available in the session
lvy.read_variable(session, "n_rows")
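How livyc retrieves the value is not shown here. For orientation, Livy's REST API reports the result of a statement as a JSON object with an `output` field, and the sketch below pulls a Python literal out of such an object. The helper name `extract_value` and the sample response (including the value 1500) are made up for illustration.

```python
import ast


def extract_value(statement: dict):
    """Parse the text/plain result of a finished Livy statement.

    Raises RuntimeError if the statement has not finished or failed.
    """
    output = statement.get("output") or {}
    if statement.get("state") != "available" or output.get("status") != "ok":
        reason = output.get("evalue", statement.get("state"))
        raise RuntimeError(f"statement not successful: {reason}")
    # literal_eval only accepts Python literals, so it is safe on untrusted text.
    return ast.literal_eval(output["data"]["text/plain"])


# Hypothetical response, shaped like a Livy statement object.
sample = {
    "id": 1,
    "state": "available",
    "output": {
        "status": "ok",
        "execution_count": 1,
        "data": {"text/plain": "1500"},
    },
}

n_rows = extract_value(sample)
print(n_rows)  # 1500
```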
Contributing and Feedback
Any ideas or feedback about this repository? Help me improve it.
Authors
- Created by Ramses Alexander Coraspe Valdez
- Created in 2022
License
This project is licensed under the terms of the MIT License.
Download files
Source Distribution
livyc-0.0.14.tar.gz (19.2 kB)
Built Distribution
livyc-0.0.14-py3-none-any.whl (5.7 kB)