DataJunction client library for connecting to a DataJunction server
DataJunction Python Client
Installation
To install:
pip install datajunction
Examples
To initialize the client:
from datajunction import DJClient
dj = DJClient("http://dj-endpoint:8000")
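The server URL usually differs between environments. As a minimal sketch, the helper below reads it from a hypothetical `DJ_URL` environment variable (an assumption for illustration, not something the client is documented to read) and falls back to the example endpoint above:

```python
import os

# Example endpoint taken from this README; replace with your own server.
DEFAULT_DJ_URL = "http://dj-endpoint:8000"


def resolve_dj_url(env=None, default=DEFAULT_DJ_URL):
    """Return the URL from the hypothetical DJ_URL variable, or the default."""
    env = os.environ if env is None else env
    url = env.get("DJ_URL", "").strip()
    return url or default


# Usage (needs the datajunction package and a reachable server):
# from datajunction import DJClient
# dj = DJClient(resolve_dj_url())
```

Passing the environment mapping as an argument keeps the helper easy to test without touching the real process environment.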
Catalogs and Engines
To list available catalogs for the DJ host:
dj.catalogs()
To list available engines for the DJ host:
dj.engines()
To create a catalog:
from datajunction import Catalog
catalog = Catalog(
    name="prod"
)
catalog.publish()
To create an engine:
from datajunction import Engine
engine = Engine(
    name="spark",
    version="3.2.2",
    uri="..."
)
engine.publish()
To attach an engine to a catalog:
catalog.add_engine(engine)
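The three steps above (publish the catalog, publish the engine, attach the engine) can be wrapped in one helper. This is a sketch that uses only the `publish()` and `add_engine()` calls shown in this README; the function name `register_engine` is hypothetical, and it assumes neither object has been published yet:

```python
def register_engine(catalog, engine):
    """Publish a catalog and an engine, then attach the engine to the catalog.

    Both arguments are expected to behave like the Catalog and Engine
    objects shown above. Behavior when either one already exists on the
    server is not handled here.
    """
    catalog.publish()
    engine.publish()
    catalog.add_engine(engine)
    return catalog


# Usage (needs the datajunction package and a reachable server):
# from datajunction import Catalog, Engine
# register_engine(Catalog(name="prod"), Engine(name="spark", version="3.2.2", uri="..."))
```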
Nodes
To list all nodes in a namespace:
dj.namespace("default").nodes()
To list only nodes of a specific type:
dj.namespace("default").sources()
dj.namespace("default").dimensions()
dj.namespace("default").metrics()
dj.namespace("default").transforms()
dj.namespace("default").cubes()
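To get an overview of a namespace in one call, the per-type listings above can be collected into a dictionary. This is a sketch only: `nodes_by_type` is a hypothetical helper, and it assumes each listing method returns an iterable:

```python
# Names of the per-type listing methods shown in this README.
NODE_TYPES = ("sources", "dimensions", "metrics", "transforms", "cubes")


def nodes_by_type(dj, namespace="default"):
    """Return {node type: list of nodes} for one namespace.

    `dj` is expected to behave like the DJClient shown above.
    """
    ns = dj.namespace(namespace)
    # Look up each listing method by name and call it once.
    return {node_type: list(getattr(ns, node_type)()) for node_type in NODE_TYPES}


# Usage (needs the datajunction package and a reachable server):
# for node_type, nodes in nodes_by_type(dj).items():
#     print(node_type, len(nodes))
```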
To create a source node:
from datajunction import NodeMode
repair_orders = dj.new_source(
    name="repair_orders",
    display_name="Repair Orders",
    description="Repair orders",
    catalog="dj",
    schema_="roads",
    table="repair_orders",
)
repair_orders.save(mode=NodeMode.PUBLISHED)
Alternatively, to save a node as a draft instead of publishing it:
repair_orders.save(mode=NodeMode.DRAFT)
To create a dimension node:
repair_order = dj.new_dimension(
    name="repair_order",
    query="""
        SELECT
            repair_order_id,
            municipality_id,
            hard_hat_id,
            dispatcher_id
        FROM repair_orders
    """,
    description="Repair order dimension",
    primary_key=["repair_order_id"],
)
repair_order.save()
To create a transform node:
large_revenue_payments_only = dj.new_transform(
    name="large_revenue_payments_only",
    query="""
        SELECT
            payment_id,
            payment_amount,
            customer_id,
            account_type
        FROM revenue
        WHERE payment_amount > 1000000
    """,
    description="Only large revenue payments",
)
large_revenue_payments_only.save()
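The threshold in the transform query above is hard-coded. If it needs to vary, one option is to build the query string in a small function. This is a sketch; `large_payments_query` is a hypothetical helper, and it only accepts integers so that nothing unexpected is interpolated into the SQL:

```python
def large_payments_query(min_amount):
    """Build the transform query above with a configurable threshold."""
    # Reject anything that is not a plain int (bool is a subclass of int).
    if isinstance(min_amount, bool) or not isinstance(min_amount, int):
        raise TypeError("min_amount must be an int")
    return f"""
SELECT
    payment_id,
    payment_amount,
    customer_id,
    account_type
FROM revenue
WHERE payment_amount > {min_amount}
"""


# Usage (needs the datajunction package and a reachable server):
# dj.new_transform(
#     name="large_revenue_payments_only",
#     query=large_payments_query(1000000),
#     description="Only large revenue payments",
# ).save()
```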
To create a metric node:
num_repair_orders = dj.new_metric(
    name="num_repair_orders",
    query="""
        SELECT
            count(repair_order_id)
        FROM repair_orders
    """,
    description="Number of repair orders",
)
num_repair_orders.save()
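Count metrics like the one above tend to follow the same pattern, so they can be defined through a small wrapper. This is a sketch built only on the `new_metric(...)` and `save()` calls shown in this README; `define_count_metric` is a hypothetical helper, and it assumes `column` and `table` are trusted identifiers because they are interpolated into the query unchecked:

```python
def define_count_metric(dj, name, column, table, description):
    """Create and save a metric that counts `column` in `table`.

    `dj` is expected to behave like the DJClient shown above.
    `column` and `table` must be trusted identifiers: they are placed
    into the SQL text without validation.
    """
    query = f"SELECT count({column}) FROM {table}"
    metric = dj.new_metric(name=name, query=query, description=description)
    metric.save()
    return metric


# Usage (needs the datajunction package and a reachable server):
# define_count_metric(
#     dj,
#     name="num_repair_orders",
#     column="repair_order_id",
#     table="repair_orders",
#     description="Number of repair orders",
# )
```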