DataJunction client library for connecting to a DataJunction server
DataJunction Python Client
Installation
To install:
pip install datajunction
Examples
To initialize the client:
from datajunction import DJClient
dj = DJClient("http://dj-endpoint:8000")
Catalogs and Engines
To list available catalogs for the DJ host:
dj.catalogs()
To list available engines for the DJ host:
dj.engines()
To create a catalog:
from datajunction import Catalog
catalog = Catalog(
    name="prod",
)
catalog.publish()
To create an engine:
from datajunction import Engine
engine = Engine(
    name="spark",
    version="3.2.2",
    uri="...",
)
engine.publish()
To attach an engine to a catalog:
catalog.add_engine(engine)
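The three steps above (publish the catalog, publish the engine, attach the engine) can be wrapped in one helper when a catalog needs several engines. This is only a sketch: `publish_catalog_with_engines` is a hypothetical name, not part of the client, and it assumes nothing beyond the `publish()` and `add_engine()` calls shown above.

```python
def publish_catalog_with_engines(catalog, engines):
    """Publish a catalog, then publish each engine and attach it.

    `catalog` and `engines` are assumed to behave like the `Catalog` and
    `Engine` objects shown above. Returns the engines that were attached.
    """
    catalog.publish()
    attached = []
    for engine in engines:
        # Same order as the manual steps: publish first, then attach.
        engine.publish()
        catalog.add_engine(engine)
        attached.append(engine)
    return attached
```

With the objects created earlier, the call would be `publish_catalog_with_engines(catalog, [engine])`.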
Nodes
All nodes for a given namespace can be found with:
dj.namespace("default").nodes()
Specific node types can be retrieved with:
dj.namespace("default").sources()
dj.namespace("default").dimensions()
dj.namespace("default").metrics()
dj.namespace("default").transforms()
dj.namespace("default").cubes()
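To get an overview of a namespace in one call, the per-type methods above can be gathered into a dictionary. This is a minimal sketch: `nodes_by_type` and `NODE_TYPES` are hypothetical names introduced here, and the helper makes no assumption about what each method returns; it stores the result as-is.

```python
# Method names taken from the per-type calls listed above.
NODE_TYPES = ("sources", "dimensions", "metrics", "transforms", "cubes")


def nodes_by_type(dj, namespace="default"):
    """Return {node type: result of the matching namespace method}.

    `dj` is assumed to be an initialized DJClient (or any object with the
    same `namespace(...)` interface).
    """
    ns = dj.namespace(namespace)
    return {node_type: getattr(ns, node_type)() for node_type in NODE_TYPES}
```

For example, `nodes_by_type(dj)["metrics"]` would hold whatever `dj.namespace("default").metrics()` returns.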
To create a source node:
from datajunction import NodeMode
repair_orders = dj.new_source(
    name="repair_orders",
    display_name="Repair Orders",
    description="Repair orders",
    catalog="dj",
    schema_="roads",
    table="repair_orders",
)
repair_orders.save(mode=NodeMode.PUBLISHED)
To save a node as a draft instead of publishing it:
repair_orders.save(mode=NodeMode.DRAFT)
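When the same script runs in both development and production, the choice between the two modes can be driven by a flag. The helper below is a sketch under that assumption: `save_node` is a hypothetical name, and the mode enum is passed in as a parameter (in real use, the `NodeMode` imported above) so that the helper relies only on the `DRAFT` and `PUBLISHED` members and the `save(mode=...)` call shown in this section.

```python
def save_node(node, node_mode, publish=False):
    """Save `node` as a draft by default, or as published when asked.

    `node_mode` is assumed to expose DRAFT and PUBLISHED members, like the
    NodeMode enum used above. Returns the mode that was used.
    """
    mode = node_mode.PUBLISHED if publish else node_mode.DRAFT
    node.save(mode=mode)
    return mode
```

With the source node from above, `save_node(repair_orders, NodeMode)` would save a draft and `save_node(repair_orders, NodeMode, publish=True)` would publish it.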
To create a dimension node:
repair_order = dj.new_dimension(
    name="repair_order",
    query="""
        SELECT
            repair_order_id,
            municipality_id,
            hard_hat_id,
            dispatcher_id
        FROM repair_orders
    """,
    description="Repair order dimension",
    primary_key=["repair_order_id"],
)
repair_order.save()
To create a transform node:
large_revenue_payments_only = dj.new_transform(
    name="large_revenue_payments_only",
    query="""
        SELECT
            payment_id,
            payment_amount,
            customer_id,
            account_type
        FROM revenue
        WHERE payment_amount > 1000000
    """,
    description="Only large revenue payments",
)
large_revenue_payments_only.save()
To create a metric node:
num_repair_orders = dj.new_metric(
    name="num_repair_orders",
    query="""
        SELECT
            count(repair_order_id)
        FROM repair_orders
    """,
    description="Number of repair orders",
)
num_repair_orders.save()
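When several metrics follow the same pattern, they can be defined as plain data and created in a loop. This is a sketch, not part of the client: `create_metrics` and the dictionary layout of each spec are hypothetical, and the helper uses only the `new_metric(name=..., query=..., description=...)` and `save()` calls shown above.

```python
def create_metrics(dj, specs):
    """Create and save one metric node per spec.

    `dj` is assumed to be an initialized DJClient. Each spec is a dict with
    "name", "query" and "description" keys (a layout chosen for this sketch).
    Returns the created node objects in the same order as `specs`.
    """
    created = []
    for spec in specs:
        metric = dj.new_metric(
            name=spec["name"],
            query=spec["query"],
            description=spec["description"],
        )
        metric.save()
        created.append(metric)
    return created
```

The metric above would then be expressed as a single spec: `{"name": "num_repair_orders", "query": "SELECT count(repair_order_id) FROM repair_orders", "description": "Number of repair orders"}`.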