Just an Engine Template Tool

Jett

Just an Engine Template Tool that is easy to use and develop with, built for data engineers. This project provides an ETL template for multiple DataFrame engines such as PySpark, DuckDB, and Polars.

Supported Features:

  • Transform engines selected dynamically via configuration
  • JSON Schema validation in any IDE
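
To make the first feature concrete, here is a minimal Python sketch of how an engine could be picked from a configuration value. It is only an illustration under stated assumptions: the names `ENGINES`, `register`, and `run` are hypothetical and are not Jett's actual internals, and the engine functions just return a label instead of processing data.

```python
from typing import Callable, Dict

# Hypothetical registry: maps the config `type` value to an engine runner.
ENGINES: Dict[str, Callable[[dict], str]] = {}


def register(name: str):
    """Decorator that stores an engine runner under the given name."""
    def wrap(func):
        ENGINES[name] = func
        return func
    return wrap


@register("polars")
def run_polars(config: dict) -> str:
    # Placeholder: a real runner would build and execute a Polars pipeline.
    return f"polars:{config['name']}"


@register("duckdb")
def run_duckdb(config: dict) -> str:
    # Placeholder: a real runner would build and execute a DuckDB pipeline.
    return f"duckdb:{config['name']}"


def run(config: dict) -> str:
    """Pick the engine from config['type'] and run it."""
    engine = config.get("type")
    if engine not in ENGINES:
        raise ValueError(f"unsupported engine type: {engine!r}")
    return ENGINES[engine](config)
```

With this layout, switching engines only requires changing the `type` value in the configuration, not the calling code.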

📦 Installation

uv pip install -U jett

Version Tracking:

Package | Version    | Next Support
Python  | 3.10.13    | >=3.11.0
Spark   | 3.4.2      | >=4.0.0
Hadoop  | 3          | 3
Java    | openjdk@11 | openjdk@17
PySpark | 3.4.1      | >=4.0.0
Scala   | 2.12.17    | 2.12.17
DuckDB  | 1.3.2      |
Polars  | 1.32.0     |
Arrow   | 21.0.0     |
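
To compare a local environment with the versions listed above, a small standard-library check can help. This is a sketch, not part of Jett: the helper names are hypothetical, and it assumes the PyPI distribution names `pyspark`, `duckdb`, `polars`, and `pyarrow` (for Arrow).

```python
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '1.32.0' into (1, 32, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(package: str):
    """Return the installed version as a tuple, or None if the package is missing."""
    try:
        return parse_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


# Versions copied from the "Version" column; distribution names are assumptions.
TESTED = {
    "pyspark": "3.4.1",
    "duckdb": "1.3.2",
    "polars": "1.32.0",
    "pyarrow": "21.0.0",
}


def compare(tested: dict) -> dict:
    """Map each package to 'missing', 'match', or 'differs'."""
    result = {}
    for name, wanted in tested.items():
        found = installed_version(name)
        if found is None:
            result[name] = "missing"
        elif found == parse_version(wanted):
            result[name] = "match"
        else:
            result[name] = "differs"
    return result
```

Calling `compare(TESTED)` returns one status per package; a "differs" status does not necessarily mean the version is unsupported, only that it is not the one listed.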

Engine Supported:

Name    | Status | Description
PySpark |        | PySpark and the spark-submit CLI for distributed workloads
DuckDB  |        | DuckDB and the DuckDB Spark API
Polars  |        | Polars for Python workloads
Arrow   |        | Arrow for columnar Python workflows
Daft    |        |
DBT     |        | DBT for SQL workloads

📝 Usage

For example, create a file named etl.polars.tool that describes the ETL steps (I use .tool as the file extension so that the file can be validated against the JSON schema with the pattern *.tool):

type: polars
name: Load CSV to GGSheet
app_name: load_csv_to_ggsheet
master: local

# 1) 🚰 Load data from source
source:
  type: local
  file_format: csv
  path: ./assets/data/customer.csv

# 2) ⚙️ Transform this data.
transforms:
  - op: rename_to_snakecase
  - op: group
    transforms:
      - op: expr
        sql: "CAST(id AS string)"

# 3) 🎯 Sink result to target
sink:
  type: local
  file_type: google_sheet
  path: ./assets/landing/customer.gsheet

# 4) 📩 Metric that will send after execution.
metric:
  - type: console
    convertor: basic
  - type: restapi
    convertor: basic
    host: "localhost"
    port: 1234
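
The `transforms` section nests: a `group` step carries its own `transforms` list. The sketch below shows one way to walk such a structure in Python. It assumes the file is parsed into plain dicts and lists, and the `walk_ops` helper is hypothetical; it is not how Jett itself processes the file.

```python
# The `transforms` section of the example above, written as plain Python data.
transforms = [
    {"op": "rename_to_snakecase"},
    {
        "op": "group",
        "transforms": [
            {"op": "expr", "sql": "CAST(id AS string)"},
        ],
    },
]


def walk_ops(steps, depth=0):
    """Yield (depth, op) for every step, descending into nested `transforms`."""
    for step in steps:
        yield depth, step["op"]
        # A step without nested transforms contributes nothing further.
        yield from walk_ops(step.get("transforms", []), depth + 1)


# Print the operations indented by nesting level.
for depth, op in walk_ops(transforms):
    print("  " * depth + op)
```

A walk like this is a convenient way to list or check every operation name before a run.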

Use it through the Python API:

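from jett import Tool

# Load the tool definition from the file created above.
tool = Tool(path="./etl.polars.tool")
# Run the ETL; allow_raise=True is kept from the original example and, as its name suggests, lets errors propagate.
tool.execute(allow_raise=True)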
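
When a folder holds several `*.tool` files, a small batch helper can run them one by one. The sketch below is an assumption-laden example rather than a Jett feature: `run_all` is a hypothetical helper, and the actual execution is passed in as a `runner` callable so the helper itself does not depend on Jett.

```python
from pathlib import Path
from typing import Callable, Dict


def run_all(folder: str, runner: Callable[[Path], None]) -> Dict[str, str]:
    """Run `runner` on every *.tool file in `folder` and record the outcome."""
    results = {}
    for path in sorted(Path(folder).glob("*.tool")):
        try:
            runner(path)
            results[path.name] = "ok"
        except Exception as exc:
            # Keep going so one failing file does not stop the batch.
            results[path.name] = f"failed: {exc}"
    return results
```

With Jett installed, the runner could be `lambda p: Tool(path=str(p)).execute(allow_raise=True)`, which reuses only the calls from the snippet above.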

📖 Documents

This project uses the emoji defined in the Pipeline Emojis reference.

💬 Contribute

I do not expect this project to be widely adopted because it serves a specific purpose, and as a long-term solution you can write your own code without depending on this project. For now, if you want a bug fixed or a new feature added, you can open a GitHub issue on this project 🙌.

Download files

Source Distribution

jett-0.0.1.tar.gz (771.6 kB)

Uploaded Source

Built Distribution

jett-0.0.1-py3-none-any.whl (116.3 kB)

Uploaded Python 3

File details

Details for the file jett-0.0.1.tar.gz.

File metadata

  • Download URL: jett-0.0.1.tar.gz
  • Upload date:
  • Size: 771.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for jett-0.0.1.tar.gz
Algorithm   | Hash digest
SHA256      | b01c8bfbbbe5996c9cf04994d7ad69d1d711a3ae7e98cf97cc3bedf1a3c13ef2
MD5         | b6b0a73f7624c9893161957332499f42
BLAKE2b-256 | 45665401bd90b230bcbae934ac18733a115317ec33f91ce6c5500230b1afddfd
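
A downloaded file can be checked against the SHA256 digest listed above with the Python standard library. The helper names `sha256_of` and `matches` below are hypothetical; this is a minimal sketch, not an official verification procedure.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex digest, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches("jett-0.0.1.tar.gz", "b01c8bfbbbe5996c9cf04994d7ad69d1d711a3ae7e98cf97cc3bedf1a3c13ef2")`, assuming the file sits in the current directory.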

Provenance

The following attestation bundles were made for jett-0.0.1.tar.gz:

Publisher: publish.yml on ddeutils/jett

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file jett-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: jett-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 116.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for jett-0.0.1-py3-none-any.whl
Algorithm   | Hash digest
SHA256      | 450c8f17f71358d1a790189cc3488ec6b642a76e0031427a0d109fefff521618
MD5         | 6413864482cbb70626638e788ce0ff32
BLAKE2b-256 | 18608e05797f87211c8df2664aa14a0b2b0a4a39c15e8fa00fcb793ba7ae81d6

Provenance

The following attestation bundles were made for jett-0.0.1-py3-none-any.whl:

Publisher: publish.yml on ddeutils/jett
