
Run transforms quickly on a local machine and analyse the results using DuckDB and PySpark


Transforms package

A package for transforming data using a transform syntax similar to Palantir Foundry's. It lets you write and test transforms locally before running them on Foundry. Transforms can also run faster locally by using DuckDB as the underlying data engine.

Getting Started

Installation

Install the package using pip:

pip install foundry-duck-transforms

Basic Usage

  1. Create a transform file (e.g. my_transform.py)
  2. Run it using the transforms CLI:
python -m transforms.run my_transform.py dev,master

The command above runs the transform, downloading input data from the dev branch and falling back to master if a dataset has no data on dev.
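The fallback order can be pictured with a small sketch. This is not the package's actual implementation: the helper names `parse_fallback_branches` and `resolve_branch`, and the `has_data` callback, are hypothetical and only illustrate the "first branch with data wins" behaviour described above.

```python
from typing import Callable, List, Optional, Sequence


def parse_fallback_branches(spec: str) -> List[str]:
    """Split a comma-separated spec such as "dev,master" into an ordered list."""
    return [name.strip() for name in spec.split(",") if name.strip()]


def resolve_branch(
    branches: Sequence[str], has_data: Callable[[str], bool]
) -> Optional[str]:
    """Return the first branch for which has_data(branch) is true, else None."""
    for branch in branches:
        if has_data(branch):
            return branch
    return None


# Hypothetical dataset state: nothing was built on dev, but master has data.
branches_with_data = {"master"}

branches = parse_fallback_branches("dev,master")
chosen = resolve_branch(branches, lambda name: name in branches_with_data)
print(branches, chosen)
```

Under this assumption the order of the list matters: `dev,master` prefers dev data when it exists, while `master,dev` would prefer master.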

CLI Options

The transforms runner supports several options to customize execution:

python -m transforms.run [OPTIONS] TRANSFORM_TO_RUN FALLBACK_BRANCHES

Available options:

  • --engine [spark|duckdb|spark-sail]: Engine to use for the transformation (default: spark)
  • --omit-checks: Skip running checks
  • --sail-server-url TEXT: Sail server url (required when using spark-sail engine)
  • --dry-run: Dry run the transformation without writing results
  • --local-dev-branch-name TEXT: Branch name for local development (default: "duck-fndry-dev")

Example with options:

python -m transforms.run my_transform.py dev,master --engine duckdb --dry-run
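If you launch the runner from a script or a test harness, it may help to assemble the command programmatically. The sketch below uses only the flags listed above; the helper `build_run_command` is a hypothetical name and is not part of the package. It also checks the documented requirement that `--sail-server-url` accompanies the spark-sail engine.

```python
import sys
from typing import List, Optional, Sequence


def build_run_command(
    transform_path: str,
    fallback_branches: Sequence[str],
    engine: Optional[str] = None,
    dry_run: bool = False,
    omit_checks: bool = False,
    sail_server_url: Optional[str] = None,
    local_dev_branch_name: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `python -m transforms.run` (hypothetical helper)."""
    if engine == "spark-sail" and not sail_server_url:
        raise ValueError("--sail-server-url is required with the spark-sail engine")

    # Positional arguments first, in the order shown in the usage line.
    cmd = [
        sys.executable, "-m", "transforms.run",
        transform_path,
        ",".join(fallback_branches),
    ]
    # Optional flags are added only when explicitly requested.
    if engine is not None:
        cmd += ["--engine", engine]
    if sail_server_url is not None:
        cmd += ["--sail-server-url", sail_server_url]
    if local_dev_branch_name is not None:
        cmd += ["--local-dev-branch-name", local_dev_branch_name]
    if omit_checks:
        cmd.append("--omit-checks")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


command = build_run_command(
    "my_transform.py", ["dev", "master"], engine="duckdb", dry_run=True
)
print(" ".join(command[1:]))
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting issues with file paths.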

Development Setup

Prerequisites

  • Python 3.7+
  • pip
  • Access to Palantir Foundry environment

Local Development

  1. Clone the repository
  2. Install development dependencies:
pip install -e ".[dev]"

Foundry Dev Tools Configuration

See the Foundry Dev Tools documentation for detailed configuration instructions.

VSCode Setup

Add this to your .vscode/launch.json for debugging support:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python Debugger: Current File",
      "type": "debugpy",
      "request": "launch",
      "module": "transforms.run",
      "args": ["${file}", "dev,master"],
      "console": "integratedTerminal"
    }
  ]
}

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.
