fal: do more with dbt

fal allows you to run python scripts directly from your dbt project.

With fal, you can:

  • Send Slack notifications upon dbt model success or failure.
  • Download dbt models into a Python context with a familiar syntax: ref('my_dbt_model')
  • Use python libraries such as sklearn or prophet to build more complex pipelines downstream of dbt models.

and more...

Check out our Getting Started guide to get a quickstart or play with in-depth examples to see how fal can help you get more done with dbt.

Getting Started

1. Install fal

$ pip install fal

2. Go to your dbt directory

$ cd ~/src/my_dbt_project

3. Create a python script: send_slack_message.py

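import os
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

# Channel ID and bot token are read from the environment
CHANNEL_ID = os.getenv("SLACK_BOT_CHANNEL")
SLACK_TOKEN = os.getenv("SLACK_BOT_TOKEN")

client = WebClient(token=SLACK_TOKEN)

# `context` is provided by fal when it runs this script
message_text = f"Model: {context.current_model.name}. Status: {context.current_model.status}."

try:
    response = client.chat_postMessage(
        channel=CHANNEL_ID,
        text=message_text
    )
except SlackApiError as e:
    # Slack reports the failure reason in the "error" field of the response
    print(f"Failed to send Slack message: {e.response['error']}")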

4. Add a meta section in your schema.yml

models:
  - name: historical_ozone_levels
    description: Ozone levels
    config:
      materialized: table
    columns:
      - name: ozone_level
        description: Ozone level
      - name: ds
        description: Date
    meta:
      fal:
        scripts:
          - send_slack_message.py

5. Run dbt and fal consecutively

$ dbt run
# Your dbt models are run

$ fal run
# Your python scripts are run
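
If you prefer to drive both commands from Python (for example from a scheduler), the two steps above can be sketched as follows. This is a minimal sketch, not part of fal: the function name run_dbt_then_fal and the injectable runner parameter are hypothetical, and running fal even when dbt exits with a non-zero code is a choice made for this sketch so that scripts can still report failures.

```python
import subprocess
from typing import Callable, Sequence, Tuple


def _run(cmd: Sequence[str]) -> int:
    """Run a command in the current directory and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_dbt_then_fal(
    run_all: bool = False,
    runner: Callable[[Sequence[str]], int] = _run,
) -> Tuple[int, int]:
    """Run `dbt run` followed by `fal run` (hypothetical helper).

    `runner` is injectable so the sequencing can be exercised without
    dbt or fal being installed.
    """
    dbt_code = runner(["dbt", "run"])

    # `--all` is the flag described in this README for running every script
    fal_cmd = ["fal", "run"] + (["--all"] if run_all else [])

    # Assumption of this sketch: fal is invoked regardless of dbt's exit code
    fal_code = runner(fal_cmd)
    return dbt_code, fal_code


# Usage, from inside the dbt project directory:
# dbt_code, fal_code = run_dbt_then_fal()
```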

Examples

To explore what is possible with fal, take a look at the in-depth examples below. We will be adding more examples here over time:

How it works

fal is a command line tool that can read the state of your dbt project and help you run Python scripts after your dbt runs by leveraging the meta config.

models:
  - name: historical_ozone_levels
    ...
    meta:
      fal:
        scripts:
          - send_slack_message.py
          - another_python_script.py # will be run after the first script

By default, the fal run command runs the Python scripts as a post-hook, only on the models that were run in the last dbt run. If you are using model selectors, fal will only run on the selected models. If you want to run all Python scripts regardless, you can use the --all flag with the fal CLI.

fal also provides useful helpers within the Python context to seamlessly interact with dbt models: ref("my_dbt_model_name") will pull a dbt model into your Python script as a pandas.DataFrame.

Concepts

profiles.yml and Credentials

fal integrates with dbt's profiles.yml file to access and read data from the data warehouse. Once you have set up credentials in profiles.yml for your existing dbt workflows, fal authenticates with those same credentials whenever you use ref or source to create a DataFrame.

meta Syntax

models:
  - name: historical_ozone_levels
    ...
    meta:
      owner: "@me"
      fal:
        scripts:
          - send_slack_message.py
          - another_python_script.py # will be run sequentially

Use the fal and scripts keys underneath the meta config to tell the fal CLI where to look for the Python scripts. You can pass a list of scripts, as shown above, to run one or more scripts as a post-hook operation after a dbt run.
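
To make the shape of this config concrete, here is a minimal sketch of reading the script list out of a model entry that has already been parsed into a Python dict. It is an illustration only, not fal's own implementation; the helper name scripts_for_model is hypothetical.

```python
from typing import Any, Dict, List


def scripts_for_model(model: Dict[str, Any]) -> List[str]:
    """Return the scripts listed under meta -> fal -> scripts, in order.

    Hypothetical helper: missing keys are treated as "no scripts".
    """
    meta = model.get("meta") or {}
    fal_config = meta.get("fal") or {}
    return list(fal_config.get("scripts") or [])


# The model entry from the example above, as a parsed dict
model = {
    "name": "historical_ozone_levels",
    "meta": {
        "owner": "@me",
        "fal": {
            "scripts": ["send_slack_message.py", "another_python_script.py"],
        },
    },
}

for script in scripts_for_model(model):
    print(script)
```

Other keys under meta, such as owner, sit next to the fal key and are not touched by this lookup.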

Variables and functions

Inside a Python script, you get access to some useful variables and functions.

Variables

A context object with information relevant to the model through which the script was run. For the meta Syntax example, we would get the following:

context.current_model.name
#= historical_ozone_levels

context.current_model.meta
#= {'owner': '@me'}

context.current_model.meta.get("owner")
#= '@me'

context.current_model.status
# Could be one of
#= 'success'
#= 'error'
#= 'skipped'
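
As an illustration of how these variables might be combined, the sketch below builds a notification message from a context-like object. Outside of fal there is no context variable, so a SimpleNamespace stand-in is used here; the build_message helper is hypothetical.

```python
from types import SimpleNamespace


def build_message(context) -> str:
    """Build a one-line status message from a fal-style context object."""
    model = context.current_model
    owner = model.meta.get("owner", "unknown")
    return f"Model: {model.name}. Status: {model.status}. Owner: {owner}."


# Stand-in for the `context` object that fal provides inside a script
fake_context = SimpleNamespace(
    current_model=SimpleNamespace(
        name="historical_ozone_levels",
        status="error",
        meta={"owner": "@me"},
    )
)

print(build_message(fake_context))
# Model: historical_ozone_levels. Status: error. Owner: @me.
```

Inside a real fal script you would pass the provided context object instead of the stand-in.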

ref and source functions

Some familiar functions from dbt are also available:

# Refer to a dbt model or source by name and get it back as a `pandas.DataFrame`
ref('model_name')
source('source_name', 'table_name')

# You can use it to get the running model data
ref(context.current_model.name)

write_to_source function

It is also possible to send data back to your data warehouse. This makes it easy to get the data, process it and upload it back into dbt territory.

All you have to do is define the target source in your schema and use it in fal. This operation appends to the existing source by default and should only be used to target tables, not views.

# Upload a `pandas.DataFrame` back to the data warehouse
write_to_source(df, 'source_name', 'table_name2')
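
A typical round trip is to pull a model with ref, transform it with pandas, and append the result with write_to_source. The sketch below shows only the transformation step so that it can run outside fal. The add_rolling_mean helper and the sample data are hypothetical, and the column names ds and ozone_level are taken from the schema.yml example earlier in this document.

```python
import pandas as pd


def add_rolling_mean(df: pd.DataFrame, window: int = 2) -> pd.DataFrame:
    """Sort by date and add a rolling mean of ozone_level as a new column."""
    out = df.sort_values("ds").reset_index(drop=True)
    out["ozone_level_rolling_mean"] = (
        out["ozone_level"].rolling(window=window, min_periods=1).mean()
    )
    return out


# Made-up sample data, standing in for ref("historical_ozone_levels")
sample = pd.DataFrame(
    {
        "ds": ["2021-01-02", "2021-01-01", "2021-01-03"],
        "ozone_level": [3.0, 1.0, 5.0],
    }
)
result = add_rolling_mean(sample)
print(result)

# Inside a fal script, where ref and write_to_source are available:
# df = ref("historical_ozone_levels")
# write_to_source(add_rolling_mean(df), "source_name", "table_name2")
```

Because write_to_source appends by default, running the script repeatedly will add the rows again on each run.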

Lifecycle and State Management

By default, the fal run command runs the Python scripts as a post-hook, only on the models that were run in the last dbt run. If you are using model selectors, fal will only run on the selected models.

If you want to run all Python scripts regardless, you can do so by using the --all flag with the fal CLI:

$ fal run --all
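
The selection rule described above can be pictured with a small sketch. This is an illustration of the behavior as stated in this section, not fal's actual code; the function name and its arguments are hypothetical.

```python
from typing import Iterable, List, Set


def models_to_process(
    models_with_scripts: Iterable[str],
    last_run_models: Set[str],
    run_all: bool = False,
) -> List[str]:
    """Pick the models whose scripts should run.

    By default only models that were part of the last dbt run are kept;
    with run_all=True (the --all flag) every model with scripts is kept.
    """
    if run_all:
        return list(models_with_scripts)
    return [name for name in models_with_scripts if name in last_run_models]


with_scripts = ["historical_ozone_levels", "stg_counties"]
last_run = {"historical_ozone_levels"}

print(models_to_process(with_scripts, last_run))
# ['historical_ozone_levels']
print(models_to_process(with_scripts, last_run, run_all=True))
# ['historical_ozone_levels', 'stg_counties']
```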

Importing fal as a Python package

You may be interested in accessing dbt models and sources easily from a Jupyter Notebook or another Python script. For that, just import the fal package and instantiate a FalDbt project:

from fal import FalDbt
faldbt = FalDbt(profiles_dir="~/.dbt", project_dir="../my_project")

faldbt.list_sources()
# [['results', 'ticket_data_sentiment_analysis']]

faldbt.list_models()
# {
#   'zendesk_ticket_metrics': <RunStatus.Success: 'success'>,
#   'stg_o3values': <RunStatus.Success: 'success'>,
#   'stg_zendesk_ticket_data': <RunStatus.Success: 'success'>,
#   'stg_counties': <RunStatus.Success: 'success'>
# }

sentiments = faldbt.source('results', 'ticket_data_sentiment_analysis')
# pandas.DataFrame
tickets = faldbt.ref('stg_zendesk_ticket_data')
# pandas.DataFrame
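
Since list_models() returns a mapping of model names to run statuses, as in the output shown above, you can filter it with plain Python. The sketch below uses a stand-in RunStatus enum and a made-up status dict so that it runs without a dbt project; the Error and Skipped member names and the models_with_status helper are assumptions for illustration.

```python
from enum import Enum
from typing import Dict, List


class RunStatus(Enum):
    """Stand-in for the status enum seen in the list_models() output."""
    Success = "success"
    Error = "error"      # assumed member name
    Skipped = "skipped"  # assumed member name


def models_with_status(statuses: Dict[str, RunStatus], wanted: str) -> List[str]:
    """Return the sorted names of models whose status value equals `wanted`."""
    return sorted(name for name, status in statuses.items() if status.value == wanted)


# Made-up data in the same shape as faldbt.list_models()
statuses = {
    "zendesk_ticket_metrics": RunStatus.Success,
    "stg_o3values": RunStatus.Error,
    "stg_counties": RunStatus.Success,
}

print(models_with_status(statuses, "success"))
# ['stg_counties', 'zendesk_ticket_metrics']
```

With a real project you would pass the result of faldbt.list_models() instead of the made-up dict.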

Supported dbt versions

No extra configuration is needed to work with different dbt versions. The latest fal version currently supports:

  • 0.20.*
  • 0.21.*
  • 1.0.*

If you need another version, open an issue and we will take a look!
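
If you want to check an installed dbt version string against this list in your own tooling, a minimal sketch could look like the following. The helper is hypothetical and not part of fal; the patterns are copied from the list above and will go stale as support changes.

```python
from fnmatch import fnmatch

# Patterns copied from the supported versions listed in this README
SUPPORTED_DBT_PATTERNS = ("0.20.*", "0.21.*", "1.0.*")


def is_supported_dbt_version(version: str) -> bool:
    """Return True if `version` matches one of the supported patterns."""
    return any(fnmatch(version, pattern) for pattern in SUPPORTED_DBT_PATTERNS)


print(is_supported_dbt_version("1.0.3"))   # True
print(is_supported_dbt_version("0.19.2"))  # False
```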

Why are we building this?

We think dbt is great because it empowers data people to get more done with the tools that they are already familiar with.

dbt's SQL-only design is powerful, but if you ever want to get out of SQL-land and connect to external services or get into Python-land for any reason, you will have a hard time. We built fal to enable Python workloads (sending alerts to Slack, building predictive models, pushing data to non-data-warehouse destinations and more) right within dbt.

This library will form the basis of our attempt to more comprehensively enable data science workloads downstream of dbt. And because having reliable data pipelines is the most important ingredient in building predictive analytics, we are building a library that integrates well with dbt.

Have feedback or need help?

Join us in #fal on Discord
