fal allows you to run Python scripts directly from your dbt project.

These details have been verified by PyPI

Maintainers

burkay_fal chamini2 fal.ai gorkemyurt

Project description

fal: do more with dbt

fal allows you to run Python scripts directly from your dbt project.

Sign up for the private beta of fal Cloud
See our March Roadmap and give us feedback.

With fal, you can:

Send Slack notifications upon dbt model success or failure.
Download dbt models into a Python context with a familiar syntax: ref('my_dbt_model')
Use Python libraries such as sklearn or prophet to build more complex pipelines downstream of dbt models.

and more...

Check out our Getting Started guide for a quickstart, head to our documentation site for a deeper dive, or play with the in-depth examples to see how fal can help you get more done with dbt.

Getting Started

1. Install fal

$ pip install fal

2. Go to your dbt directory

$ cd ~/src/my_dbt_project

3. Create a Python script: `send_slack_message.py`

import os
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

CHANNEL_ID = os.getenv("SLACK_BOT_CHANNEL")
SLACK_TOKEN = os.getenv("SLACK_BOT_TOKEN")

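client = WebClient(token=SLACK_TOKEN)
# `context` is provided by fal when it runs this script; it is not imported
message_text = f"Model: {context.current_model.name}. Status: {context.current_model.status}."

try:
    response = client.chat_postMessage(
        channel=CHANNEL_ID,
        text=message_text
    )
except SlackApiError as e:
    # Slack reports API failures through the "error" field of the response
    assert e.response["error"]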

4. Add a `meta` section in your `schema.yml`

models:
  - name: historical_ozone_levels
    description: Ozone levels
    config:
      materialized: table
    columns:
      - name: ozone_level
        description: Ozone level
      - name: ds
        description: Date
    meta:
      fal:
        scripts:
          - send_slack_message.py

5. Run `dbt` and `fal` consecutively

$ dbt run
# Your dbt models are run

$ fal run
# Your python scripts are run
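
The two commands can also be chained from a small Python wrapper, for example inside a scheduler task. The following is a minimal sketch and not part of fal: `run_pipeline` and its `runner` parameter are hypothetical names, and the only assumption is that the `dbt` and `fal` executables are on your PATH.

```python
import subprocess
from typing import Callable, List, Sequence


def run_pipeline(
    select: Sequence[str] = (),
    runner: Callable[[List[str]], object] = lambda cmd: subprocess.run(cmd, check=True),
) -> List[List[str]]:
    """Run `dbt run` and then `fal run`, stopping at the first failure."""
    selection = ["--select", *select] if select else []
    commands = [["dbt", "run", *selection], ["fal", "run", *selection]]
    for cmd in commands:
        # With the default runner, a non-zero exit code raises
        # subprocess.CalledProcessError, so fal is not started after a failed dbt run.
        runner(cmd)
    return commands
```

Calling `run_pipeline()` from the dbt project directory runs both steps; `run_pipeline(["boston"])` passes the same `--select` value to both commands.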

Examples

To explore what is possible with fal, take a look at the in-depth examples in the examples directory. We will be adding more examples there over time.

How it works

fal is a command line tool that can read the state of your dbt project and help you run Python scripts after your dbt runs by leveraging the meta config.

models:
  - name: historical_ozone_levels
    ...
    meta:
      fal:
        scripts:
          - send_slack_message.py
          - another_python_script.py # will be run after the first script

fal also provides useful helpers within the Python context to seamlessly interact with dbt models: ref("my_dbt_model_name") will pull a dbt model into your Python script as a pandas.DataFrame.

Model scripts selection

By default, the fal run command runs the Python scripts as a post-hook, only on the models that were run on the last dbt run; that means that if you are using model selectors, fal will only run on the models dbt ran. To achieve this, fal needs the dbt-generated file run_results.json available.
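
To make the role of run_results.json concrete, here is a rough sketch of how the models from the last dbt run can be read from that file. It illustrates the idea and is not fal's actual implementation: the function names are hypothetical, and the code only assumes the `results[].unique_id` and `results[].status` fields of dbt's run_results.json artifact.

```python
import json
from pathlib import Path
from typing import Any, Dict


def models_from_results(run_results: Dict[str, Any]) -> Dict[str, str]:
    """Map each model name in a parsed run_results.json to its status."""
    statuses = {}
    for result in run_results.get("results", []):
        # unique_id looks like "model.<project_name>.<model_name>"
        parts = result["unique_id"].split(".")
        if parts[0] == "model":
            statuses[parts[-1]] = result["status"]
    return statuses


def models_in_last_run(path: str = "target/run_results.json") -> Dict[str, str]:
    """Load the artifact from disk (dbt writes it to target/ by default)."""
    return models_from_results(json.loads(Path(path).read_text()))
```

A model that dbt did not run is simply absent from the returned mapping, which is why scripts attached to it are skipped by default.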

If you are running fal in a clean environment (no run_results.json available) or just want to specify which models you want to run the scripts for, fal handles dbt's selection flags for dbt run as well as offering an extra flag for just running all models:

--all                 Run scripts for all models.
-s SELECT [SELECT ...], --select SELECT [SELECT ...]
                      Specify the nodes to include.
-m SELECT [SELECT ...], --models SELECT [SELECT ...]
                      Specify the nodes to include.
--exclude EXCLUDE [EXCLUDE ...]
                      Specify the nodes to exclude.
--selector SELECTOR   The selector name to use, as defined in selectors.yml

You may pass more than one selection at a time:

$ fal run --select model_alpha model_beta
... | Starting fal run for following models and scripts:
model_alpha: script.py
model_beta: script.py, other.py

Running scripts before dbt runs

The --before flag lets users run scripts before their dbt runs.

Given the following schema.yml:

models:
  - name: boston
    description: Ozone levels
    config:
      materialized: table
    meta:
      owner: "@meder"
      fal:
        scripts:
          before:
            - fal_scripts/postgres.py
          after:
            - fal_scripts/slack.py

fal run --before will run the fal_scripts/postgres.py script regardless of whether dbt has calculated the boston model. fal run without the --before flag will run fal_scripts/slack.py, but only if the boston model has already been calculated by dbt.

A typical workflow involves running dbt run after invoking fal run --before.

$ fal run --before --select boston
$ dbt run --select boston
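
The selection rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than fal's own code: `scripts_to_run` is a hypothetical helper, the model entry is assumed to be a dict already parsed from schema.yml, and a plain list under `scripts` is treated as post-hook only.

```python
from typing import Any, Dict, List


def scripts_to_run(model: Dict[str, Any], before: bool = False) -> List[str]:
    """Return the scripts configured under meta.fal.scripts for one model entry."""
    scripts = model.get("meta", {}).get("fal", {}).get("scripts", [])
    if isinstance(scripts, dict):
        # Mapping form: explicit `before` and `after` lists
        return list(scripts.get("before" if before else "after", []))
    # Plain list form: every script is a post-hook (assumption: none run with --before)
    return [] if before else list(scripts)


boston = {
    "name": "boston",
    "meta": {
        "owner": "@meder",
        "fal": {
            "scripts": {
                "before": ["fal_scripts/postgres.py"],
                "after": ["fal_scripts/slack.py"],
            }
        },
    },
}
print(scripts_to_run(boston, before=True))
print(scripts_to_run(boston))
```

With the boston entry, the first call selects fal_scripts/postgres.py and the second selects fal_scripts/slack.py, matching the two fal run invocations.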

Concepts

profiles.yml and Credentials

fal integrates with dbt's profiles.yml file to access and read data from the data warehouse. Once you set up credentials in your profiles.yml file for your existing dbt workflows, fal authenticates with those credentials any time you use ref or source to create a dataframe.

`meta` Syntax

models:
  - name: historical_ozone_levels
    ...
    meta:
      owner: "@me"
      fal:
        scripts:
          - send_slack_message.py
          - another_python_script.py # will be run sequentially

Use the fal and scripts keys underneath the meta config to tell the fal CLI where to look for the Python scripts. You can pass a list of scripts, as shown above, to run one or more scripts as a post-hook operation after a dbt run.

Variables and functions

Inside a Python script, you get access to some useful variables and functions.

Variables

A context object with information relevant to the model through which the script was run. For the meta Syntax example, we would get the following:

context.current_model.name
#= historical_ozone_levels

context.current_model.meta
#= {'owner': '@me'}

context.current_model.meta.get("owner")
#= '@me'

context.current_model.status
# Could be one of
#= 'success'
#= 'error'
#= 'skipped'
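
As a small illustration of how these attributes can be combined, the sketch below formats a notification line from a model's name, status, and meta. `build_message` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in for the `context.current_model` that fal provides inside a script.

```python
from types import SimpleNamespace


def build_message(current_model) -> str:
    """Format a one-line notification for a model's run status."""
    labels = {"success": "OK", "error": "FAILED", "skipped": "SKIPPED"}
    label = labels.get(current_model.status, "UNKNOWN")
    owner = current_model.meta.get("owner")
    mention = f" (owner: {owner})" if owner else ""
    return (
        f"[{label}] Model: {current_model.name}. "
        f"Status: {current_model.status}.{mention}"
    )


# Stand-in for the object fal injects; inside a fal script you would call
# build_message(context.current_model) instead.
example_model = SimpleNamespace(
    name="historical_ozone_levels", status="success", meta={"owner": "@me"}
)
print(build_message(example_model))
```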

The context object also has access to test information related to the current model. If the previous dbt command was either test or build, the context.current_model.tests property is populated with a list of tests:

context.current_model.tests
#= [CurrentTest(name='not_null', modelname='historical_ozone_levels', column='ds', status='Pass')]
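
One possible use of this list is to report only the tests that did not pass. The sketch below is hedged accordingly: the `CurrentTest` namedtuple is a hypothetical stand-in that mirrors the fields shown above, `failing_tests` is an illustrative helper, and `'Fail'` is an assumed status value, since only `'Pass'` appears in this document.

```python
from collections import namedtuple
from typing import Iterable, List

# Hypothetical stand-in with the fields shown above; in a fal script the list
# comes from `context.current_model.tests`.
CurrentTest = namedtuple("CurrentTest", ["name", "modelname", "column", "status"])


def failing_tests(tests: Iterable[CurrentTest]) -> List[CurrentTest]:
    """Keep every test whose status is anything other than 'Pass'."""
    return [test for test in tests if test.status != "Pass"]


example_tests = [
    CurrentTest("not_null", "historical_ozone_levels", "ds", "Pass"),
    # 'Fail' is an assumed value used only for illustration
    CurrentTest("unique", "historical_ozone_levels", "ds", "Fail"),
]
for test in failing_tests(example_tests):
    print(f"{test.modelname}.{test.column}: {test.name} -> {test.status}")
```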

`ref` and `source` functions

Some familiar functions from dbt are also available:

# Refer to dbt models or sources by name and get them back as `pandas.DataFrame`
ref('model_name')
source('source_name', 'table_name')

# You can use it to get the running model data
ref(context.current_model.name)

`write_to_source` function

It is also possible to send data back to your data warehouse. This makes it easy to get the data, process it and upload it back into dbt territory.

All you have to do is define the target source in your schema and use it in fal. This operation appends to the existing source by default and should only be used when targeting tables, not views.

# Upload a `pandas.DataFrame` back to the data warehouse
write_to_source(df, 'source_name', 'table_name2')

write_to_source also accepts an optional dtype argument, which lets you specify the data types of columns. It works the same way as the dtype argument of the DataFrame.to_sql function.

from sqlalchemy.types import Integer
# Upload but specifically create the `value` column with type `integer`
# Can be useful if data has `None` values
write_to_source(df, 'source', 'table', dtype={'value': Integer()})

Lifecycle and State Management

By default, the fal run command runs the Python scripts as a post-hook, only on the models that were run on the last dbt run. If you are using model selectors, fal will only run on the selected models.

If you want to run all Python scripts regardless, you can do so by using the --all flag with the fal CLI:

$ fal run --all

Importing `fal` as a Python package

You may be interested in accessing dbt models and sources easily from a Jupyter Notebook or another Python script. For that, just import the fal package and instantiate a FalDbt project:

from fal import FalDbt
faldbt = FalDbt(profiles_dir="~/.dbt", project_dir="../my_project")

faldbt.list_sources()
# [['results', 'ticket_data_sentiment_analysis']]

faldbt.list_models()
# {
#   'zendesk_ticket_metrics': <RunStatus.Success: 'success'>,
#   'stg_o3values': <RunStatus.Success: 'success'>,
#   'stg_zendesk_ticket_data': <RunStatus.Success: 'success'>,
#   'stg_counties': <RunStatus.Success: 'success'>
# }

sentiments = faldbt.source('results', 'ticket_data_sentiment_analysis')
# pandas.DataFrame
tickets = faldbt.ref('stg_zendesk_ticket_data')
# pandas.DataFrame

Supported `dbt` versions

No extra configuration is needed to work with different dbt versions. The latest fal version currently supports:

0.20.*
0.21.*
1.0.*

If you need another version, open an issue and we will take a look!

Contributing / Development

We use Poetry for dependency management and easy development testing.

Use Poetry shell to try your changes right away:

~ $ cd fal

~/fal $ poetry install

~/fal $ poetry shell
Spawning shell within [...]/fal-eFX98vrn-py3.8

~/fal fal-eFX98vrn-py3.8 $ cd ../dbt_project

~/dbt_project fal-eFX98vrn-py3.8 $ fal run
19:27:30  Found 5 models, 0 tests, 0 snapshots, 0 analyses, 165 macros, 0 operations, 0 seed files, 1 source, 0 exposures, 0 metrics
19:27:30 | Starting fal run for following models and scripts:
[...]

Running tests

Tests rely on a Postgres database being present. You can start one with docker-compose:

~/fal $ docker-compose -f tests/docker-compose.yml up -d
Creating network "tests_default" with the default driver
Creating fal_db ... done

# Necessary for the import test
~/fal $ dbt run --profiles-dir tests/mock/mockProfile --project-dir tests/mock
Running with dbt=1.0.1
[...]
Completed successfully
Done. PASS=5 WARN=0 ERROR=0 SKIP=0 TOTAL=5

~/fal $ pytest -s

Why are we building this?

We think dbt is great because it empowers data people to get more done with the tools that they are already familiar with.

dbt's SQL-only design is powerful, but if you ever want to get out of SQL-land and connect to external services or get into Python-land for any reason, you will have a hard time. We built fal to enable Python workloads (sending alerts to Slack, building predictive models, pushing data to non-data-warehouse destinations and more) right within dbt.

This library will form the basis of our attempt to more comprehensively enable data science workloads downstream of dbt. And because having reliable data pipelines is the most important ingredient in building predictive analytics, we are building a library that integrates well with dbt.

Have feedback or need help?

Join us in #fal on Discord

Release history

1.0.2

May 10, 2024

1.0.1

May 10, 2024

1.0.0

May 10, 2024

0.15.2

May 6, 2024

0.15.0

Apr 23, 2024

0.14.0

Apr 11, 2024

0.13.0

Mar 29, 2024

0.12.7

Mar 29, 2024

0.12.6

Mar 28, 2024

0.12.5

Mar 20, 2024

0.12.4

Mar 19, 2024

0.12.3

Mar 7, 2024

0.12.2

Feb 6, 2024

0.12.1

Jan 22, 2024

0.12.0

Jan 19, 2024

0.11.6

Jan 15, 2024

0.11.5

Jan 15, 2024

0.11.4

Jan 15, 2024

0.11.3

Dec 14, 2023

0.11.2

Dec 1, 2023

0.11.1

Nov 10, 2023

0.11.0

Nov 10, 2023

0.10.11

Nov 2, 2023

0.10.10

Oct 10, 2023

0.10.9

Sep 25, 2023

0.10.8

Sep 6, 2023

0.10.7

Aug 28, 2023

0.10.6

Aug 24, 2023

0.10.5

Aug 24, 2023

0.10.4

Aug 24, 2023

0.10.3

Aug 24, 2023

0.10.1

Aug 23, 2023

0.10.0

Aug 21, 2023

0.9.5

Jul 17, 2023

0.9.4

Jul 17, 2023

0.9.3

Jul 17, 2023

0.9.2

Jun 28, 2023

0.9.1

May 16, 2023

0.9.0

May 12, 2023

0.8.6

Apr 25, 2023

0.8.5

Apr 21, 2023

0.8.4

Apr 4, 2023

0.8.3

Mar 28, 2023

0.8.2

Mar 14, 2023

0.8.1

Feb 14, 2023

0.8.0

Feb 10, 2023

0.7.7

Jan 12, 2023

0.7.6

Dec 22, 2022

0.7.5

Dec 16, 2022

0.7.4

Dec 14, 2022

0.7.3

Nov 11, 2022

0.7.2

Nov 1, 2022

0.7.1

Oct 25, 2022

0.7.0

Oct 13, 2022

0.6.1 yanked

Oct 13, 2022

Reason this release was yanked:

Should have been a minor, not a patch

0.6.0

Sep 21, 2022

0.5.2

Aug 10, 2022

0.5.1

Aug 10, 2022

0.5.0

Aug 8, 2022

0.4.1

Aug 3, 2022

0.4.0

Jul 26, 2022

0.3.6

Jul 13, 2022

0.3.5

Jun 30, 2022

0.3.4

Jun 27, 2022

0.3.3

Jun 22, 2022

0.3.2

Jun 21, 2022

0.3.1

Jun 14, 2022

0.3.0

Jun 3, 2022

0.2.20

May 30, 2022

0.2.19

May 19, 2022

0.2.18

May 10, 2022

0.2.17

May 6, 2022

0.2.16

Apr 29, 2022

0.2.15

Apr 22, 2022

0.2.14

Apr 19, 2022

0.2.13

Apr 19, 2022

This version

0.2.12

Apr 11, 2022

0.2.11

Mar 29, 2022

0.2.10

Mar 24, 2022

0.2.9

Mar 21, 2022

0.2.8

Mar 19, 2022

0.2.7

Feb 28, 2022

0.2.6

Feb 24, 2022

0.2.5

Feb 22, 2022

0.2.4

Feb 18, 2022

0.2.3

Feb 18, 2022

0.2.2

Feb 17, 2022

0.2.1

Feb 17, 2022

0.2.0

Feb 17, 2022

0.1.39

Feb 16, 2022

0.1.38

Jan 25, 2022

0.1.37

Jan 10, 2022

0.1.36

Dec 25, 2021

0.1.35

Dec 24, 2021

0.1.34

Dec 16, 2021

0.1.33

Dec 15, 2021

0.1.32

Dec 14, 2021

0.1.31

Dec 14, 2021

0.1.30

Nov 30, 2021

0.1.29

Nov 18, 2021

0.1.28

Nov 17, 2021

0.1.27

Nov 16, 2021

0.1.26

Nov 12, 2021

0.1.19

Nov 11, 2021

0.1.2

Nov 16, 2021

0.1.1

Nov 16, 2021

0.1.0

Mar 25, 2021

Download files

Source Distribution

fal-0.2.12.tar.gz (48.8 kB)

Uploaded Apr 11, 2022 Source

Built Distribution

fal-0.2.12-py3-none-any.whl (51.8 kB)

Uploaded Apr 11, 2022 Python 3

Hashes for fal-0.2.12.tar.gz
Algorithm	Hash digest
SHA256	`c9ce3776af7b06cd7f31c9cb636d067eca611f88ed3e0542b2809b75eaa69f65`
MD5	`4aa83463719aafa808bee1955c973c35`
BLAKE2b-256	`883007f5abb96d4efab2a3ccf9b5da5e1126a1b463e5545b8d9e05c67f7c0294`

Hashes for fal-0.2.12-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c337d012abf0e083b64b8fadd513b73d5a372f27d7f419d03d979bd1563b56ad`
MD5	`ff7d6850bf35ffe814528ebc30aadac2`
BLAKE2b-256	`3c1bd81694fbb16e7dc1e800cdccf96ec8f873e7244c64133e7e2c4e6a128a75`
