A CLI for the containerized data orchestration world

Fyoo is a consistent, extendable, templated CLI for dataflow operations. Fyoo makes sure that the individual tasks in data orchestration behave in the same way, so that every building block is easily understood and glued together.

Using Flows

You can install Fyoo from PyPI:

pip install fyoo

Note: When building applications, a deterministic dependency tool such as Pipenv is recommended.

Fyoo provides two main features for those using the Fyoo CLI:

  • Consistent templating

  • Easy resource configuration

Templated Arguments

The simplest built-in Flow (subparser) is hello, which takes an optional --message argument. Every string-typed argument on every Flow is templated.

fyoo hello --message 'The date is {{ date() }}'
# The date is 2020-02-25

Arguments to fyoo itself precede the Flow subcommand, so you provide context the same way for every Flow. The arguments to fyoo are always the same, while each Flow subcommand defines whatever arguments it needs.

# `hello` has an optional argument --message,
# and `touch` has a required argument filename.
# But both receive context from the --jinja-context
# argument on `fyoo`.

fyoo --jinja-context='{"a": "any_var"}' \
  hello --message 'Hello {{ a }}!'
# Hello any_var!

fyoo --jinja-context='{"a": "any_var"}' \
  touch '{{ a }}.txt'
ls -Ut | head -1
# any_var.txt
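The way options on fyoo precede the Flow subcommand can be sketched with plain argparse. This is only an illustration of the command-line shape, not Fyoo's actual parser: the program name `toy`, the `build_parser` helper, and the default message are all hypothetical, and no templating is performed here.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Top-level options are shared; each subcommand adds its own."""
    parser = argparse.ArgumentParser(prog="toy")  # hypothetical program name
    # Shared option: must appear before the subcommand name.
    parser.add_argument("--jinja-context", type=json.loads, default={})

    subparsers = parser.add_subparsers(dest="flow", required=True)

    hello = subparsers.add_parser("hello")
    # The default text is a placeholder, not Fyoo's real default.
    hello.add_argument("--message", default="placeholder message")

    touch = subparsers.add_parser("touch")
    touch.add_argument("filename")
    return parser


args = build_parser().parse_args(
    ["--jinja-context", '{"a": "any_var"}', "hello", "--message", "Hello {{ a }}!"]
)
# The context is parsed once at the top level; the message is still raw text
# that a templating step could later render with that context.
print(args.flow, args.jinja_context, args.message)
```

Because the context option lives on the top-level parser, every subcommand receives it in the same namespace without declaring it again.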

Resource Configuration

Note: Run postgres in the background if you’d like to try the following examples:

docker run --name fyoo-pg \
    -e POSTGRES_PASSWORD=secretpass \
    -p 5432:5432 \
    -d postgres

Fyoo resources are configured the same way for all Flows: add a section for the resource to a fyoo.ini file and run Fyoo from the directory that contains it.

# fyoo.ini

[postgres]
username = postgres
password = %(FYOO_POSTGRES_PASSWORD)s
host = 127.0.0.1

Then provide the password at runtime through the environment:

FYOO_POSTGRES_PASSWORD=secretpass \
fyoo \
  postgres_query_to_csv_file \
  "select '{{ date() }}' as d" out.csv
cat out.csv
# d
# "2020-01-01"
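The `%(FYOO_POSTGRES_PASSWORD)s` placeholder uses the same syntax as Python's configparser interpolation. The sketch below shows one way such a value could be resolved from the environment using only the standard library. It is an assumption about the mechanism, not Fyoo's actual loader; `load_resource` and `INI_TEXT` are hypothetical names.

```python
import configparser
import os

INI_TEXT = """
[postgres]
username = postgres
password = %(FYOO_POSTGRES_PASSWORD)s
host = 127.0.0.1
"""


def load_resource(ini_text, section, environ=None):
    """Return one ini section as a dict, filling %(NAME)s from the environment."""
    environ = os.environ if environ is None else environ
    # Offer only FYOO_-prefixed variables, so unrelated variables such as
    # HOST or USERNAME cannot shadow keys written in the ini file.
    env_vars = {k: v for k, v in environ.items() if k.startswith("FYOO_")}

    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # `vars` supplies extra values for interpolation; configparser lowercases
    # names on both sides, so the upper-case placeholder still matches.
    return {
        option: parser.get(section, option, vars=env_vars)
        for option in parser.options(section)
    }


config = load_resource(
    INI_TEXT, "postgres", {"FYOO_POSTGRES_PASSWORD": "secretpass"}
)
print(config["password"])
```

Keeping the secret out of the file and in the environment means the same fyoo.ini can be committed and reused across environments.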

Running it All Together

The real power of Fyoo shows when you combine templating and resources. Template and resource specifications are generally static, so they can and should be set declaratively, with only the resource credentials provided at runtime. This means that the executable's arguments never change.

Here is an example that puts it all together. We use the contents of a SQL template file to run a query, and write the result to a CSV file named for the current date.

-- table_counter.tpl.sql

{% for i in range(0, num) %}
  {% if not loop.first %}union all{% endif %}
  select {{ i }} as a
{% endfor %}

With the template saved, pass its contents as the sql argument:

FYOO_POSTGRES_PASSWORD=secretpass \
fyoo \
  --jinja-context '{"num": 5}' \
  postgres_query_to_csv_file \
  "$(cat table_counter.tpl.sql)" \
  'results-{{ date() }}.csv'
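To make the template's loop concrete, here is a plain-Python equivalent that builds the statement the template should expand to for a given `num`. It deliberately does not use Jinja, so whitespace will differ from the real rendering; `expand_table_counter` is a hypothetical helper for illustration only.

```python
def expand_table_counter(num: int) -> str:
    """Build the SQL that table_counter.tpl.sql describes, whitespace aside."""
    # One select per loop iteration, i = 0 .. num - 1.
    selects = [f"select {i} as a" for i in range(0, num)]
    # "union all" is emitted before every select except the first,
    # which is the same as joining the selects with it.
    return "\nunion all\n".join(selects)


print(expand_table_counter(3))
# select 0 as a
# union all
# select 1 as a
# union all
# select 2 as a
```

With `{"num": 5}` as in the command above, the query would return five rows, with values 0 through 4 in column a.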

Building Flows

Flows are Fyoo’s subcommands, written as functions. Fyoo decorators let you build custom CLIs quickly. To write a Flow, you only need to know which arguments and which FyooResources it will use. There are three main decorators.

@fyoo.flow will do one thing:

  1. Usage: Expose your Flow function as a CLI subcommand of fyoo

Once you have a Flow, @fyoo.argument will do two things if your Flow needs arguments:

  1. Usage: Add an argparse argument to the Flow CLI

  2. Implementation: Add a templated version of that CLI argument as a keyword argument to the Flow function.

Lastly, @fyoo.resource will do one thing if your Flow needs a resource:

  1. Usage: Add that resource as a keyword argument to the Flow function, based on the contents of fyoo.ini.

Here is a minimal version of fyoo postgres_query_to_csv_file, with fewer optional arguments than the real one:

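import csv

import fyoo
# Assumed: Connection and ResultProxy are the SQLAlchemy types.
from sqlalchemy.engine import Connection, ResultProxy
# PostgresResource is provided by Fyoo; its import is omitted here.


@fyoo.argument('--query-batch-size', type=int, default=10_000)
@fyoo.argument('target')
@fyoo.argument('sql')
@fyoo.resource(PostgresResource)
@fyoo.flow()
def postgres_query_to_csv_file(
        postgres: Connection,
        sql: str,
        target: str,
        query_batch_size: int,
):
    # Run the (already templated) SQL on the configured connection.
    result_proxy: ResultProxy = postgres.execute(sql)

    # newline='' lets the csv module control line endings.
    with open(target, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(result_proxy.keys())  # header row
        # Fetch in batches so large results are not held in memory at once.
        while result_proxy.returns_rows:
            rows = result_proxy.fetchmany(query_batch_size)
            if not rows:
                break
            writer.writerows(rows)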
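The decorator stacking above can be mimicked with a small standard-library sketch. This is a toy model under stated assumptions, not Fyoo's implementation: `ToyFlow`, `toy_flow`, `toy_argument`, `run`, and `copy_query` are hypothetical names, and templating and resources are left out.

```python
import argparse


class ToyFlow:
    """Hypothetical stand-in for whatever object a flow decorator produces."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__
        self.arg_specs = []  # (args, kwargs) pairs for add_argument


def toy_flow():
    def decorate(func):
        return ToyFlow(func)
    return decorate


def toy_argument(*args, **kwargs):
    def decorate(flow):
        flow.arg_specs.append((args, kwargs))
        return flow
    return decorate


def run(flows, argv):
    """Expose each flow as a subcommand, then call the selected one."""
    parser = argparse.ArgumentParser(prog="toy")
    subparsers = parser.add_subparsers(dest="flow_name", required=True)
    for flow in flows:
        sub = subparsers.add_parser(flow.name)
        for args, kwargs in flow.arg_specs:
            sub.add_argument(*args, **kwargs)
        sub.set_defaults(flow=flow)
    namespace = vars(parser.parse_args(argv))
    flow = namespace.pop("flow")
    namespace.pop("flow_name")
    # Remaining entries are exactly the declared arguments.
    return flow.func(**namespace)


# Decorators apply bottom-up, so 'sql' is registered before 'target'.
@toy_argument("--query-batch-size", type=int, default=10_000)
@toy_argument("target")
@toy_argument("sql")
@toy_flow()
def copy_query(sql, target, query_batch_size):
    return (sql, target, query_batch_size)


result = run([copy_query], ["copy_query", "select 1", "out.csv"])
print(result)
```

In this toy model the bottom-up application order is why `sql` is listed below `target` yet comes first on the command line. Whether Fyoo relies on the same ordering is an assumption, though it matches the usage shown earlier.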
