An opinionated Data Engineering framework

Rony - Data Engineering made simple

Developed with ❤️ by A3Data

What is Rony

Rony is an open source framework that helps Data Engineers set up more organized code and build, test, and deploy data pipelines faster.

Why Rony?

Rony is Hermione's best friend (or so...). That made it a perfect name for the second framework released by A3Data, this one focused on Data Engineering.

Over many years of helping companies build their data analytics projects and cloud infrastructure, we acquired a knowledge base that led to a collection of code snippets and automation procedures that speed things up when developing data structures and data pipelines.

Some choices we made

Rony builds on a few decisions that make sense for the majority of projects conducted by A3Data:

You are free to change these decisions as you wish (that's the whole point of the framework - flexibility).

Installing

Dependencies

  • Python (>=3.6)

Install

pip install -U rony

Enabling autocompletion (unix users):

For bash:

echo 'eval "$(_RONY_COMPLETE=bash_source rony)"' >> ~/.bashrc

For Zsh:

echo 'eval "$(_RONY_COMPLETE=zsh_source rony)"' >> ~/.zshrc

How do I use Rony?

After installing Rony, you can check that the installation works by running:

rony info

and you should see a cute logo. Then:

  1. Create a new project:
rony new project_rony
  2. Rony creates a virtual environment for the project. Windows users can activate it with
<project_name>_env\Scripts\activate

Linux and macOS users can run

source <project_name>_env/bin/activate
  3. After activating the environment, install the project's libraries. A few suggestions are listed in the requirements.txt file:
pip install -r requirements.txt
  4. Rony also has some handy CLI commands to build and run Docker images locally. To build an image, run
cd etl
rony build <image_name>:<tag>

and then run it with

rony run <image_name>:<tag>

In this particular implementation, run.py contains simple ETL code that accepts a parameter to filter the data by the Sex column. To use it, run

docker run <image_name>:<tag> -s female
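To make the idea of a parameterized ETL step concrete, here is a minimal sketch of how a script like run.py could accept the -s flag and filter rows by the Sex column. It is not the code Rony generates: the sample data, the long option name --sex, and the helper names parse_args and filter_by_sex are all assumptions made for illustration.

```python
import argparse
import csv
import io

# Hypothetical sample data; the real run.py works on its own dataset.
SAMPLE_CSV = """Name,Sex,Age
Alice,female,29
Bob,male,41
Carol,female,35
"""


def parse_args(argv=None):
    """Parse command-line options; argv=None means 'use sys.argv'."""
    parser = argparse.ArgumentParser(description="Filter rows by the Sex column.")
    # -s comes from the README; the long name --sex is an assumption.
    parser.add_argument("-s", "--sex", default=None,
                        help="value to keep in the Sex column")
    return parser.parse_args(argv)


def filter_by_sex(rows, sex=None):
    """Keep only rows whose Sex column equals `sex`; keep all rows if sex is None."""
    if sex is None:
        return list(rows)
    return [row for row in rows if row["Sex"] == sex]


def main(argv=None):
    args = parse_args(argv)
    rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
    selected = filter_by_sex(rows, args.sex)
    for row in selected:
        print(row["Name"], row["Sex"], row["Age"])
    return selected


if __name__ == "__main__":
    main()
```

With a script shaped like this as the image's entry point, any arguments placed after the image name in docker run are handed to the script, which is how -s female would reach the filter.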

Implementation suggestions

When you start a new Rony project, you will find

  • an infrastructure folder with Terraform code that creates the following on AWS:

    • an S3 bucket
    • a Lambda function
    • a CloudWatch log group
    • an ECR repository
    • an AWS Glue Crawler
    • IAM roles and policies for Lambda and Glue
  • an etl folder with:

    • a Dockerfile and a run.py example of ETL code
    • a lambda_function.py with a "Hello World" example
  • a tests folder with unit tests for the Lambda function

  • a .github/workflows folder with a GitHub Actions CI/CD pipeline suggestion. This pipeline

    • Tests the Lambda function
    • Builds and runs the Docker image
    • Sets AWS credentials
    • Makes a Terraform plan (but does not actually deploy anything)
  • a dags folder with some Airflow example code.

You also have a scripts folder with a Bash script that builds a Lambda deployment package.

Feel free to adjust and adapt everything according to your needs.
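As an illustration of the "Hello World" Lambda function and its unit test mentioned above, here is a minimal sketch of what such a pair could look like. It is not the code Rony generates: the handler name lambda_handler, the response shape, and the test class name are assumptions, and both pieces are shown in one block only to keep the example self-contained.

```python
import json
import unittest


def lambda_handler(event, context):
    """Return a minimal HTTP-style response (hypothetical response shape)."""
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello World"}),
    }


class TestLambdaHandler(unittest.TestCase):
    """Unit test that calls the handler directly, without touching AWS."""

    def test_returns_hello_world(self):
        response = lambda_handler(event={}, context=None)
        self.assertEqual(response["statusCode"], 200)
        body = json.loads(response["body"])
        self.assertEqual(body["message"], "Hello World")
```

Because the handler is a plain Python function, a test like this can run in a CI pipeline before any infrastructure exists, for example with python -m unittest or pytest.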

Contributing

Have a look at our contributing guide.

Make a pull request with your implementation.

For suggestions, contact us: rony@a3data.com.br

Licence

Rony is open source and is released under the Apache 2.0 License.
