
An opinionated Data Engineering framework


Rony - Data Engineering made simple



Developed with ❤️ by A3Data

What is Rony

Rony is an open source framework that helps Data Engineers set up more organized code and build, test, and deploy data pipelines faster.

Why Rony?

Rony is Hermione's best friend (or so...). This was a perfect choice for naming the second framework released by A3Data, this one focusing on Data Engineering.

Over many years of helping companies build their data analytics projects and cloud infrastructure, we have acquired a knowledge base that led to a collection of code snippets and automation procedures that speed things up when developing data structures and data pipelines.

Some choices we made

Rony builds on a few decisions that make sense for the majority of projects conducted by A3Data.

You are free to change these decisions as you wish (that's the whole point of the framework: flexibility).

Installing

Dependencies

  • Python (>=3.6)

Install

pip install -U rony

Enabling autocompletion (Linux users):

For bash:

echo 'eval "$(_RONY_COMPLETE=source_bash rony)"' >> ~/.bashrc

For Zsh:

echo 'eval "$(_RONY_COMPLETE=source_zsh rony)"' >> ~/.zshrc

How do I use Rony?

After installing Rony, you can check that the installation works by running:

rony info

You should see a cute logo. Then:

  1. Create a new project:
rony new project_rony
  2. Rony already creates a virtual environment for the project. Windows users can activate it with
<project_name>_env\Scripts\activate

Linux and macOS users can run

source <project_name>_env/bin/activate
  3. After activating, you should install some libraries. There are a few suggestions in the “requirements.txt” file:
pip install -r requirements.txt
  4. Rony also has some handy CLI commands to build and run Docker images locally. You can do
cd etl
rony build <image_name>:<tag>

to build an image and run it with

rony run <image_name>:<tag>

In this particular implementation, run.py contains simple ETL code that accepts a parameter to filter the data based on the Sex column. To use it, you can run

docker run <image_name>:<tag> -s female
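
The generated run.py is not reproduced here, so the following is only a minimal sketch of how a script might accept a `-s` parameter and filter rows on a Sex column. The long option name `--sex`, the sample rows, and the function names are assumptions made for illustration and are not taken from Rony's actual code.

```python
import argparse
import csv
import io

# Hypothetical sample data; a real project would read its own dataset.
SAMPLE_CSV = """Name,Sex,Age
Alice,female,29
Bob,male,35
Carol,female,41
"""


def parse_args(argv=None):
    """Parse the command line; -s mirrors the flag used in the example above."""
    parser = argparse.ArgumentParser(description="Toy ETL filter (illustrative only)")
    # "--sex" is an assumed long name for the -s flag.
    parser.add_argument("-s", "--sex", default=None,
                        help="keep only rows whose Sex column equals this value")
    return parser.parse_args(argv)


def filter_by_sex(rows, sex):
    """Return rows whose 'Sex' value equals `sex`; return all rows if sex is None."""
    if sex is None:
        return list(rows)
    return [row for row in rows if row["Sex"] == sex]


def main(argv=None):
    args = parse_args(argv)
    rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
    for row in filter_by_sex(rows, args.sex):
        print(row["Name"], row["Sex"], row["Age"])


if __name__ == "__main__":
    main()
```

With this sketch, passing `-s female` keeps only the rows whose Sex value is exactly `female`, and omitting the flag keeps every row.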

Implementation suggestions

When you start a new rony project, you will find

  • an infrastructure folder with Terraform code that creates the following on AWS:

    • an S3 bucket
    • a Lambda function
    • a CloudWatch log group
    • an ECR repository
    • an AWS Glue Crawler
    • IAM roles and policies for Lambda and Glue
  • an etl folder with:

    • a Dockerfile and a run.py example of ETL code
    • a lambda_function.py with a "Hello World" example
  • a tests folder with unit tests for the Lambda function

  • a .github/workflows folder with a suggested GitHub Actions CI/CD pipeline. This pipeline

    • Tests the Lambda function
    • Builds and runs the Docker image
    • Sets AWS credentials
    • Makes a Terraform plan (but does not actually deploy anything)
  • a dags folder with some Airflow example code.

You also have a scripts folder with a bash file that builds a Lambda deployment package.

Feel free to adjust and adapt everything according to your needs.
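
To illustrate how the "Hello World" Lambda function and its unit test described above might fit together, here is a minimal sketch. The handler name `lambda_handler`, the response shape, and the test class are assumptions for illustration; the files Rony generates may look different.

```python
import json
import unittest


def lambda_handler(event, context):
    """Minimal AWS Lambda-style handler that returns a fixed greeting."""
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello World"}),
    }


class TestLambdaHandler(unittest.TestCase):
    """Hypothetical unit test of the kind that could live in the tests folder."""

    def test_returns_hello_world(self):
        # The handler ignores its inputs, so an empty event and no context suffice.
        response = lambda_handler({}, None)
        self.assertEqual(response["statusCode"], 200)
        self.assertEqual(json.loads(response["body"])["message"], "Hello World")
```

If the sketch is saved as a module, the test can be run with `python -m unittest <module_name>`.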

Contributing

Have a look at our contributing guide.

Make a pull request with your implementation.

For suggestions, contact us: rony@a3data.com.br

Licence

Rony is open source and released under the Apache 2.0 License.

Download files


Source Distribution

rony-0.2.0.tar.gz (22.3 kB)

Uploaded Source

Built Distribution

rony-0.2.0-py3-none-any.whl (26.2 kB)

Uploaded Python 3

File details

Details for the file rony-0.2.0.tar.gz.

File metadata

  • Download URL: rony-0.2.0.tar.gz
  • Upload date:
  • Size: 22.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.5

File hashes

Hashes for rony-0.2.0.tar.gz
Algorithm Hash digest
SHA256 02da48d3fce9939bdfb7560a8069ab5369801918d14a7c45da191fe5bd9888c3
MD5 fc573d895037f7b679181dc4625c9427
BLAKE2b-256 154e6c0ce5bb969f8c6d5b2ddd5a7c3b5a6abd1e32ff46a96ce7684e4d5b9423


File details

Details for the file rony-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: rony-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 26.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.5

File hashes

Hashes for rony-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2b279c4dcab8b650812f5ef01c1f9f304c88091938d5fd35f75fb12e41108d3c
MD5 a856a82859dcc39f7a78cf195131d37b
BLAKE2b-256 e2fa3b4897bfd11693f7f82030fd797a7abfc4ba8da8a6219f6ffff8073bb13a
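
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the function names and the idea of passing a local path are assumptions for illustration.

```python
import hashlib

# SHA256 digests as listed above for the rony 0.2.0 files.
EXPECTED_SHA256 = {
    "rony-0.2.0.tar.gz":
        "02da48d3fce9939bdfb7560a8069ab5369801918d14a7c45da191fe5bd9888c3",
    "rony-0.2.0-py3-none-any.whl":
        "2b279c4dcab8b650812f5ef01c1f9f304c88091938d5fd35f75fb12e41108d3c",
}


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, filename):
    """Return True if the file at `path` has the digest listed for `filename`."""
    return sha256_of_file(path) == EXPECTED_SHA256[filename]
```

For example, `matches_expected("downloads/rony-0.2.0.tar.gz", "rony-0.2.0.tar.gz")` would return True only if the local copy (at a path of your choosing) has the listed digest.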

