An opinionated Data Engineering framework

Rony - Data Engineering made simple

Developed with ❤️ by A3Data

What is Rony

Rony is an open source framework that helps Data Engineers set up more organized code and build, test, and deploy data pipelines faster.

Why Rony?

Rony is Hermione's best friend (or so...). That made it a perfect name for the second framework released by A3Data, this one focused on Data Engineering.

Over many years of helping companies build their data analytics projects and cloud infrastructure, we acquired a knowledge base that led to a collection of code snippets and automation procedures that speed things up when developing data structures and data pipelines.

Some choices we made

Rony builds on a few decisions that make sense for the majority of projects conducted by A3Data.

You are free to change these decisions as you wish (that's the whole point of the framework: flexibility).

Installing

Dependencies

  • Python (>=3.6)
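A quick way to confirm that the interpreter meets the requirement above is sketched below. This is only an illustration: the helper name `python_is_supported` and the `MIN_PYTHON` constant are hypothetical and not part of Rony.

```python
import sys

# Minimum version taken from the dependency list above.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    print("Rony needs Python %d.%d or newer" % MIN_PYTHON)
```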

Install

pip install -U rony

Enabling autocompletion (Linux users):

For bash:

echo 'eval "$(_RONY_COMPLETE=source_bash rony)"' >> ~/.bashrc

For Zsh:

echo 'eval "$(_RONY_COMPLETE=source_zsh rony)"' >> ~/.zshrc

How do I use Rony?

After installing Rony, you can check that the installation works by running:

rony info

and you should see a cute logo. Then:

  1. Create a new project:
rony new project_rony
  2. Rony creates a virtual environment for the project. Windows users can activate it with
<project_name>_env\Scripts\activate

Linux and macOS users can run

source <project_name>_env/bin/activate
  3. After activating the environment, install the libraries the project needs. A few suggestions are listed in the “requirements.txt” file:
pip install -r requirements.txt
  4. Rony also has some handy CLI commands to build and run Docker images locally. To build an image, run
cd etl
rony build <image_name>:<tag>

and then run it with

rony run <image_name>:<tag>

In this particular implementation, run.py contains simple ETL code that accepts a parameter to filter the data by the Sex column. To use it, run

docker run <image_name>:<tag> -s female
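To make the `-s` option more concrete, here is a minimal sketch of what such a filtering step could look like. It is not the run.py that Rony generates: the sample data, the function names (`filter_by_sex`, `build_parser`, `main`) and the use of `argparse` are all assumptions made for illustration.

```python
import argparse
import csv
import io

# Hypothetical sample data; the generated project ships its own dataset.
SAMPLE_CSV = """Name,Sex,Age
Alice,female,29
Bob,male,41
Carol,female,35
"""


def filter_by_sex(rows, sex=None):
    """Keep rows whose Sex column matches `sex` (case-insensitive).

    When `sex` is None, every row is returned unchanged.
    """
    if sex is None:
        return list(rows)
    wanted = sex.strip().lower()
    return [row for row in rows if row["Sex"].strip().lower() == wanted]


def build_parser():
    """Command-line interface with a single optional -s/--sex filter."""
    parser = argparse.ArgumentParser(description="Toy ETL step: filter rows by Sex.")
    parser.add_argument("-s", "--sex", default=None, help="keep only rows with this Sex value")
    return parser


def main(argv):
    args = build_parser().parse_args(argv)
    rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
    for row in filter_by_sex(rows, args.sex):
        print(row["Name"], row["Sex"], row["Age"])


# Equivalent to passing "-s female" on the command line.
main(["-s", "female"])
```

Arguments placed after the image name in `docker run` are passed on to the container's command, which is how `-s female` would reach a script like this one.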

Implementation suggestions

When you start a new rony project, you will find

  • an infrastructure folder with Terraform code that creates the following on AWS:

    • an S3 bucket
    • a Lambda function
    • a CloudWatch log group
    • an ECR repository
    • an AWS Glue Crawler
    • IAM roles and policies for Lambda and Glue
  • an etl folder with:

    • a Dockerfile and a run.py example of ETL code
    • a lambda_function.py with a "Hello World" example
  • a tests folder with unit testing on the Lambda function

  • a .github/workflow folder with a suggested GitHub Actions CI/CD pipeline. This pipeline

    • Tests the Lambda function
    • Builds and runs the Docker image
    • Sets AWS credentials
    • Makes a Terraform plan (but does not actually deploy anything)
  • a dags folder with some Airflow example code.

You also have a scripts folder with a bash file that builds a lambda deploy package.

Feel free to adjust and adapt everything according to your needs.
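As a rough picture of the "Hello World" Lambda function and its unit test mentioned above, consider the sketch below. It does not reproduce the generated lambda_function.py: the handler name `handler`, the response shape, and the test class are assumptions chosen for illustration.

```python
import json
import unittest


def handler(event, context):
    """Minimal Lambda-style entry point that returns a fixed greeting.

    `event` and `context` are the two arguments AWS Lambda passes to a
    Python handler; this sketch ignores both.
    """
    return {"statusCode": 200, "body": json.dumps("Hello World")}


class TestHandler(unittest.TestCase):
    """Hypothetical unit test, similar in spirit to what a tests folder could hold."""

    def test_returns_hello_world(self):
        response = handler(event={}, context=None)
        self.assertEqual(response["statusCode"], 200)
        self.assertEqual(json.loads(response["body"]), "Hello World")
```

Because the handler is a plain function, it can be tested locally without any AWS credentials, which makes it a natural candidate for an early CI step.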

Contributing

Make a pull request with your implementation.

For suggestions, contact us: rony@a3data.com.br

Licence

Rony is open source and is released under the Apache 2.0 License.
