The ingestion game CLI

The Ingestion Game

The Ingestion Game consists of:

  1. A data file with a varying number of entries. It is shaped like a CSV in which each row has an arbitrary number of key=value pairs separated by commas.
    key1=value1,key2=value2
    key3=value3
    key4=value4,key5=value5,key6=value6
    
  2. Not all keys are always important! Each time we play the game, the keys to process can be different. The rest are just considered noise. We want to ingest all the important keys.
  3. All important keys are always going to be present in the data rows of the input file.
  4. Each key supports a specific data type. We should only process rows where the values match the expected datatype of each key.
  5. Each row represents an Entity of a given type. The type is given as type=T in the data. type is always considered an important key and we can assume it is a str.
  6. Entities have hierarchies defined in the game's rules. For example, given the rule A -> B, we cannot process any Entity of type B until we have processed all Entities of type A.
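
As an illustration of points 1 to 4, the following is a minimal sketch of how a row could be parsed and checked against the expected types. The names `CASTERS`, `parse_row` and `extract` are hypothetical and are not taken from this project; the sketch also assumes only `int` and `str` types are needed.

```python
# Hypothetical helpers for illustration only; not the project's actual code.
CASTERS = {"int": int, "str": str}


def parse_row(line):
    """Split 'k1=v1,k2=v2' into a dict, skipping items without '='."""
    pairs = (item.split("=", 1) for item in line.strip().split(","))
    return {p[0]: p[1] for p in pairs if len(p) == 2}


def extract(row, important_keys):
    """Return the typed important values, or None if the row must be skipped."""
    result = {}
    for key, type_name in important_keys.items():
        try:
            result[key] = CASTERS[type_name](row[key])
        except (KeyError, ValueError):
            # Missing key, unknown type name, or value of the wrong type.
            return None
    return result


keys = {"id": "int", "name": "str", "food": "str", "type": "str"}
good = extract(parse_row("id=1,name=levy,food=veggies,type=A,noise=x"), keys)
bad = extract(parse_row("id=abc,name=lima,food=pizza,type=A"), keys)
```

Here `good` keeps only the important keys with `id` converted to an integer, while `bad` is `None` because `abc` is not a valid `int`.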

Your task is to win the Ingestion Game with the following rules:

  • The important keys are id:int, name:str, food:str and type:str.
  • Entities follow the hierarchy A -> B -> C.
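
One possible way to honour a linear hierarchy such as A -> B -> C is to rank each type by its position and sort the entities by that rank. This is only a sketch: `order_by_hierarchy` is a hypothetical name, and dropping entities whose type is not in the hierarchy is an assumption, not something the rules state.

```python
def order_by_hierarchy(entities, hierarchy):
    """Order entities so that earlier hierarchy types always come first."""
    rank = {type_name: i for i, type_name in enumerate(hierarchy)}
    # Assumption: entities with a type outside the hierarchy are discarded.
    known = [e for e in entities if e["type"] in rank]
    # sorted() is stable, so rows of the same type keep their file order.
    return sorted(known, key=lambda e: rank[e["type"]])


entities = [
    {"id": 3, "type": "B"},
    {"id": 1, "type": "A"},
    {"id": 4, "type": "C"},
    {"id": 2, "type": "A"},
]
ordered = order_by_hierarchy(entities, ["A", "B", "C"])
ordered_ids = [e["id"] for e in ordered]
```

With the sample data, the two A entities come first in their original order, followed by B and then C.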

The solution must be coded in Python and you can use any public libraries. The solution must be a CLI that outputs the processed data to stdout in CSV format, with the important keys as columns, e.g.:

id,name,food,type
1,levy,veggies,A
2,lima,pizza,A
3,john,fish,B
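
Output in this shape can be produced with the standard library's `csv` module. The sketch below assumes the processed entities are dictionaries; `write_csv` is a hypothetical helper, not the project's actual function.

```python
import csv
import sys


def write_csv(rows, fieldnames, stream=sys.stdout):
    """Write the rows as CSV, using the important keys as the header."""
    writer = csv.DictWriter(
        stream,
        fieldnames=fieldnames,
        extrasaction="ignore",  # silently drop any non-important keys
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {"id": 1, "name": "levy", "food": "veggies", "type": "A"},
    {"id": 3, "name": "john", "food": "fish", "type": "B", "noise": "x"},
]
write_csv(rows, ["id", "name", "food", "type"])
```

Passing the stream as a parameter keeps the function easy to test with an in-memory buffer instead of stdout.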

We are going to provide a data file with the keys and hierarchies mentioned above, but your CLI should accept any important keys and hierarchies as inputs. You can see an example of the input data in the provided input.txt.

🤓 What we value in the solution

  • Good software design
  • Proper documentation
  • Compliance with Python standards and modern usage (e.g. PEP 8)
  • Proper use of data structures
  • Ergonomics of the command line interface
  • Setup/Launch instructions if required

Solution

Clone the project from GitHub:

git clone git@github.com:collate-hiring/python-test-txemac.git

Set up environment and install requirements:

python3 -m venv env
source env/bin/activate
pip install -r src/requirements.txt

Run

You can see the help:

python src/main.py --help

To run the CLI, pass three arguments:

  • IMPORTANT_KEYS: [required] key:type pairs separated by commas, without spaces. Example: id:int,name:str,food:str,type:str
  • HIERARCHY: [required] entity types in processing order, separated by commas, without spaces. Example: A,B,C
  • PATH: [required] path of the input file. Example: /files/input.txt

Example:

python src/main.py id:int,name:str,type:str A,B,C /files/input.txt
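
The IMPORTANT_KEYS and HIERARCHY strings could be turned into Python structures along these lines. This is a sketch under assumptions: `parse_important_keys` and `parse_hierarchy` are hypothetical names and do not necessarily match what src/main.py does.

```python
def parse_important_keys(spec):
    """Turn 'id:int,name:str' into {'id': 'int', 'name': 'str'}."""
    keys = {}
    for item in spec.split(","):
        name, sep, type_name = item.partition(":")
        if not sep or not name or not type_name:
            raise ValueError(f"expected name:type, got {item!r}")
        keys[name] = type_name
    return keys


def parse_hierarchy(spec):
    """Turn 'A,B,C' into ['A', 'B', 'C'], preserving the given order."""
    return spec.split(",")


important_keys = parse_important_keys("id:int,name:str,type:str")
hierarchy = parse_hierarchy("A,B,C")
```

Rejecting malformed items early gives the user a clear error message instead of a failure later during ingestion.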

Tests

Install the test requirements and run the test suite:

pip install -r tests/requirements.txt
pytest -vvv

Flake8

flake8
