Data engineering & Data science Framework

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

py-analytics

This repo contains a Python framework with data engineering, data science and third-party integration tool capabilities.

Installation

The package can be installed via pip:

pip install py-analytics

tdd_utility - module

class load_csv(csv, schema)

Contains methods to extract the equivalent JSON from a CSV file with nested and repeated record structures.

Args

csv
- path and file name of the CSV
- mandatory
- nested field names are separated by a dot "." (e.g. item.id, item.quantity)

order	item.id	item.quantity	delivery.address	delivery.postcode
A0001	item1	5	address1	e13bp
	item2	1
	item3	3
A0002	item4	4	address4	e13bp
	item1	4
	item3	2
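The layout above can be read as follows: a row with a value in the order column starts a new record, and rows with a blank order column add further items to the previous record. The sketch below illustrates that grouping with the standard library only. It is not the package's implementation; the function name `rows_to_records`, the tab delimiter and the choice of `item` as the repeated field are assumptions made for the example.

```python
import csv
import io

# Sample data from the table above (tab-separated, header spelled
# "delivery.postcode"). A blank "order" cell continues the previous order.
SAMPLE = (
    "order\titem.id\titem.quantity\tdelivery.address\tdelivery.postcode\n"
    "A0001\titem1\t5\taddress1\te13bp\n"
    "\titem2\t1\t\t\n"
    "\titem3\t3\t\t\n"
    "A0002\titem4\t4\taddress4\te13bp\n"
    "\titem1\t4\t\t\n"
    "\titem3\t2\t\t\n"
)


def rows_to_records(text, repeated=("item",)):
    """Group flat rows into nested records (hypothetical helper)."""
    records = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        if row["order"]:
            # A non-empty key column starts a new top-level record.
            records.append({"order": row["order"]})
        current = records[-1]
        nested = {}
        for column, value in row.items():
            if "." not in column or not value:
                continue  # skip the key column and empty cells
            parent, child = column.split(".", 1)
            nested.setdefault(parent, {})[child] = value
        for parent, fields in nested.items():
            if parent in repeated:
                current.setdefault(parent, []).append(fields)  # repeated record
            else:
                current[parent] = fields  # single nested record
    return records


records = rows_to_records(SAMPLE)
```

With the sample data, `records` holds two orders, each with three entries under `item` and one `delivery` record.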

schema
- path and file name of the table schema
- required if the CSV contains nested and repeated records
- JSON format, e.g.

[  
    {
      "mode": "NULLABLE", 
      "name": "order", 
      "type": "STRING"
    },  
    {
      "fields": [
        {
          "mode": "NULLABLE", 
          "name": "id", 
          "type": "STRING"
        },
        {
          "mode": "NULLABLE", 
          "name": "quantity", 
          "type": "STRING"
        }
      ], 
      "mode": "REPEATED", 
      "name": "item", 
      "type": "RECORD"
    }, 
    {
      "fields": [
        {
          "mode": "NULLABLE", 
          "name": "address", 
          "type": "STRING"
        }, 
        {
          "mode": "NULLABLE", 
          "name": "postcode", 
          "type": "STRING"
        }
      ], 
      "mode": "NULLABLE", 
      "name": "delivery", 
      "type": "RECORD"
    }
  ]
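To show how a schema of this shape relates to the CSV header, the sketch below walks the schema and derives the dotted column names and the set of repeated records. It is only an illustration under the assumption that RECORD fields nest one level deep, as in the example; `describe_schema` is a hypothetical helper, not part of the package.

```python
import json

# The example schema from above, written compactly.
SCHEMA = json.loads("""
[
  {"mode": "NULLABLE", "name": "order", "type": "STRING"},
  {"mode": "REPEATED", "name": "item", "type": "RECORD", "fields": [
    {"mode": "NULLABLE", "name": "id", "type": "STRING"},
    {"mode": "NULLABLE", "name": "quantity", "type": "STRING"}
  ]},
  {"mode": "NULLABLE", "name": "delivery", "type": "RECORD", "fields": [
    {"mode": "NULLABLE", "name": "address", "type": "STRING"},
    {"mode": "NULLABLE", "name": "postcode", "type": "STRING"}
  ]}
]
""")


def describe_schema(schema):
    """Return (dotted column names, names of REPEATED records)."""
    columns, repeated = [], set()
    for field in schema:
        if field["type"] == "RECORD":
            if field["mode"] == "REPEATED":
                repeated.add(field["name"])
            # Nested fields map to "parent.child" CSV columns.
            columns += [f'{field["name"]}.{sub["name"]}' for sub in field["fields"]]
        else:
            columns.append(field["name"])
    return columns, repeated


columns, repeated = describe_schema(SCHEMA)
```

Here `columns` matches the CSV header and `repeated` contains only `item`, which is why several item rows can belong to one order.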

Methods

to_json()
- if successful, returns the JSON extracted from the CSV
to_new_line_delimiter_file(output_file_name)
- returns 0 if successful
- creates the newline-delimited file "output_file_name"
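A newline-delimited file holds one JSON object per line. The sketch below shows that format using only the standard library; it assumes the records are plain dictionaries, and `write_ndjson` is a hypothetical stand-in rather than the package's own method.

```python
import json
import os
import tempfile


def write_ndjson(records, output_file_name):
    """Write one JSON object per line; return 0 on success (assumed convention)."""
    with open(output_file_name, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return 0


# Demo with two small records written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "orders.json")
status = write_ndjson(
    [
        {"order": "A0001", "item": [{"id": "item1", "quantity": "5"}]},
        {"order": "A0002", "item": [{"id": "item4", "quantity": "4"}]},
    ],
    path,
)

with open(path, encoding="utf-8") as handle:
    lines = handle.read().splitlines()
```

Each entry in `lines` can be parsed on its own with `json.loads`, which lets such a file be loaded line by line.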

Usage

>>> from data_prep import tdd_utility as tu
>>> mockdata_csv = tu.load_csv(
...     csv="path/to/filename/file.csv",
...     schema="path/to/schema/schema.json")  # initialise the object
>>> mockdata_json = mockdata_csv.to_json()  # returns the equivalent JSON
>>> status = mockdata_csv.to_new_line_delimiter_file("path/output_file_name.json")  # writes the file; returns 0 on success

Project details

Release history

This version: 0.0.1 (Nov 28, 2019)

Download files

Source Distribution

py-data-framework-0.0.1.tar.gz (2.5 kB)

Uploaded Nov 28, 2019 Source

Built Distribution

py_data_framework-0.0.1-py3-none-any.whl (3.7 kB)

Uploaded Nov 28, 2019 Python 3

File details

Details for the file py-data-framework-0.0.1.tar.gz.

File metadata

Download URL: py-data-framework-0.0.1.tar.gz
Upload date: Nov 28, 2019
Size: 2.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.1 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.7.3

File hashes

Hashes for py-data-framework-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`746712c63127986b7b3f3792c7e880e729a17e59ec38d0f876343b873bf2ea05`
MD5	`5f2bd6808083eae0f2c1c98ba7596ccc`
BLAKE2b-256	`fff282a3ba390f1bf138f9e14c6570470bdb683dd732611fb41aca9c4707435a`
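A downloaded archive can be checked against the SHA256 digest listed above. The sketch below uses `hashlib` from the standard library; the helper names `sha256_of` and `matches_expected` are illustrative, and the local path passed to them is assumed to be wherever the file was saved.

```python
import hashlib

# SHA256 digest for py-data-framework-0.0.1.tar.gz, copied from the table above.
EXPECTED_SHA256 = "746712c63127986b7b3f3792c7e880e729a17e59ec38d0f876343b873bf2ea05"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("py-data-framework-0.0.1.tar.gz")` returns True only if the local copy is byte-for-byte the file that produced the listed digest.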

File details

Details for the file py_data_framework-0.0.1-py3-none-any.whl.

File metadata

Download URL: py_data_framework-0.0.1-py3-none-any.whl
Upload date: Nov 28, 2019
Size: 3.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.1 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.7.3

File hashes

Hashes for py_data_framework-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`28ff1568a3e9a432e5328aeb54ced028b30c702afda6d66f596ed090d8ad3a44`
MD5	`5295a8746ea082cf0918b903e2bd043c`
BLAKE2b-256	`4f08de97105a47a00ef77300c903c648dca0861fe9416cf27a6ae4616d1e1cee`
