Contains functions for use in Ecotope Datapipelines


DataPipelinePackage

To Install the Package

To install from PyPI for use in other projects:
$ pip install ecopipeline
To install locally in editable mode, navigate to the DataPipelinePackage directory and run:
$ pip install -e .

File Structure

.
├── src
│   ├── docs
│   │   ├── 
│   └── ecopipeline
│       ├── extract.py               # functionality for extracting data from a file system
│       ├── transform.py             # functionality for cleaning data and calculating derived values
│       ├── load.py                  # functionality for loading a pandas dataframe into a MySQL database table
│       ├── unit_convert.py
│       ├── config.py                # file containing all file paths
│       ├── bayview.py               # Bayview site-specific functionality
│       └── lbnl.py                  # LBNL site-specific functionality
├── testing
│   ├── Bayview
│   │   ├── Bayview_input
│   │   ├── extract.py
│   │   │   └── extract_test.py      # testing for extract functionality
│   │   ├── transform.py
│   │   │   ├── pickles              # pickles used for Bayview unit testing
│   │   │   └── transform_test.py    # testing for transform functionality
│   │   └── load.py
│   │       └── load_test.py         # testing for load functionality
│   └── LBNL
│       ├── extract.py
│       │   └── extract_test.py
│       ├── transform.py
│       │   ├── LBNL-input           # LBNL input dataframes used as testing input
│       │   ├── LBNL-output          # LBNL output dataframes used for cross-referencing our output against the expected output
│       │   ├── pickles              # pickles used for LBNL unit testing
│       │   └── transform_test.py
│       └── load.py
│           └── load_test.py
├── config.ini                       # file containing all configuration parameters
└── README.md

Purpose

This project was developed with the help of Ecotope, Inc. It contains separate, modular functions that, when combined, extract, transform, and load data from incoming sensors. The main goal was to rewrite the existing R pipeline code in Python to make the codebase more readable. Scalability was also a design consideration, since this codebase will be used to create pipelines for other sites in the future.

Architecture

extract.py

- loading data from a local file system
- extracting NOAA weather data from an FTP server
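
As a rough illustration of the first task above, the sketch below reads every CSV file in a directory using only the Python standard library. The function name `read_sensor_files`, the file names, and the `time_pt` and `temp_f` columns are all hypothetical; this is not ecopipeline's actual extract API, which works with pandas dataframes.

```python
import csv
import tempfile
from pathlib import Path


def read_sensor_files(data_dir):
    """Read every .csv file in data_dir and return all rows as dicts, in file-name order."""
    rows = []
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


# Build a throwaway directory with two made-up sensor files, then read them back.
with tempfile.TemporaryDirectory() as data_dir:
    Path(data_dir, "day1.csv").write_text("time_pt,temp_f\n2023-01-01 00:00,120.5\n")
    Path(data_dir, "day2.csv").write_text("time_pt,temp_f\n2023-01-02 00:00,121.0\n")
    sensor_rows = read_sensor_files(data_dir)
```

Sorting the paths keeps the row order deterministic, which matters when later steps assume the readings are in time order.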

transform.py

- cleaning the data
  - rounding
  - removing outliers
  - renaming columns
  - filling missing values
- calculating derived COP (coefficient of performance) values
- aggregating the data
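
The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the sample values, the outlier bounds, the forward-fill strategy, and all function names are hypothetical, and COP is taken in its usual sense of heat delivered divided by power consumed. Column renaming is omitted because the sketch uses plain lists rather than dataframes.

```python
from datetime import datetime
from statistics import mean

# Hypothetical minute-level samples: (timestamp, heat output in kW, power input in kW).
# None marks a missing reading; 9999.0 stands in for a sensor glitch.
samples = [
    (datetime(2023, 1, 1, 0, 0), 10.04, 4.0),
    (datetime(2023, 1, 1, 0, 1), None, 4.0),
    (datetime(2023, 1, 1, 0, 2), 9999.0, 4.0),
    (datetime(2023, 1, 1, 1, 0), 12.02, 4.0),
]


def remove_outliers(values, lower, upper):
    """Replace readings outside [lower, upper] with None."""
    return [v if v is not None and lower <= v <= upper else None for v in values]


def forward_fill(values):
    """Fill each missing reading with the most recent valid one (assumed strategy)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def hourly_mean(timestamps, values):
    """Average the non-missing values within each clock hour."""
    buckets = {}
    for ts, v in zip(timestamps, values):
        if v is not None:
            hour = ts.replace(minute=0, second=0, microsecond=0)
            buckets.setdefault(hour, []).append(v)
    return {hour: round(mean(vs), 2) for hour, vs in buckets.items()}


timestamps = [ts for ts, _, _ in samples]
heat_out = [heat for _, heat, _ in samples]
power_in = [power for _, _, power in samples]

# Clean: drop outliers, fill the gaps, then round to one decimal place.
cleaned_heat = forward_fill(remove_outliers(heat_out, lower=0.0, upper=100.0))
cleaned_heat = [None if v is None else round(v, 1) for v in cleaned_heat]

# Derive COP per sample, skipping rows with missing heat or zero power.
cop = [heat / power if heat is not None and power else None
       for heat, power in zip(cleaned_heat, power_in)]

# Aggregate the minute-level COP values to hourly means.
hourly_cop = hourly_mean(timestamps, cop)
```

Removing outliers before filling is deliberate in this sketch: it prevents a glitched value from being propagated into neighbouring gaps.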

load.py

- establishing a connection to the database
- loading a pandas dataframe into a table in the database
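
The connect-then-load pattern can be illustrated with the standard library alone. Note the substitution: the sketch below uses an in-memory SQLite database in place of MySQL so that it runs without a server, and it loads a list of tuples rather than a pandas dataframe. The function `load_rows`, the table name `site_hour`, and the `time_pt` and `cop` columns are hypothetical.

```python
import sqlite3

# Hypothetical hourly rows: (timestamp, COP).
hourly_rows = [
    ("2023-01-01 00:00:00", 2.5),
    ("2023-01-01 01:00:00", 3.0),
]


def load_rows(connection, table_name, rows):
    """Create the table if needed, upsert the rows, and return the table's row count.

    table_name is assumed to come from trusted configuration, because SQL
    identifiers cannot be passed as bound parameters.
    """
    connection.execute(
        f"CREATE TABLE IF NOT EXISTS {table_name} (time_pt TEXT PRIMARY KEY, cop REAL)"
    )
    # INSERT OR REPLACE is SQLite syntax; MySQL spells upserts differently.
    connection.executemany(
        f"INSERT OR REPLACE INTO {table_name} (time_pt, cop) VALUES (?, ?)", rows
    )
    connection.commit()
    return connection.execute(f"SELECT COUNT(*) FROM {table_name}").fetchone()[0]


connection = sqlite3.connect(":memory:")
row_count = load_rows(connection, "site_hour", hourly_rows)
stored = connection.execute(
    "SELECT time_pt, cop FROM site_hour ORDER BY time_pt"
).fetchall()
connection.close()
```

Keying the table on the timestamp means re-running the load for the same period overwrites rows instead of duplicating them.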

config.ini

- database
  - user: username for the host database connection
  - password: password for the host database connection
  - host: name of the host
  - database: name of the database
- minute
  - table_name: name of the table to be created in the MySQL database containing minute-by-minute data
- hour
  - table_name: name of the table to be created in the MySQL database containing hour-by-hour data
- day
  - table_name: name of the table to be created in the MySQL database containing day-by-day data
- input
  - directory: directory of the folder containing the input files listed below
  - site_info: name of the site information csv
  - 410a_info: name of the 410a information csv
  - superheat_info: name of the superheat information csv
- output
  - directory: directory of the folder where any pipeline output should be written
- data
  - directory: directory of the folder from which extract loads the raw sensor data
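
To make the layout concrete, the sketch below writes out a config.ini with the sections and keys listed above and reads it back with Python's standard `configparser` module. Every value (credentials, table names, paths, file names) is a made-up placeholder, and the helper `read_config` is hypothetical; how ecopipeline itself parses the file is not shown here.

```python
import configparser

# Placeholder values only; substitute your own credentials, names, and paths.
SAMPLE_CONFIG = """
[database]
user = pipeline_user
password = change_me
host = localhost
database = site_db

[minute]
table_name = site_minute

[hour]
table_name = site_hour

[day]
table_name = site_day

[input]
directory = input/
site_info = site_info.csv
410a_info = 410a_pt.csv
superheat_info = superheat.csv

[output]
directory = output/

[data]
directory = data/
"""


def read_config(text):
    """Parse config.ini-style text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


config = read_config(SAMPLE_CONFIG)
db_settings = dict(config["database"])
minute_table = config["minute"]["table_name"]
site_info_path = config["input"]["directory"] + config["input"]["site_info"]
```

To read a file on disk instead of a string, `parser.read("config.ini")` can be used in place of `read_string`.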

Unit Testing

To run the unit tests, run the following command in a terminal from the corresponding site directory under testing (for example, testing/Bayview):

$ python -m pytest
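
For orientation, a pytest test file is just a module of functions whose names start with `test_`, and pytest collects files named `test_*.py` or `*_test.py` by default. The sketch below shows the shape of such a file; `round_readings` is a hypothetical helper defined inline for illustration and is not part of ecopipeline.

```python
def round_readings(values, decimals=2):
    """Hypothetical helper: round each numeric reading, leaving None untouched."""
    return [None if v is None else round(v, decimals) for v in values]


def test_round_readings_keeps_missing_values():
    assert round_readings([1.234, None, 2.0], decimals=1) == [1.2, None, 2.0]


def test_round_readings_defaults_to_two_decimals():
    assert round_readings([3.14159]) == [3.14]
```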


Download files

Source Distribution

ecopipeline-0.0.64.tar.gz (33.3 kB)

Uploaded Mar 8, 2024 Source

Built Distribution

ecopipeline-0.0.64-py3-none-any.whl (34.7 kB)

Uploaded Mar 8, 2024 Python 3

Hashes for ecopipeline-0.0.64.tar.gz

Algorithm	Hash digest
SHA256	`ae7204ecb04f99f74eea61ac49d72aa755660cd4036429482a939cc8bda596e0`
MD5	`30c987cd520fcc30aab772e0e4a91a8b`
BLAKE2b-256	`d7bbb364904a148b437ac4442ff3f4e6d8285738d2b1f19efd4801528a0d38f1`

Hashes for ecopipeline-0.0.64-py3-none-any.whl

Algorithm	Hash digest
SHA256	`e4f7178363de86d2951a3290a28987528c86e277a1e722b84e6355e8790faacc`
MD5	`1a2265bbf173a09b322c883a79c630df`
BLAKE2b-256	`a7c8105b01f3de621f71216656f1faafbd9ea30c15dc756158a503c9dc85d45f`