Global utilities for Humankind data science
Global Utilities
last modified 17 November, 2022 by colleen_treado@humankind.co and francisco_pena@humankind.co
The utilities-hki repository contains the common utilities required by multiple other humankind-datascience repositories. Unlike the old utilities repo, this package contains no encrypted files, and credentials are now passed into the utility functions as input arguments.
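The credentials-as-arguments pattern described above can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the utilities-hki API: the DbCredentials container, the build_db_url function, and the PostgreSQL-style URL format are assumptions chosen only to show a caller supplying credentials explicitly rather than the package reading them from an encrypted file.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class DbCredentials:
    """Hypothetical credentials container; not part of utilities-hki."""
    user: str
    password: str
    host: str
    port: int = 5432  # assumed default; the real databases may differ


def build_db_url(credentials: DbCredentials, database: str) -> str:
    """Build a connection URL from credentials supplied by the caller.

    Nothing is read from encrypted files or module-level state: the
    caller owns the credentials and passes them in explicitly.
    """
    # Percent-encode user and password so characters such as '@' or
    # spaces cannot break the URL structure.
    user = quote(credentials.user, safe="")
    password = quote(credentials.password, safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{credentials.host}:{credentials.port}/{database}"
    )


if __name__ == "__main__":
    creds = DbCredentials(user="analyst", password="p@ss word", host="localhost")
    print(build_db_url(creds, "example_db"))
```

Keeping credentials out of the package in this way means the same utility function can be pointed at different environments simply by passing in a different set of credentials.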
Installation and setup
For first-time setup, clone the repository into a fresh work area:
# cloning via SSH is preferred but requires an SSH key registered with your GitHub account
git clone git@github.com:humankind-datascience/utilities-hki.git
The code requires a number of Python packages to run, which should be installed inside a dedicated virtual environment. The preferred virtual environment tool is virtualenvwrapper.
To install the required packages in a new virtual environment, run the following command from the top-level directory of the git repository:
pip install -r requirements.txt
If code changes require additional packages, add them to the requirements-top-level.txt file. Then run the commands below to install (and upgrade) the top-level dependencies and regenerate the requirements.txt file for future use.
pip install -r requirements-top-level.txt --upgrade
pip freeze -r requirements-top-level.txt > requirements.txt
Additionally, the AWS Command Line Interface (AWS CLI) is required for use of the botocore library, which is used in database utility functions to read from and write to the AWS RDS databases. See the [AWS CLI documentation](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) for installation instructions.
Now you can run the top-level scripts:
python <utilities-script.py>
The only top-level scripts in the utilities-hki repository are test scripts, designed to exercise the utility functions during package development.
Code updates
When making changes to the code, follow GitHub flow: create a new branch, make changes on that branch, commit and push to it frequently, and then open a pull request to merge the changes into master after review and approval.
Current status
This repository is the code repository for the eventual utilities-hki pip package, which, together with the new credentials repository, will replace the current utilities submodule in the other repositories used for data science at Humankind. The utility functions have been updated to remove dependencies on encrypted credentials and instead receive the credentials as input arguments. The next step is to package the project and update the repositories that use the old utilities submodule so that they import the new utilities-hki pip package and call the updated utility functions, passing in the credentials from the new credentials submodule.
Utility code overview
The utilities-hki repository contains common utility functions used across repositories in the Humankind Data Science code base. The utility functions are grouped by type into separate modules, as outlined below.
- analy_utils: analysis utility functions;
- db_utils: database utility functions;
- email_utils: email utility functions;
- fb_utils: Facebook Ads utility functions.
Hashes for utilities_hki-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2b30314e8e25b5c5a61852a78880047b4ec0b22c60d198d0ed7d43b1e12b21cd
MD5 | fa36a5543d9274248d9988479696da1e
BLAKE2b-256 | 4d64aad400cfd3a1d66dce0df8a894bba5b9263630b14b5765638576b4b54d6e