A light set of enabling convenience functions based on Cloudframe's proprietary data science enablers.
The Cloudframe Data Scientist Simple Enabler
At Cloudframe we employ teams of Data Scientists, Data Engineers, and Software Developers. Check us out at http://cloudframe.io
If you're interested in joining our team as a Data Scientist, see the Bid Prediction Repo. There you'll find a fun problem and more info about our evergreen positions for Data Scientists, Data Engineers, and Software Developers.
This package contains some convenience functions meant to help a Data Scientist get data into a format that is useful for training models. It is a light version of some of the proprietary enablers that we use to deliver data-informed products to our clients.
Installation
pip install datascientist
Dependencies
In addition to the following packages, datascientist requires that you have the credentials and client software needed for the operation you want to perform. For example, connecting to an Oracle database requires that you install and configure Oracle Instant Client or an equivalent client library. This package does not do that for you.
pandas
psycopg2
mysql.connector
cx_Oracle
Structure
data-scientist/
|
|-- datascientist/
| |-- __init__.py
| |-- connection_convenience.py
|
|-- Manifest.in
|-- README.md
|-- setup.py
|-- bash_profile_example
Usage
A sample bash profile is provided for reference with values removed. Some of the functions look for environment variables named according to the conventions there. If a function can't find them, it prompts you for the appropriate strings. Strings set via prompts are NOT saved, for security reasons. If you set environment variables in a more permanent way, it's up to you to make sure that they remain secure.
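The lookup-then-prompt behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the helper name `get_setting` and the variable names `PG_USER` and `PG_PASSWORD` are hypothetical, and the real variable names are the ones listed in `bash_profile_example`.

```python
import os
from getpass import getpass


def get_setting(name, secret=False, env=None, prompt=None):
    """Return a setting from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; it is not part of datascientist.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value
    # Fall back to an interactive prompt. The answer is handed back to the
    # caller but never written to the environment or to disk.
    if prompt is None:
        prompt = getpass if secret else input
    return prompt(f"{name}: ")
```

Called as `get_setting("PG_PASSWORD", secret=True)`, this would read the variable if it is set and otherwise ask for it without echoing the input. The `env` and `prompt` parameters exist only so the behaviour can be exercised without a terminal.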
This module replicates the functionality of pandas.read_sql() but is a little friendlier: it handles the connection object for you while performing about the same according to %timeit.
from datascientist.connection_convenience import *
sql = '''
select * from my_table
where my_field in ('cloud', 'frame');
'''
df = pg2df(sql)
# input at the prompts if necessary
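For readers curious what "handling the connection object for you" amounts to, here is a rough sketch of the pattern. It is an assumption about the general shape, not the source of `pg2df`: the names `query_to_df` and `demo_connect` are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs without a server or credentials.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def query_to_df(sql, connect):
    """Open a connection, run `sql`, return a DataFrame, then close the connection.

    `connect` is any zero-argument callable that returns a DB-API connection.
    Hypothetical helper for illustration; it is not part of datascientist.
    """
    with closing(connect()) as conn:
        return pd.read_sql(sql, conn)


def demo_connect():
    """Build a throwaway in-memory SQLite database (stand-in for a real server)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table my_table (my_field text)")
    conn.executemany(
        "insert into my_table values (?)",
        [("cloud",), ("frame",), ("other",)],
    )
    conn.commit()
    return conn


sql = "select * from my_table where my_field in ('cloud', 'frame')"
df = query_to_df(sql, demo_connect)
```

The caller only supplies SQL and gets a DataFrame back; opening and closing the connection stays inside the helper, which is the convenience the package describes.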
Hashes for datascientist-0.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a30bdd65a4aec763f09e98dbb1e195dff219e233970f1b36e3e98aaedf1d3366
MD5 | 5ca28133ff926e4452f5cd3f2655c2d6
BLAKE2b-256 | 61714312b20a8d4b5efdbe7308e02af6293055b9717d1586393bbef1258f28e1