Standardized DB Connections.
Project description
Description
A Python library by Dire Analytics to standardize database connections across platforms for more efficient data engineering.
Installation
pip install dbharbor
pip install git+https://github.com/edire/dbharbor.git
Code
import os
import dbharbor
# Microsoft SQL Server
from dbharbor.sql import SQL
con = SQL(
server=os.getenv('SQL_SERVER'),
db=os.getenv('SQL_DB'),
uid=os.getenv('SQL_UID'),
pwd=os.getenv('SQL_PWD'),
driver='ODBC Driver 17 for SQL Server'
)
# MySQL Server
from dbharbor.mysql import SQL
con = SQL(
server=os.getenv('MYSQL_SERVER'),
db=os.getenv('MYSQL_DB'),
uid=os.getenv('MYSQL_UID'),
pwd=os.getenv('MYSQL_PWD'),
port=3306
)
# BigQuery
from dbharbor.bigquery import SQL
con = SQL(
credentials_filepath = os.getenv('BIGQUERY_CRED')
)
# To request additional database connections please reach out to eric.dire@direanalytics.com
# Read
df = con.read("select * from [table]")
# Run
con.run("drop table if exists [table]")
# Clean Dataframe before inserting into database
df = dbharbor.clean(df)
# Write DataFrame to table
con.to_sql(df, "[table]", schema="dbo", index=False, if_exists='fail')
# Or use the connection directly with pandas dataframe for SQL or MySQL
df.to_sql("[table]", con=con.con, schema="dbo", index=False, if_exists='fail')
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
License
MIT License
Updates
09/13/2024 - Updated NaN to nan for Numpy 2.0.
09/09/2024 - Update SQL Varchar datatype to use max when greater than 8000 characters.
08/12/2024 - Added datetime_us datatype and Bigquery storage library to setup for faster API.
11/06/2023 - Fixed dtype missing lower function in sql and mysql and now use python tempfile module.
11/03/2023 - fixed index name bigquery to_sql issue.
11/03/2023 - added clean_dtypes function and updated create_table dtypes.
10/20/2023 - Updated clean tool for empty column names, replaced empty strings with NaN.
10/04/2023 - Updated Bigquery data type mapping.
09/19/2023 - Added port option for MySQL.
08/12/2023 - Update for applymap deprecation and upper env vars.
07/07/2023 - Reverted MSSQL and MySQL Run logic in order to pick up proc errors.
06/29/2023 - Reverted MySQL engine to con.
06/16/2023 - Updated SQL and MySQL modules for SQLAlchemy 2.0.
04/23/2023 - Updated bigquery to remove string length restrictions. Added pyarrow to required libraries for bigquery to_dataframe function. Added db-dtypes to required libraries for bigquery.
03/27/2023 - Updated clean column function to convert to string before cleaning.
03/14/2023 - Updated data type amounts for float columns in sql and mysql.
03/13/2023 - Added **kwargs to create_table function across all to eliminate error of passing missing variables.
02/23/2023 - Updated class names for consistency.
02/22/2023 - Fixed run logic in SQL and MySQL to use autocommit appropriately. Removed parameters in SQL class.
02/20/2023 - Updated bigquery module to allow connections from cloud resources.
02/17/2023 - Updated MySQL for reading multiple statement queries into DataFrame.
02/10/2023 - Added full functionality to BigQuery module.
02/06/2023 - Updated MySQL connector to automatically password parse if necessary.
01/31/2023 - Added option to not drop columns in clean function.
01/09/2023 - Added openpyxl to dependencies on install.
01/08/2023 - Fixed duplicate RowLoadDateTime issue in create_table function for sql and mysql.
01/06/2023 - Added BigQuery module with read function. Updated MySQL RowLoadDateTime for new and old MySQL server versions.
01/05/2023 - Adjusted env variable default names for multiple connection types.
01/04/2023 - Added password parse option to MySQL.
12/31/2022 - Added SSAS module.
12/22/2022 - Added clean_string function to tools.
12/14/2022 - Added mysql module.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file dbharbor-0.1.7.tar.gz
.
File metadata
- Download URL: dbharbor-0.1.7.tar.gz
- Upload date:
- Size: 340.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.2
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | ebe85d185b89f61623bc4834d61018583bc2ab8e785d47a4b9e3e75baf7538a0 |
|
MD5 | 5a0754c0c7d0a5903eee0cc335b62882 |
|
BLAKE2b-256 | 9cb5de22ac1041c31b2a108f0035eae175344406a162448ace7780f4ee81ccf5 |