Standardized DB Connections.

Description

A Python library by Dire Analytics to standardize database connections across platforms for more efficient data engineering.

Installation

pip install dbharbor
pip install git+https://github.com/edire/dbharbor.git

Code

import os
import dbharbor

# Microsoft SQL Server
from dbharbor.sql import SQL
con = SQL(
    server=os.getenv('SQL_SERVER'),
    db=os.getenv('SQL_DB'),
    uid=os.getenv('SQL_UID'),
    pwd=os.getenv('SQL_PWD'),
    driver='ODBC Driver 17 for SQL Server'
)

# MySQL Server
from dbharbor.mysql import SQL
con = SQL(
    server=os.getenv('MYSQL_SERVER'),
    db=os.getenv('MYSQL_DB'),
    uid=os.getenv('MYSQL_UID'),
    pwd=os.getenv('MYSQL_PWD'),
    port=3306
)

# BigQuery
from dbharbor.bigquery import SQL
con = SQL(
    credentials_filepath = os.getenv('BIGQUERY_CRED')
)

# To request additional database connections please reach out to eric.dire@direanalytics.com

# Read
df = con.read("select * from [table]")

# Run
con.run("drop table if exists [table]")

# Clean Dataframe before inserting into database
df = dbharbor.clean(df)

# Write DataFrame to table
con.to_sql(df, "[table]", schema="dbo", index=False, if_exists='fail')

# Or use the connection directly with pandas dataframe for SQL or MySQL
df.to_sql("[table]", con=con.con, schema="dbo", index=False, if_exists='fail')
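
The connection examples above read credentials with os.getenv, which silently returns None when a variable is unset. A minimal sketch of a pre-flight check follows. The require_env helper is hypothetical and not part of dbharbor; it uses only the standard library and the environment variable names shown above.

```python
import os


def require_env(*names):
    """Return the values of the named environment variables, in order.

    Hypothetical helper (not part of dbharbor). Raises one KeyError that
    lists every variable that is unset or empty, so a misconfigured
    environment is reported before any connection is attempted.
    """
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return [os.environ[name] for name in names]


# Example usage (commented out so the sketch runs without credentials):
# server, db, uid, pwd = require_env('SQL_SERVER', 'SQL_DB', 'SQL_UID', 'SQL_PWD')
# con = SQL(server=server, db=db, uid=uid, pwd=pwd,
#           driver='ODBC Driver 17 for SQL Server')
```

Failing early this way names the missing variable directly, rather than leaving a None value to surface later as a driver or authentication error.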

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT License

Updates

09/13/2024 - Updated NaN to nan for Numpy 2.0.
09/09/2024 - Updated SQL Varchar datatype to use max when greater than 8000 characters.
08/12/2024 - Added datetime_us datatype and BigQuery storage library to setup for faster API.
11/06/2023 - Fixed dtype missing lower function in sql and mysql and now use python tempfile module.
11/03/2023 - Fixed index name BigQuery to_sql issue.
11/03/2023 - Added clean_dtypes function and updated create_table dtypes.
10/20/2023 - Updated clean tool for empty column names, replaced empty strings with NaN.
10/04/2023 - Updated BigQuery data type mapping.
09/19/2023 - Added port option for MySQL.
08/12/2023 - Updated for applymap deprecation and uppercase env vars.
07/07/2023 - Reverted MSSQL and MySQL Run logic in order to pick up proc errors.
06/29/2023 - Reverted MySQL engine to con.
06/16/2023 - Updated SQL and MySQL modules for SQLAlchemy 2.0.
04/23/2023 - Updated bigquery to remove string length restrictions. Added pyarrow to required libraries for bigquery to_dataframe function. Added db-dtypes to required libraries for bigquery.
03/27/2023 - Updated clean column function to convert to string before cleaning.
03/14/2023 - Updated data type amounts for float columns in sql and mysql.
03/13/2023 - Added **kwargs to create_table function across all to eliminate error of passing missing variables.
02/23/2023 - Updated class names for consistency.
02/22/2023 - Fixed run logic in SQL and MySQL to use autocommit appropriately. Removed parameters in SQL class.
02/20/2023 - Updated bigquery module to allow connections from cloud resources.
02/17/2023 - Updated MySQL for reading multiple statement queries into DataFrame.
02/10/2023 - Added full functionality to BigQuery module.
02/06/2023 - Updated MySQL connector to automatically password parse if necessary.
01/31/2023 - Added option to not drop columns in clean function.
01/09/2023 - Added openpyxl to dependencies on install.
01/08/2023 - Fixed duplicate RowLoadDateTime issue in create_table function for sql and mysql.
01/06/2023 - Added BigQuery module with read function. Updated MySQL RowLoadDateTime for new and old MySQL server versions.
01/05/2023 - Adjusted env variable default names for multiple connection types.
01/04/2023 - Added password parse option to MySQL.
12/31/2022 - Added SSAS module.
12/22/2022 - Added clean_string function to tools.
12/14/2022 - Added mysql module.

