Common utility functions for Crumbl Data Team
CrumblPy
Overview
CrumblPy is a Python package that simplifies common data operations and streamlines Crumbl data workflows. It provides utilities for email, Snowflake, and Slack that integrate with existing Python projects, so you can focus on building and analyzing without unnecessary overhead.
Installation
You can install CrumblPy using pip:
pip install crumblpy
Features
CrumblPy provides three main modules:
- Email Module: Send emails with attachments through Gmail API
- Snowflake Module: Connect to and interact with Snowflake databases
- Slack Module: Send messages and files to Slack channels
Quickstart
import crumblpy
# Email functionality
from crumblpy import send_gmail, generate_token
# Snowflake functionality
from crumblpy import SnowflakeToolKit
# Slack functionality
from crumblpy import SlackToolKit
Email Module
The email module provides Gmail API integration for sending emails with attachments.
Functions
send_gmail(sender, recipient, subject, body, token, html_body=False, image_paths=None, attachment_paths=None)
Sends an email using the Gmail API.
Parameters:
- sender (str): The email address of the sender
- recipient (str): The email address of the recipient
- subject (str): The subject of the email
- body (str): The body of the email
- token (dict): The token data for authentication
- html_body (bool, optional): Whether the body is HTML or plain text. Defaults to False
- image_paths (List[str], optional): List of paths to images to attach
- attachment_paths (List[str], optional): List of paths to files to attach
Example:
import json
from crumblpy import send_gmail
# Load your token (generated using generate_token).
with open('token.json') as f:
    token = json.load(f)
send_gmail(
    sender='your-email@gmail.com',
    recipient='recipient@example.com',
    subject='Test Email',
    body='This is a test email',
    token=token,
    html_body=True,
    attachment_paths=['report.pdf', 'data.csv']
)
⚠️ Security Warning: The above example is for local development only. In production environments, use Doppler or Prefect blocks to securely manage credentials instead of storing them in JSON files.
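For context, the Gmail API's messages.send endpoint accepts a message as a base64url-encoded RFC 2822 string in a `raw` field. The sketch below builds such a payload with the standard library only. It is not CrumblPy's implementation, and `build_raw_message` is a hypothetical helper name used purely for illustration.

```python
import base64
from email import message_from_bytes
from email.message import EmailMessage


def build_raw_message(sender, recipient, subject, body, html_body=False):
    """Return a dict shaped like a Gmail API messages.send request body.

    Hypothetical helper for illustration; not part of CrumblPy.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Mirror the html_body flag: HTML when requested, plain text otherwise.
    msg.set_content(body, subtype="html" if html_body else "plain")
    # The Gmail API expects URL-safe base64 of the full RFC 2822 message.
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
    return {"raw": raw}


payload = build_raw_message(
    "your-email@gmail.com",
    "recipient@example.com",
    "Test Email",
    "This is a test email",
)

# Decode the payload again to confirm the round trip preserves the headers.
decoded = message_from_bytes(base64.urlsafe_b64decode(payload["raw"]))
print(decoded["Subject"], decoded.get_content_type())
```

Decoding the payload back into a message is a quick way to check headers and content type locally before anything is sent.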
generate_token(credential, scopes=['https://www.googleapis.com/auth/gmail.send'], write_to_file=False)
Generates authentication token for Gmail API access.
Parameters:
- credential (dict): The credential data from Google Cloud Console
- scopes (list, optional): List of OAuth scopes. Defaults to the Gmail send scope
- write_to_file (bool, optional): Whether to write the token to a file. Defaults to False
Note: This function requires manual browser authorization.
Example:
import json
from crumblpy import generate_token
# Load your credentials from Google Cloud Console
with open('credentials.json') as f:
    credentials = json.load(f)
generate_token(credentials, write_to_file=True)
⚠️ Security Warning: This example shows local development usage. In production, manage credentials securely using Doppler or Prefect blocks rather than storing them in JSON files.
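One way to avoid keeping a token file on disk is to have a secrets manager inject the token JSON as an environment variable and parse it at startup. The sketch below assumes a hypothetical variable name, `GMAIL_TOKEN_JSON`; neither that name nor the `load_token_from_env` helper comes from CrumblPy.

```python
import json
import os


def load_token_from_env(var_name="GMAIL_TOKEN_JSON"):
    """Parse a token dict from an environment variable (hypothetical name)."""
    raw = os.environ.get(var_name)
    if raw is None:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    token = json.loads(raw)
    if not isinstance(token, dict):
        raise ValueError(f"{var_name} must contain a JSON object")
    return token


# Demonstration with a placeholder value; a secrets manager would set this.
os.environ["GMAIL_TOKEN_JSON"] = json.dumps({"token": "placeholder"})
env_token = load_token_from_env()
print(sorted(env_token))
```

The resulting dict has the same type that `send_gmail` documents for its `token` parameter, so it could be passed there directly.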
Snowflake Module
The Snowflake module provides a toolkit for connecting to and interacting with Snowflake databases.
SnowflakeToolKit Class
__init__(prefect=False, user=None, password=None, role=None, schema='DATA_SCIENCE', warehouse='DATA_SCIENCE_TEAM')
Initialize the Snowflake connection.
Parameters:
- prefect (bool, optional): Use Prefect secrets for authentication. Defaults to False
- user (str, optional): Snowflake username
- password (str, optional): Snowflake password
- role (str, optional): Snowflake role
- schema (str, optional): Default schema. Defaults to 'DATA_SCIENCE'
- warehouse (str, optional): Snowflake warehouse. Defaults to 'DATA_SCIENCE_TEAM'
Methods
connect()
Establishes connection to Snowflake.
fetch_data(sql_query)
Fetch data from Snowflake using a SQL query.
Parameters:
- sql_query (str): SQL query to execute
Returns:
pandas.DataFrame: Query results as a DataFrame
insert_data(df, table_name, auto_create_table=False)
Insert pandas DataFrame into Snowflake table.
Parameters:
- df (pandas.DataFrame): DataFrame to insert
- table_name (str): Target table name
- auto_create_table (bool, optional): Whether to auto-create the table. Defaults to False
execute_query(sql_query)
Execute a SQL query in Snowflake (useful for DML queries).
Parameters:
- sql_query (str): SQL query to execute
Example:
from crumblpy import SnowflakeToolKit
import pandas as pd
# Initialize with environment variables.
sf = SnowflakeToolKit()
# Or initialize with explicit credentials (local development only)
sf = SnowflakeToolKit(
    user='your_username',
    password='your_password',
    role='your_role'
)
# For production, use Prefect blocks
sf = SnowflakeToolKit(prefect=True)
# Fetch data
df = sf.fetch_data("SELECT * FROM your_table LIMIT 100")
# Insert data
new_data = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'b', 'c']})
sf.insert_data(new_data, 'your_target_table', auto_create_table=True)
# Execute query
sf.execute_query("UPDATE your_table SET col1 = 0 WHERE col2 = 'a'")
⚠️ Security Warning: Explicit credentials shown above are for local experimentation only. In production environments, use the prefect=True parameter to leverage Prefect blocks, or use Doppler for secure credential management.
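When several DML statements must run in order, it can help to stop at the first failure and report how far the batch got. The sketch below is one possible pattern, not a CrumblPy feature: `run_statements` is a hypothetical helper, and `FakeToolKit` is a stand-in exposing only the documented `execute_query(sql_query)` method so the example runs without a database.

```python
def run_statements(toolkit, statements):
    """Run statements in order and stop at the first failure.

    `toolkit` is any object with an execute_query(sql_query) method.
    Returns (number_completed, error_or_None).
    """
    for index, sql in enumerate(statements):
        try:
            toolkit.execute_query(sql)
        except Exception as exc:
            return index, exc
    return len(statements), None


class FakeToolKit:
    """Stand-in for SnowflakeToolKit so the sketch runs offline."""

    def __init__(self):
        self.executed = []

    def execute_query(self, sql_query):
        # Simulate a failing statement for demonstration purposes.
        if "bad_table" in sql_query:
            raise RuntimeError("simulated failure")
        self.executed.append(sql_query)


fake_sf = FakeToolKit()
completed, error = run_statements(
    fake_sf,
    [
        "UPDATE your_table SET col1 = 0 WHERE col2 = 'a'",
        "DELETE FROM bad_table",
        "UPDATE your_table SET col1 = 1 WHERE col2 = 'b'",
    ],
)
print(completed, repr(error))
```

With a real `SnowflakeToolKit` instance in place of the stand-in, the returned count indicates which statement to inspect. Whether earlier statements are rolled back depends on the connection's transaction settings, which this document does not describe.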
Slack Module
The Slack module provides integration with Slack for sending messages and files.
SlackToolKit Class
__init__(prefect=False, token=None, default_channel='U04RAQM788L')
Initialize the Slack client.
Parameters:
- prefect (bool, optional): Use Prefect secrets for authentication. Defaults to False
- token (str, optional): Slack bot token
- default_channel (str, optional): Default channel ID. Defaults to 'U04RAQM788L'
Methods
post_message(message=None, channel=None, thread_id=None, blocks=None)
Send a message to a Slack channel.
Parameters:
- message (str, optional): Message text
- channel (str, optional): Channel ID or user ID
- thread_id (str, optional): Thread timestamp for threaded messages
- blocks (list, optional): Slack Block Kit blocks
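Block Kit blocks are plain lists of dictionaries, so they can be assembled without any Slack library. The sketch below builds a header block followed by one mrkdwn section per line; `build_report_blocks` is a hypothetical helper name, and the sketch assumes `post_message` forwards the `blocks` list to Slack unchanged.

```python
def build_report_blocks(title, lines):
    """Build a header block followed by one mrkdwn section per line."""
    blocks = [
        {"type": "header", "text": {"type": "plain_text", "text": title}},
    ]
    for line in lines:
        blocks.append(
            {"type": "section", "text": {"type": "mrkdwn", "text": line}}
        )
    return blocks


report_blocks = build_report_blocks(
    "Daily Report",
    ["*Rows loaded:* 100", "*Status:* ok"],
)
print(len(report_blocks), report_blocks[0]["type"])
```

The result could then be passed as `slack.post_message(blocks=report_blocks, channel='your-channel-id')`.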
post_file(file_path, message, channel=None, thread_id=None)
Upload a file to Slack channel.
Parameters:
- file_path (str): Path to the file to upload
- message (str): Message to accompany the file
- channel (str, optional): Channel ID or user ID
- thread_id (str, optional): Thread timestamp
Note: This method automatically deletes the file after upload.
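Because the file is deleted after upload, a caller who needs to keep the original can upload a temporary copy instead. The sketch below shows one way to do that; `post_copy` is a hypothetical helper, and `FakeSlack` is a stand-in that mimics the documented delete-after-upload behavior so the example runs offline.

```python
import os
import shutil
import tempfile


def post_copy(toolkit, file_path, message, channel=None):
    """Upload a temporary copy so the original file survives the upload."""
    tmp_dir = tempfile.mkdtemp()
    copy_path = os.path.join(tmp_dir, os.path.basename(file_path))
    shutil.copy2(file_path, copy_path)
    try:
        toolkit.post_file(copy_path, message, channel=channel)
    finally:
        # Remove the temp directory whether or not the copy was deleted.
        shutil.rmtree(tmp_dir, ignore_errors=True)


class FakeSlack:
    """Stand-in that deletes the file after 'uploading', as documented."""

    def __init__(self):
        self.uploaded = []

    def post_file(self, file_path, message, channel=None, thread_id=None):
        self.uploaded.append(os.path.basename(file_path))
        os.remove(file_path)


with tempfile.TemporaryDirectory() as work_dir:
    original = os.path.join(work_dir, "report.txt")
    with open(original, "w") as handle:
        handle.write("daily numbers")
    fake_slack = FakeSlack()
    post_copy(fake_slack, original, "Here is the daily report",
              channel="your-channel-id")
    original_survived = os.path.exists(original)

print(original_survived, fake_slack.uploaded)
```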
get_thread_id(channel)
Get the timestamp of the most recent message in a channel.
Parameters:
- channel (str): Channel ID
Returns:
str: Thread timestamp
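The returned timestamp can be fed back into `post_message` as `thread_id` to reply under the latest message. The sketch below chains the two documented methods; `post_with_reply` is a hypothetical helper, and `FakeThreadSlack` is a stand-in with invented timestamps so the example runs without a Slack workspace.

```python
def post_with_reply(toolkit, channel, parent_text, reply_text):
    """Post a message, then post a reply in that message's thread."""
    toolkit.post_message(parent_text, channel=channel)
    # Assumes no one else posts in the channel between these two calls.
    thread_id = toolkit.get_thread_id(channel)
    toolkit.post_message(reply_text, channel=channel, thread_id=thread_id)
    return thread_id


class FakeThreadSlack:
    """Stand-in exposing the documented post_message/get_thread_id methods."""

    def __init__(self):
        self.messages = []

    def post_message(self, message=None, channel=None, thread_id=None,
                     blocks=None):
        # Invented timestamps, one second apart, for demonstration only.
        ts = f"{1700000000 + len(self.messages)}.000000"
        self.messages.append({"ts": ts, "text": message, "thread_id": thread_id})

    def get_thread_id(self, channel):
        return self.messages[-1]["ts"]


thread_slack = FakeThreadSlack()
parent_ts = post_with_reply(
    thread_slack, "your-channel-id", "Pipeline started", "Step 1 finished"
)
print(parent_ts, thread_slack.messages[1]["thread_id"])
```

In a busy channel another message may arrive between the post and the lookup, in which case the reply would land under that other message.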
push_notification(project=None, channel=None, e=None)
Send a notification about project status.
Parameters:
- project (str, optional): Project name
- channel (str, optional): Channel ID
- e (Exception, optional): Exception object if there was an error
Example:
from crumblpy import SlackToolKit
# Initialize with environment variable
slack = SlackToolKit()
# Or initialize with explicit token (local development only)
slack = SlackToolKit(token='your-slack-token')
# For production, use Prefect blocks
slack = SlackToolKit(prefect=True)
# Send a message
slack.post_message("Hello from CrumblPy!", channel='your-channel-id')
# Send a file
slack.post_file('report.pdf', 'Here is the daily report', channel='your-channel-id')
# Send notification
slack.push_notification(project='Data Pipeline', channel='your-channel-id')
# Send error notification
try:
    # Some operation that might fail
    pass
except Exception as e:
    slack.push_notification(project='Data Pipeline', channel='#alerts', e=e)
⚠️ Security Warning: Examples showing explicit tokens are for local experimentation only. In production environments, use the prefect=True parameter to leverage Prefect blocks, or use Doppler for secure credential management.
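The try/except pattern from the example can be wrapped in a decorator so that every pipeline entry point reports success or failure the same way. This is a sketch of one possible wrapper, not a CrumblPy feature: `notify_on_finish` is a hypothetical name, and `RecordingSlack` is a stand-in that records calls to the documented `push_notification` method instead of sending anything.

```python
import functools


def notify_on_finish(toolkit, project, channel=None):
    """Decorator factory: report success or failure via push_notification."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                toolkit.push_notification(project=project, channel=channel, e=exc)
                raise  # Re-raise so the caller still sees the failure.
            toolkit.push_notification(project=project, channel=channel)
            return result

        return wrapper

    return decorator


class RecordingSlack:
    """Stand-in for SlackToolKit that records calls instead of sending."""

    def __init__(self):
        self.calls = []

    def push_notification(self, project=None, channel=None, e=None):
        error_name = type(e).__name__ if e is not None else None
        self.calls.append((project, channel, error_name))


recorder = RecordingSlack()


@notify_on_finish(recorder, project="Data Pipeline", channel="your-channel-id")
def load_rows(fail=False):
    if fail:
        raise ValueError("simulated failure")
    return 3


rows = load_rows()
try:
    load_rows(fail=True)
except ValueError:
    pass

print(recorder.calls)
```

Re-raising inside the wrapper keeps the failure visible to schedulers or callers that rely on the exception.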
Environment Variables
CrumblPy uses the following environment variables when explicit credentials are not provided:
- SNOWFLAKE_USER: Snowflake username
- SNOWFLAKE_PASSWORD: Snowflake password
- SLACK_TOKEN: Slack bot token
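A small preflight check can make a missing variable obvious before any connection is attempted. The sketch below is a minimal example using the three variable names listed above; `missing_env_vars` is a hypothetical helper, not part of CrumblPy.

```python
import os

REQUIRED_VARS = ("SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD", "SLACK_TOKEN")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Demonstration against an explicit mapping instead of the real environment.
missing = missing_env_vars(environ={"SNOWFLAKE_USER": "your_username"})
print(missing)
```

Calling `missing_env_vars()` with no arguments checks the real process environment.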
Authentication Setup
🔒 Production Security Note: The setup instructions below are primarily for local development and experimentation. For production deployments, always use secure credential management solutions like Doppler or Prefect blocks instead of environment variables or local credential files.
Gmail API Setup
- Go to Google Cloud Console
- Create a new project or select existing one
- Enable Gmail API
- Create credentials (OAuth 2.0 Client ID)
- Download credentials JSON file
- Use the generate_token() function to create an authentication token
Snowflake Setup
Set environment variables or use explicit credentials:
export SNOWFLAKE_USER="your_username"
export SNOWFLAKE_PASSWORD="your_password"
Slack Setup
- Create a Slack app at api.slack.com
- Add bot token scopes: chat:write, files:write, channels:history
- Install app to workspace
- Copy Bot User OAuth Token
- Set environment variable:
export SLACK_TOKEN="xoxb-your-token-here"
File details
Details for the file crumblpy-1.1.2.tar.gz.
File metadata
- Download URL: crumblpy-1.1.2.tar.gz
- Upload date:
- Size: 7.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 96976adb2bbc9aa142d1733ba1545f3dcb44b0adbaf6cddac371e497d4971c52 |
| MD5 | be01bd24b05126b42603403623e68b33 |
| BLAKE2b-256 | cf08e3c82305179c98ca99faf050623c7a1ac1dd91fee07762b5262d3fa7a471 |
File details
Details for the file CrumblPy-1.1.2-py3-none-any.whl.
File metadata
- Download URL: CrumblPy-1.1.2-py3-none-any.whl
- Upload date:
- Size: 8.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 07de7e2a3387ea4f1aed6e9880ae23370a439ad008924ea9854ce87745a6e3e4 |
| MD5 | 790d1adc3fd38e351db8e299438d9405 |
| BLAKE2b-256 | b2ede0adb7dc30906978a0ec8898ae0682d381aa22648e773917ad78a7b1db32 |