A recipe for every data baker

DataRecipe

Table of Contents

  1. Overview
  2. Functions
  3. Contact Information

Overview

DataRecipe is a toolkit of Python functions for common data tasks: sending email, validating and cleaning DataFrames, importing and exporting files, and running database operations.

Functions

General Features

send_email

Sends an email over SMTP with SSL/TLS, attaching files when attachment_path and attachment_list are provided.

  • Parameters:
    • subject: Email subject as a string.
    • body: Main content of the email.
    • send_email_address: Sender's email address.
    • send_email_password: Sender's email password for SMTP authentication.
    • receive_email_address: Recipient's email address.
    • attachment_path: Directory path where attachments are stored (optional).
    • attachment_list: List of filenames to be attached (optional).
    • smtp_address: SMTP server address (default: 'smtp.feishu.cn').
    • smtp_port: SMTP server port (default: 465).

Example with Attachments:

send_email(
    "Meeting Documents", 
    "Please see attached documents for the upcoming meeting.", 
    "sender@example.com", 
    "password123", 
    "receiver@example.com", 
    attachment_path="/path/to/documents", 
    attachment_list=["agenda.pdf", "minutes.docx"]
)
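
For readers curious about what sending mail with attachments involves, here is a minimal sketch using only the Python standard library. It is not DataRecipe's implementation; build_message and send_message are hypothetical helper names, and the parameter names simply mirror the list above.

```python
from email.message import EmailMessage
import mimetypes
import os
import smtplib


def build_message(subject, body, sender, recipient,
                  attachment_path=None, attachment_list=None):
    """Assemble an email with optional attachments (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(body)

    for file_name in attachment_list or []:
        full_path = os.path.join(attachment_path or "", file_name)
        # Guess the MIME type from the extension; fall back to generic binary.
        mime_type, _ = mimetypes.guess_type(full_path)
        maintype, subtype = (mime_type or "application/octet-stream").split("/", 1)
        with open(full_path, "rb") as fh:
            msg.add_attachment(fh.read(), maintype=maintype,
                               subtype=subtype, filename=file_name)
    return msg


def send_message(msg, sender, password, smtp_address, smtp_port=465):
    """Send the message over an implicit-TLS connection (hypothetical helper)."""
    with smtplib.SMTP_SSL(smtp_address, smtp_port) as server:
        server.login(sender, password)
        server.send_message(msg)
```

In real use, consider reading the password from an environment variable (for example with os.environ) rather than hard-coding it in a script.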

Data Validation and Cleaning

check_empty

Checks for empty entries in specified DataFrame columns.

  • Parameters:
    • df: DataFrame to check.
    • columns: Columns to check for missing values.
    • output_cols: Columns to include in the output.

Example:

empty_data = check_empty(df, columns=["name", "email"])
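
As an illustration of the kind of check described here, the following pandas sketch selects rows where any of the given columns is missing. It is an assumption about the general technique, not the library's code: rows_with_missing is a hypothetical name, and it treats only NaN/None as empty (empty strings are not counted).

```python
import pandas as pd


def rows_with_missing(df, columns, output_cols=None):
    """Return rows where any of `columns` is missing (hypothetical helper)."""
    mask = df[columns].isna().any(axis=1)
    result = df.loc[mask]
    # Restrict the output to the requested columns, if any were given.
    return result[output_cols] if output_cols else result


df = pd.DataFrame({
    "name": ["Ann", None, "Cy"],
    "email": ["ann@example.com", "bo@example.com", None],
    "age": [31, 25, 40],
})
missing = rows_with_missing(df, ["name", "email"], output_cols=["name", "email"])
```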

clean_dataframe

Cleans a DataFrame by replacing infinite values with NaN.

  • Parameters:
    • df: DataFrame to clean.

Example:

clean_dataframe(df)
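
In plain pandas, replacing infinite values with NaN can be sketched as below. This shows the general technique under the assumption that both positive and negative infinity are targeted; whether clean_dataframe modifies its argument in place or returns a copy is not stated above, so this sketch returns a new DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"ratio": [1.0, np.inf, -np.inf], "count": [3, 4, 5]})

# replace() returns a new DataFrame; the original is left untouched.
cleaned = df.replace([np.inf, -np.inf], np.nan)
```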

Data Import/Export

local_to_df

Loads files from a local directory whose names match a pattern into a pandas DataFrame.

  • Parameters:
    • path: Directory path to search for files.
    • partial_file_name: File name pattern to match.
    • skip_rows: Number of rows to skip at the start of each file.
    • keep_file_name: If True, adds a column with the file name.
    • sheet_num: For Excel files, specifies the sheet number to read.
    • encoding: Character encoding of the files.

Example with CSV files:

df = local_to_df("./data", "sample", keep_file_name=True)

Example with Excel files:

df = local_to_df("./data", "report", sheet_num=2, encoding='utf-8')
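
To make the parameters above concrete, here is a rough sketch of pattern-based loading for CSV files only. It is an assumption, not the library's implementation: load_matching_csvs is a hypothetical name, the pattern is treated as a substring of the file name, and the added column is called file_name for illustration.

```python
import glob
import os

import pandas as pd


def load_matching_csvs(path, partial_file_name, skip_rows=0,
                       keep_file_name=False, encoding="utf-8"):
    """Concatenate every CSV in `path` whose name contains the pattern."""
    pattern = os.path.join(path, f"*{partial_file_name}*.csv")
    frames = []
    for file_path in sorted(glob.glob(pattern)):
        frame = pd.read_csv(file_path, skiprows=skip_rows, encoding=encoding)
        if keep_file_name:
            # Record which file each row came from.
            frame["file_name"] = os.path.basename(file_path)
        frames.append(frame)
    if not frames:
        raise FileNotFoundError(f"no files match {pattern}")
    return pd.concat(frames, ignore_index=True)
```

Sorting the matches keeps the row order stable between runs, since glob does not guarantee any particular order.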

df_to_xlsx

Saves a DataFrame to an Excel file.

  • Parameters:
    • df: DataFrame to save.
    • directory_path: Path to directory where the file will be saved.
    • file_name: Name of the output file.

Example:

df_to_xlsx(df, "./output", "output_data")

df_to_csv

Saves a DataFrame to a CSV file.

  • Parameters:
    • df: DataFrame to save.
    • directory_path: Path to directory where the file will be saved.
    • file_name: Name of the output file.

Example:

df_to_csv(df, "./output", "output_data")
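
The export step can be sketched in plain pandas as follows. This is a hedged illustration rather than the library's code: save_csv is a hypothetical name, and appending the .csv extension to file_name is an assumption based on the example above passing a name without one. The Excel equivalent would use DataFrame.to_excel, which needs an Excel engine such as openpyxl installed.

```python
import os

import pandas as pd


def save_csv(df, directory_path, file_name):
    """Write `df` to <directory_path>/<file_name>.csv and return the path."""
    # Create the target directory if it does not exist yet.
    os.makedirs(directory_path, exist_ok=True)
    target = os.path.join(directory_path, f"{file_name}.csv")
    df.to_csv(target, index=False)
    return target
```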

Database Operations

update

Updates records in a database table based on conditions.

  • Parameters:
    • raw_df: DataFrame containing new data to update.
    • database: Database name.
    • table: Table name.
    • yaml_file_name: YAML file name with DB configuration.
    • clause: SQL condition selecting the records to delete (for example, "user_id > 10").
    • date_col: Column name containing date data.
    • custom_path: Path to directory containing the YAML file.

Example:

update(df, "test_db", "user_data", clause="user_id > 10")
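
One plausible reading of the parameters above is a delete-then-insert pattern: remove the rows matching the clause, then append the new data. The sketch below shows that pattern with sqlite3 and pandas. It is only an assumption about the behavior, not DataRecipe's implementation; replace_rows is a hypothetical name, and an in-memory SQLite database stands in for the configured server.

```python
import sqlite3

import pandas as pd


def replace_rows(conn, table, new_rows, clause):
    """Delete rows matching `clause`, then append `new_rows` (hypothetical)."""
    # `table` and `clause` are interpolated into SQL, so they must come from
    # trusted code, never from user input.
    with conn:
        conn.execute(f"DELETE FROM {table} WHERE {clause}")
        new_rows.to_sql(table, conn, if_exists="append", index=False)


conn = sqlite3.connect(":memory:")
existing = pd.DataFrame({"user_id": [5, 11, 12], "name": ["a", "b", "c"]})
existing.to_sql("user_data", conn, index=False)

fresh = pd.DataFrame({"user_id": [11], "name": ["b2"]})
replace_rows(conn, "user_data", fresh, "user_id > 10")

remaining = pd.read_sql_query("SELECT * FROM user_data ORDER BY user_id", conn)
```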

sql_query

Executes a SELECT SQL query and returns a DataFrame.

  • Parameters:
    • database: Database name.
    • sql: SQL SELECT statement.
    • yaml_file_name: YAML file name with DB configuration.
    • custom_path: Optional path to directory containing the YAML file.

Example:

result_df = sql_query("test_db", "SELECT * FROM users")
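
For comparison, running a SELECT and getting a DataFrame back can be done in plain pandas with read_sql_query. The sketch below uses an in-memory SQLite database as a stand-in, since the YAML configuration format and the database backend used by sql_query are not described here; the users table and its columns are made up for illustration.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ann"), (2, "Bo")])

# Bind values through `params` instead of formatting them into the SQL string.
result_df = pd.read_sql_query(
    "SELECT * FROM users WHERE id > ?", conn, params=(1,)
)
```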

Contact Information

For any questions or suggestions regarding the toolkit, please contact us at:


Source Distribution

datarecipe-2.0.8.tar.gz (8.2 kB)

Built Distribution

datarecipe-2.0.8-py3-none-any.whl (9.4 kB)
