Solution for DS Team
utilsds
Utilsds is a library of classes and functions used in data science projects. It is organized into the following modules:
- algorithm:
  - Algorithm: Base class for fitting, training, and getting hyperparameters of machine learning models.
- data_ops:
  - DataOperations: Handle data operations locally and with Google Cloud services (BigQuery and Cloud Storage).
    - BigQuery operations:
      - load_bq_data: Load data from tables, views, and SQL files.
      - save_bq_view, save_bq_table: Save views and tables.
      - load_bq_procedure: Execute stored procedures.
      - load_bq_details: Get table/view details and schema.
      - delete_bq_data: Delete data with safety confirmations.
      - dry_run: Perform dry runs to estimate query costs.
    - Cloud Storage operations:
      - save_gcs_bucket: Create buckets.
      - save_gcs_file, load_gcs_file: Save and load files (.pkl, .json, .csv, .html, .sql).
    - Local file operations:
      - save_local_file, load_local_file: Save and load files (.pkl, .json, .csv, .html, .sql).
- data_processing:
  - SkewnessTransformer: Transform skewed data using various methods (IHS, neglog, Yeo-Johnson, quantile).
  - NullReplacer: Replace null values in specified columns with configurable strategies.
  - ColumnDropper: Drop specified columns from a DataFrame.
  - OutliersCleaner: Clean outliers by clipping values outside specified percentile ranges.
  - CategoricalMapper: Map values in categorical columns according to a specified mapping scheme.
  - NumericalMapper: Convert numerical columns to categorical by binning.
  - Encoder: One-hot encode categorical columns in the data.
  - Normalizer: Normalize numerical columns using a provided scaler.
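The SkewnessTransformer and OutliersCleaner entries name standard techniques. As a rough illustration, the sketch below applies the inverse hyperbolic sine (IHS) and neglog transforms and clips values to a percentile range, using only the Python standard library. It assumes the usual textbook definitions and does not use or reproduce the utilsds API; the names `ihs`, `neglog`, and `clip_to_percentiles` are hypothetical.

```python
import math
import statistics


def ihs(x: float) -> float:
    """Inverse hyperbolic sine, log(x + sqrt(x**2 + 1)); defined for all reals."""
    return math.asinh(x)


def neglog(x: float) -> float:
    """Signed log transform, sign(x) * log(1 + |x|)."""
    return math.copysign(math.log1p(abs(x)), x)


def clip_to_percentiles(data, lower=5, upper=95):
    """Clip values outside the [lower, upper] percentile range.

    Percentiles are whole numbers from 1 to 99 (an assumption of this sketch).
    """
    if not 1 <= lower < upper <= 99:
        raise ValueError("expected 1 <= lower < upper <= 99")
    # quantiles(n=100) returns the 99 cut points between percentiles.
    cuts = statistics.quantiles(data, n=100, method="inclusive")
    low, high = cuts[lower - 1], cuts[upper - 1]
    return [min(max(value, low), high) for value in data]


skewed = [0.0, 1.0, 10.0, 100.0, 1000.0, -1000.0]
ihs_values = [ihs(v) for v in skewed]        # compresses large magnitudes
neglog_values = [neglog(v) for v in skewed]  # keeps the sign of each value

clipped = clip_to_percentiles(list(range(101)))  # values now lie in [5, 95]
```

Both transforms accept zero and negative inputs, unlike a plain logarithm, which is why they are common choices for skewed columns that are not strictly positive.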
- data_split:
  - train_test_validation_split: Split data into training, testing, and validation sets.
  - resample_X_y: Resample the training data and target column.
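To make the idea of a three-way split concrete, here is a minimal standard-library sketch. It is not the utilsds implementation: the function name `three_way_split` and its parameters `test_size`, `validation_size`, and `seed` are assumptions chosen for illustration.

```python
import random


def three_way_split(rows, test_size=0.2, validation_size=0.2, seed=42):
    """Shuffle rows and split them into train, test, and validation lists."""
    if test_size <= 0 or validation_size <= 0 or test_size + validation_size >= 1:
        raise ValueError("sizes must be positive and sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_test = round(len(shuffled) * test_size)
    n_validation = round(len(shuffled) * validation_size)

    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_validation]
    train = shuffled[n_test + n_validation:]
    return train, test, validation


train, test, validation = three_way_split(range(100))
print(len(train), len(test), len(validation))  # 60 20 20
```

The three parts are disjoint and together cover every input row, so no observation leaks from the training set into the evaluation sets.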
- ds_statistics:
  - test_kruskal_wallis: Perform the Kruskal-Wallis statistical test.
  - test_agosto_pearsona: Test for normality using the D'Agostino-Pearson test.
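For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic from pooled ranks with the standard tie correction. It is a from-scratch illustration, not the utilsds function; the name `kruskal_wallis_h` is hypothetical. In general the p-value comes from a chi-square distribution with k - 1 degrees of freedom; the example uses three groups because the chi-square survival function with 2 degrees of freedom has the closed form exp(-H / 2).

```python
import math
from collections import Counter


def kruskal_wallis_h(*groups):
    """Return (H, degrees_of_freedom) for two or more non-empty samples."""
    if len(groups) < 2 or any(len(group) == 0 for group in groups):
        raise ValueError("need at least two non-empty groups")
    counts = Counter(value for group in groups for value in group)
    n_total = sum(counts.values())

    # Average 1-based rank for each distinct value, so ties share a rank.
    rank_of = {}
    position = 0
    for value in sorted(counts):
        rank_of[value] = position + (counts[value] + 1) / 2
        position += counts[value]

    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(t**3 - t for t in counts.values())
    correction = 1 - ties / (n_total**3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical")
    return h / correction, len(groups) - 1


h, df = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
p_value = math.exp(-h / 2)  # valid only because df == 2 here
print(round(h, 3), df, round(p_value, 4))  # 7.2 2 0.0273
```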
- evaluate:
  - ModelEvaluator: Evaluate models and generate plots for diagnostics.
  - ShapExplainer: Explain model predictions using SHAP values.
- experiments:
  - VertexExperiment: Manage experiments with Vertex AI.
- optuna:
  - Optuna: Optimize hyperparameters using Optuna.
- metrics:
  - Metrics: Calculate metrics for both classification and regression models.
- modeling:
  - Modeling: Manage modeling, metrics, and logging with Vertex AI.
- Supervised:
  - LazyClassifier: A classifier that automatically trains and evaluates multiple models.
  - LazyRegressor: A regressor that automatically trains and evaluates multiple models.
  - get_card_split: Function to split data into card-like groups.
  - adjusted_rsquared: Calculate adjusted R-squared for regression models.
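Adjusted R-squared penalizes R-squared for the number of predictors. The sketch below shows the standard formula; the function name `adjusted_r_squared` and its parameter names are illustrative and are not taken from the utilsds signature.

```python
def adjusted_r_squared(r_squared: float, n_samples: int, n_features: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1 - (1 - r_squared) * (n_samples - 1) / (n_samples - n_features - 1)


# R^2 = 0.90 from 50 samples and 5 features shrinks to about 0.889.
print(round(adjusted_r_squared(0.90, 50, 5), 3))  # 0.889
```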
- visualization:
  - MetricsPlot: Compare metrics for different parameter values.
  - Radar: Create radar plots for visualizing data.
  - cluster_characteristics: Analyze cluster characteristics.
  - comparison_density: Compare density distributions.
  - elbow_visualisation: Visualize the elbow method for clustering.
  - describe_clusters_metrics: Describe metrics for clusters.
  - category_null_variables: Visualize null variables in categorical data.
  - normal_distr_plots: Visualize normal distribution plots.
  - distplot_limitations: Visualize limitations of distplot.
  - boxplot_limitations: Visualize limitations of boxplot.
  - violinplot_limitations: Visualize limitations of violinplot.
  - countplot_limitations: Visualize limitations of countplot.
  - categorical_variable_perc: Visualize percentage of categorical variables.
  - spearman_correlation: Visualize Spearman correlation.
  - calculate_crammers_v: Calculate Cramér's V.
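Cramér's V measures association between two categorical variables on a scale from 0 to 1. The sketch below computes the plain (uncorrected) statistic from a contingency table of counts. It is an assumption-laden illustration, not the library's `calculate_crammers_v`: the name `cramers_v` is hypothetical, and no bias or continuity correction is applied.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    smaller_dim = min(len(row_totals), len(col_totals))
    if n == 0 or smaller_dim < 2:
        raise ValueError("need a non-empty table with at least 2 rows and 2 columns")
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one observation")

    # Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # V = sqrt(chi2 / (n * (min(rows, cols) - 1)))
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))


print(cramers_v([[10, 0], [0, 10]]))  # 1.0 (perfect association)
print(cramers_v([[5, 5], [5, 5]]))    # 0.0 (no association)
```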
- what_if_streamlit:
  - ShapSaver: Save SHAP explainer components for lazy loading in what-if analysis.
  - ColumnMetadataGenerator: Generate column metadata from a DataFrame or CSV file.
- monitoring:
  - mapping: Create column mapping from configuration file for Evidently.
  - test_data: Test data for issues using Evidently test suites.
  - check_data_drift: Check data for drift using Evidently metrics.
  - send_email_with_table: Send email notifications with HTML tables for monitoring alerts.
Download files
Source Distribution
utilsds-2.0.9.tar.gz (51.5 kB)
Built Distribution
utilsds-2.0.9-py3-none-any.whl (55.0 kB)
File details
Details for the file utilsds-2.0.9.tar.gz.
File metadata
- Download URL: utilsds-2.0.9.tar.gz
- Upload date:
- Size: 51.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d95d1202e1266fab8c1444e333c2b8fa0b5f9c36eef6ec49e2ffd11bc09b3234 |
| MD5 | 82832e03becdd2a49ce2206a6d6aa4f8 |
| BLAKE2b-256 | 38c1dacd4e1b1086310c1ae34f0f83af1a2e49269f13f14377c172f458a21c0d |
File details
Details for the file utilsds-2.0.9-py3-none-any.whl.
File metadata
- Download URL: utilsds-2.0.9-py3-none-any.whl
- Upload date:
- Size: 55.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ae0acd097d4a5777808d229d22d5a025c6a6baaeb21c099d0c37c2541e2d11e |
| MD5 | fc52bc049834347cb08a29d9965478b9 |
| BLAKE2b-256 | b996bcd51163d4301a53179b738a1a881d4de8079c6744dec632e879cd242fc0 |