Solution for DS Team
utilsds
utilsds is a library of classes and functions for data science projects. It provides the following modules:
- algorithm:
  - Algorithm: Base class for fitting, training, and getting hyperparameters of machine learning models.
- data_ops:
  - DataOperations: Handle data operations locally and with Google Cloud services (BigQuery and Cloud Storage).
    - BigQuery operations:
      - load_bq_data: Load data from tables, views, and SQL files.
      - save_bq_view, save_bq_table: Save views and tables.
      - load_bq_procedure: Execute stored procedures.
      - load_bq_details: Get table/view details and schema.
      - delete_bq_data: Delete data with safety confirmations.
      - dry_run: Perform dry runs to estimate query costs.
    - Cloud Storage operations:
      - save_gcs_bucket: Create buckets.
      - save_gcs_file, load_gcs_file: Save and load files (.pkl, .json, .csv, .html, .sql).
    - Local file operations:
      - save_local_file, load_local_file: Save and load files (.pkl, .json, .csv, .html, .sql).
- data_processing:
  - SkewnessTransformer: Transform skewed data using various methods (IHS, neglog, Yeo-Johnson, quantile).
  - NullReplacer: Replace null values in specified columns with configurable strategies.
  - ColumnDropper: Drop specified columns from a DataFrame.
  - OutliersCleaner: Clean outliers by clipping values outside specified percentile ranges.
  - CategoricalMapper: Map values in categorical columns according to a specified mapping scheme.
  - NumericalMapper: Convert numerical columns to categorical by binning.
  - Encoder: One-hot encode categorical columns in the data.
  - Normalizer: Normalize numerical columns using a provided scaler.
- data_split:
  - train_test_validation_split: Split data into training, testing, and validation sets.
  - resample_X_y: Resample the training data and the target column.
- ds_statistics:
  - test_kruskal_wallis: Perform the Kruskal-Wallis statistical test.
  - test_agosto_pearsona: Test for normality using the D'Agostino-Pearson test.
- evaluate:
  - ModelEvaluator: Evaluate models and generate plots for diagnostics.
  - ShapExplainer: Explain model predictions using SHAP values.
- experiments:
  - VertexExperiment: Manage experiments with Vertex AI.
- hyperopt:
  - Hyperopt: Optimize hyperparameters using Hyperopt.
- metrics:
  - Metrics: Calculate metrics for both classification and regression models.
- modeling:
  - Modeling: Manage modeling, metrics, and logging with Vertex AI.
- Supervised:
  - LazyClassifier: A classifier that automatically trains and evaluates multiple models.
  - LazyRegressor: A regressor that automatically trains and evaluates multiple models.
  - get_card_split: Function to split data into card-like groups.
  - adjusted_rsquared: Calculate adjusted R-squared for regression models.
- visualization:
  - MetricsPlot: Compare metrics for different parameter values.
  - Radar: Create radar plots for visualizing data.
  - cluster_characteristics: Analyze cluster characteristics.
  - comparison_density: Compare density distributions.
  - elbow_visualisation: Visualize the elbow method for clustering.
  - describe_clusters_metrics: Describe metrics for clusters.
  - category_null_variables: Visualize null variables in categorical data.
  - normal_distr_plots: Visualize normal distribution plots.
  - distplot_limitations: Visualize limitations of distplot.
  - boxplot_limitations: Visualize limitations of boxplot.
  - violinplot_limitations: Visualize limitations of violinplot.
  - countplot_limitations: Visualize limitations of countplot.
  - categorical_variable_perc: Visualize percentage of categorical variables.
  - spearman_correlation: Visualize Spearman correlation.
  - calculate_crammers_v: Calculate Cramér's V.
- what_if_streamlit:
  - ShapSaver: Save SHAP explainer components for lazy loading in what-if analysis.
  - ColumnMetadataGenerator: Generate column metadata from a DataFrame or CSV file.
- monitoring:
  - mapping: Create column mapping from configuration file for Evidently.
  - test_data: Test data for issues using Evidently test suites.
  - check_data_drift: Check data for drift using Evidently metrics.
  - send_email_with_table: Send email notifications with HTML tables for monitoring alerts.
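
The data_processing entries above mention the IHS and neglog transforms and percentile-based clipping without spelling them out. The following is a minimal standard-library sketch of those textbook definitions. It is not the utilsds implementation: the function names `ihs`, `neglog`, `percentile`, and `clip_to_percentiles` are hypothetical, and the linear-interpolation percentile is an assumption about how the cut-offs might be computed.

```python
import math


def ihs(x: float) -> float:
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); defined for all reals."""
    return math.asinh(x)


def neglog(x: float) -> float:
    """Signed log transform, sign(x) * log(1 + |x|)."""
    return math.copysign(math.log1p(abs(x)), x)


def percentile(sorted_values: list, q: float) -> float:
    """Linearly interpolated percentile (q in [0, 100]) of pre-sorted data."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def clip_to_percentiles(values, lower_q: float = 1.0, upper_q: float = 99.0) -> list:
    """Clip every value into the [lower_q, upper_q] percentile range."""
    ordered = sorted(values)
    low = percentile(ordered, lower_q)
    high = percentile(ordered, upper_q)
    return [min(max(v, low), high) for v in values]
```

Both transforms keep the sign of the input and compress large magnitudes, which is why they suit skewed columns that contain zeros or negative values, where a plain logarithm is undefined.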
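
To illustrate the idea behind the data_split entry above, here is a small sketch of a three-way split using only the standard library. The function name `three_way_split`, its parameters, and the order of the returned sets are assumptions made for illustration; the signature of `train_test_validation_split` in utilsds is not documented here and may differ.

```python
import random


def three_way_split(rows, test_size=0.2, validation_size=0.2, seed=42):
    """Shuffle rows and cut them into train, test, and validation lists."""
    if not 0 < test_size + validation_size < 1:
        raise ValueError("test_size + validation_size must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    n_val = round(len(shuffled) * validation_size)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, test, validation
```

The three slices are disjoint and together cover every input row, so no observation leaks between the sets.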
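
For readers unfamiliar with the Kruskal-Wallis test listed under ds_statistics, the sketch below computes its H statistic from pooled ranks, with the usual correction for ties. It is a standard-library illustration only; `kruskal_wallis_h` is a hypothetical name, and this is not how `test_kruskal_wallis` is necessarily implemented. In practice `scipy.stats.kruskal` returns both the statistic and a p-value.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for 2+ samples."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    counts = Counter(v for g in groups for v in g)
    n = sum(counts.values())

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    position = 0
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = position + (c + 1) / 2
        position += c

    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    return h / correction
```

Under the null hypothesis that all groups come from the same distribution, H is approximately chi-squared distributed with (number of groups - 1) degrees of freedom.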
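
The adjusted R-squared mentioned under Supervised follows the standard formula 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of samples and p the number of predictors. A minimal sketch of that formula is shown below; the name `adjusted_r2` and its argument names are hypothetical and may not match the `adjusted_rsquared` signature in utilsds.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("n_samples must exceed n_features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof
```

For example, an R² of 0.9 from 100 samples and 5 predictors gives an adjusted value of roughly 0.895.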
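
Cramér's V, listed under visualization, measures association between two categorical variables on a scale from 0 to 1. The sketch below computes the uncorrected statistic, sqrt(chi² / (n · min(r - 1, c - 1))), from a contingency table given as a list of rows. It assumes no bias correction and uses the hypothetical name `cramers_v`; the behaviour of `calculate_crammers_v` in utilsds may differ.

```python
import math


def cramers_v(table) -> float:
    """Uncorrected Cramér's V for a contingency table (list of count rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k < 1:
        raise ValueError("table needs at least 2 rows, 2 columns, and data")

    # Pearson chi-squared: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))
```

A table where each row category maps to exactly one column category yields 1, and a table with identical row proportions yields 0.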
Download files
Source Distribution
utilsds-1.1.14.tar.gz (43.4 kB)
Built Distribution
utilsds-1.1.14-py3-none-any.whl (50.8 kB)
File details
Details for the file utilsds-1.1.14.tar.gz.
File metadata
- Download URL: utilsds-1.1.14.tar.gz
- Upload date:
- Size: 43.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c6bca04d53c9ac1b187022281182faad63f89eece5c367f2ecc4e7b4a7b91991 |
| MD5 | 1951f72166e954cb6063258405b1b0f7 |
| BLAKE2b-256 | 10f9be2ba849813acb709c1accd6af7a73c9d7508c1a7d36fff874fb915b55b0 |
File details
Details for the file utilsds-1.1.14-py3-none-any.whl.
File metadata
- Download URL: utilsds-1.1.14-py3-none-any.whl
- Upload date:
- Size: 50.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4f83e5df6aeb0bdfba390ddb3a46f9b576a647d7f7e823f4bcd80b6ac057a259 |
| MD5 | 912bb4dcadeeecd02fda13272ab69ee6 |
| BLAKE2b-256 | 8e26736a31f87e1380bd41b878e4d188d47661baf182df1b0172e762078afc4d |