Solution for DS Team
utilsds
Utilsds is a library of classes and functions for data science projects. It provides the following modules:
algorithm:
- Algorithm: Base class for fitting, training, and getting hyperparameters of machine learning models.
data_ops:
- DataOperations: Handle data operations locally and with Google Cloud services (BigQuery and Cloud Storage).
- BigQuery operations:
  - load_bq_data: Load data from tables, views, and SQL files.
  - save_bq_view, save_bq_table: Save views and tables.
  - load_bq_procedure: Execute stored procedures.
  - load_bq_details: Get table/view details and schema.
  - delete_bq_data: Delete data with safety confirmations.
  - dry_run: Perform dry runs to estimate query costs.
- Cloud Storage operations:
  - save_gcs_bucket: Create buckets.
  - save_gcs_file, load_gcs_file: Save and load files (.pkl, .json, .csv, .html, .sql).
- Local file operations:
  - save_local_file, load_local_file: Save and load files (.pkl, .json, .csv, .html, .sql).
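The local file operations above handle several formats through a single save/load pair. One common way to do that is to dispatch on the file extension; the following is a minimal standard-library sketch of that idea. The names `save_file` and `load_file`, the argument order, and the choice of a list of dicts for CSV data are assumptions made for illustration and are not the signatures of `save_local_file` or `load_local_file`.

```python
import csv
import json
import pickle
from pathlib import Path

TEXT_SUFFIXES = {".html", ".sql"}


def save_file(obj, path):
    """Write obj to path, picking the serializer from the file extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".pkl":
        with path.open("wb") as fh:
            pickle.dump(obj, fh)
    elif suffix == ".json":
        with path.open("w", encoding="utf-8") as fh:
            json.dump(obj, fh)
    elif suffix == ".csv":
        # Assumption: CSV data is a non-empty list of dicts with identical keys.
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(obj[0].keys()))
            writer.writeheader()
            writer.writerows(obj)
    elif suffix in TEXT_SUFFIXES:
        path.write_text(obj, encoding="utf-8")
    else:
        raise ValueError(f"Unsupported extension: {suffix}")


def load_file(path):
    """Read path back, picking the parser from the file extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".pkl":
        with path.open("rb") as fh:
            return pickle.load(fh)
    if suffix == ".json":
        with path.open("r", encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".csv":
        # Values come back as strings because CSV carries no type information.
        with path.open("r", newline="", encoding="utf-8") as fh:
            return list(csv.DictReader(fh))
    if suffix in TEXT_SUFFIXES:
        return path.read_text(encoding="utf-8")
    raise ValueError(f"Unsupported extension: {suffix}")
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.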
data_processing:
- SkewnessTransformer: Transform skewed data using various methods (IHS, neglog, Yeo-Johnson, quantile).
- NullReplacer: Replace null values in specified columns with configurable strategies.
- ColumnDropper: Drop specified columns from a DataFrame.
- OutliersCleaner: Clean outliers by clipping values outside specified percentile ranges.
- CategoricalMapper: Map values in categorical columns according to a specified mapping scheme.
- NumericalMapper: Convert numerical columns to categorical by binning.
- Encoder: One-hot encode categorical columns in the data.
- Normalizer: Normalize numerical columns using a provided scaler.
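To make two of the skewness methods and the percentile clipping concrete, here is a small standard-library sketch of the underlying formulas. It illustrates the math only: the function names, the default percentiles, and the linear interpolation rule are assumptions, and the actual `SkewnessTransformer` and `OutliersCleaner` classes operate on DataFrames and may differ in detail.

```python
import math


def ihs(x):
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); defined for all reals."""
    return math.asinh(x)


def neglog(x):
    """Neglog transform, sign(x) * log(1 + |x|); keeps the sign of x."""
    return math.copysign(math.log1p(abs(x)), x)


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def clip_outliers(values, lower_q=1, upper_q=99):
    """Clip every value into the [lower_q, upper_q] percentile range."""
    ordered = sorted(values)
    low = percentile(ordered, lower_q)
    high = percentile(ordered, upper_q)
    return [min(max(v, low), high) for v in values]
```

Both transforms compress large magnitudes while remaining defined at zero and for negative inputs, which is why they are often preferred over a plain logarithm for skewed columns that contain non-positive values.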
data_split:
- train_test_validation_split: Split data into training, testing, and validation sets.
- resample_X_y: Resample the training data and target column.
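The idea of a three-way split can be sketched with the standard library alone. The function name `three_way_split`, its parameters, and the default fractions below are hypothetical and chosen for illustration; they are not the signature of `train_test_validation_split`.

```python
import random


def three_way_split(rows, test_size=0.2, validation_size=0.2, seed=42):
    """Shuffle rows once, then cut them into train, test, and validation parts."""
    if not 0 < test_size + validation_size < 1:
        raise ValueError("test_size + validation_size must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    n_validation = round(len(shuffled) * validation_size)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_validation]
    train = shuffled[n_test + n_validation:]
    return train, test, validation
```

Shuffling once and slicing guarantees that the three parts are disjoint and together cover every row.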
ds_statistics:
- test_kruskal_wallis: Perform the Kruskal-Wallis statistical test.
- test_agosto_pearsona: Test for normality using D'Agostino-Pearson test.
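For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic from scratch: pool all observations, rank them, and compare the rank sums of the groups. It is a simplified illustration that omits the tie correction and the p-value, and the function names are hypothetical. In practice, SciPy provides `scipy.stats.kruskal` and `scipy.stats.normaltest` (the D'Agostino-Pearson test); whether utilsds wraps them is not stated here.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)
```

A larger H means the groups' rank distributions differ more; under the null hypothesis H approximately follows a chi-squared distribution with (number of groups - 1) degrees of freedom.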
evaluate:
- ModelEvaluator: Evaluate models and generate plots for diagnostics.
- ShapExplainer: Explain model predictions using SHAP values.
experiments:
- VertexExperiment: Manage experiments with Vertex AI.
hyperopt:
- Hyperopt: Optimize hyperparameters using Hyperopt.
metrics:
- Metrics: Calculate metrics for both classification and regression models.
modeling:
- Modeling: Manage modeling, metrics, and logging with Vertex AI.
Supervised:
- LazyClassifier: A classifier that automatically trains and evaluates multiple models.
- LazyRegressor: A regressor that automatically trains and evaluates multiple models.
- get_card_split: Function to split data into card-like groups.
- adjusted_rsquared: Calculate adjusted R-squared for regression models.
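Adjusted R-squared penalizes R-squared for the number of predictors, using the standard formula 1 - (1 - R²)(n - 1) / (n - p - 1). The sketch below implements that formula directly; the function name and parameter names are illustrative and are not the signature of `adjusted_rsquared`.

```python
def adjusted_r_squared(r_squared, n_samples, n_features):
    """Adjust R-squared for model size: 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_samples - n_features - 1
    if degrees_of_freedom <= 0:
        raise ValueError("n_samples must exceed n_features + 1")
    return 1 - (1 - r_squared) * (n_samples - 1) / degrees_of_freedom
```

For example, an R² of 0.9 with 101 samples and 10 features gives an adjusted value of about 0.889, slightly below the raw score.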
visualization:
- MetricsPlot: Compare metrics for different parameter values.
- Radar: Create radar plots for visualizing data.
- cluster_characteristics: Analyze cluster characteristics.
- comparison_density: Compare density distributions.
- elbow_visualisation: Visualize the elbow method for clustering.
- describe_clusters_metrics: Describe metrics for clusters.
- category_null_variables: Visualize null variables in categorical data.
- normal_distr_plots: Visualize normal distribution plots.
- distplot_limitations: Visualize limitations of distplot.
- boxplot_limitations: Visualize limitations of boxplot.
- violinplot_limitations: Visualize limitations of violinplot.
- countplot_limitations: Visualize limitations of countplot.
- categorical_variable_perc: Visualize percentage of categorical variables.
- spearman_correlation: Visualize Spearman correlation.
- calculate_crammers_v: Calculate Cramér's V.
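Cramér's V, the statistic that `calculate_crammers_v` is named after, measures association between two categorical variables on a scale from 0 to 1. The sketch below computes the uncorrected version from a contingency table of counts. The function name and input format are assumptions for illustration, and the library's own function may take DataFrame columns or apply a bias correction.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as rows of counts.

    Assumes at least two rows and two columns, with no all-zero row or column.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))
```

A value of 0 indicates no association and 1 indicates that one variable fully determines the other.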
what_if_streamlit:
- ShapSaver: Save SHAP explainer components for lazy loading in what-if analysis.
- ColumnMetadataGenerator: Generate column metadata from a DataFrame or CSV file.
monitoring:
- mapping: Create column mapping from configuration file for Evidently.
- test_data: Test data for issues using Evidently test suites.
- check_data_drift: Check data for drift using Evidently metrics.
- send_email_with_table: Send email notifications with HTML tables for monitoring alerts.
Download files
Source Distribution
utilsds-1.1.11.tar.gz (43.2 kB)
Built Distribution
utilsds-1.1.11-py3-none-any.whl (50.7 kB)
File details
Details for the file utilsds-1.1.11.tar.gz.
File metadata
- Download URL: utilsds-1.1.11.tar.gz
- Upload date:
- Size: 43.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 259d99b758d041f8c4bfaf2b53d0969122ee15b3e003c8e22e8597e0ab280ae1 |
| MD5 | 9fa1007158d1af96920a8db5b67cce00 |
| BLAKE2b-256 | abdb85cc26bf8d64ff224273d3725cd652939420492cc70a8aa29c705da314d8 |
File details
Details for the file utilsds-1.1.11-py3-none-any.whl.
File metadata
- Download URL: utilsds-1.1.11-py3-none-any.whl
- Upload date:
- Size: 50.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 012a892528421a82b2139f01d47155508994dac33f4ae89d0f91f81649e64a2c |
| MD5 | af28b52ed902b6d54b130bd1650d97e0 |
| BLAKE2b-256 | cd111f47310603873c6f973c5c17ceae8ddaa2bbe152599f554f006f09f4864f |