Tools for ML projects and data management
ML Analytics Tools
Utilities for common analytics and machine learning workflows: Redshift, S3, Google Sheets, Slack, MLflow, model evaluation, and SQL pipelines.
The package is intentionally infrastructure-neutral. Buckets, credentials, MLflow hosts, and tokens are provided by your environment or by explicit arguments.
What Is Included
- DataConnector: run Redshift SQL, load SQL files, unload/load data through S3, and create Redshift tables from DataFrames.
- S3Connector: read, write, list, delete, and query S3 data with DuckDB.
- GSheet: read, write, share, and export Google Sheets data.
- SlackConnector: send messages, upload files, and manage simple Slack interactions.
- ModelManager: create MLflow experiments, log models, register versions, manage aliases, and handle permissions.
- model_tools: classification, regression, survival analysis, CatBoost helpers, plotting, and reporting utilities.
- utils: project-root discovery, SQL file loading, logging, credentials, and YAML SQL pipelines.
Install
From PyPI, after a release is available:
uv add ml-analytics-tools
Directly from GitHub:
uv add git+https://github.com/sdaza/ml-analytics-tools
For local development:
uv sync --all-groups
Configuration
The package loads a .env file from the project root when it is imported.
Only configure the services you use.
# Redshift
BI_REDSHIFT_HOST=redshift-cluster.example.com
BI_REDSHIFT_DB=analytics
BI_REDSHIFT_USER=analytics_user
BI_REDSHIFT_PASSWORD=secret
BI_REDSHIFT_PORT=5439
# S3
ML_ANALYTICS_S3_BUCKET=my-analytics-bucket
# MLflow
MLFLOW_TRACKING_URI=https://mlflow.example.com
MLFLOW_TRACKING_USERNAME=user@example.com
MLFLOW_TRACKING_PASSWORD=secret
# Google Sheets
GSHEET_SPREADSHEET_ID=optional-default-sheet-id
GOOGLE_CREDENTIALS='{"type":"service_account", ...}'
# Slack
SLACK_BOT_TOKEN=xoxb-your-token
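To make the import-time behavior more concrete, here is a minimal sketch of what loading a .env file from the project root could involve. It uses only the standard library and is not the package's actual implementation: the function names (find_project_root, parse_env_text, load_env_file), the marker files, and the rule that existing environment variables win are all assumptions made for illustration.

```python
import os
from pathlib import Path


def find_project_root(
    start: Path, markers: tuple[str, ...] = ("pyproject.toml", ".git", ".env")
) -> Path:
    """Walk upward from `start` until a directory containing a marker is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    return start  # no marker found: fall back to the starting directory


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and `#` comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, as in GOOGLE_CREDENTIALS='...'
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env_file(path: Path, override: bool = False) -> dict[str, str]:
    """Copy values from a .env file into os.environ.

    Assumption: variables already set in the environment win unless override=True.
    """
    if not path.is_file():
        return {}
    values = parse_env_text(path.read_text(encoding="utf-8"))
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Under these assumptions, calling load_env_file(find_project_root(Path.cwd()) / ".env") would populate only the variables that are present in the file, which fits the advice to configure only the services you use.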
S3 buckets are never hard-coded. Pass bucket=... or s3_bucket=..., or set
ML_ANALYTICS_S3_BUCKET.
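The precedence described above (explicit argument first, then ML_ANALYTICS_S3_BUCKET, and no built-in default) can be sketched as follows. The helper name resolve_bucket and the choice of ValueError are hypothetical and only illustrate the rule; the package's own resolution code may differ.

```python
import os
from typing import Optional


def resolve_bucket(
    bucket: Optional[str] = None, env_var: str = "ML_ANALYTICS_S3_BUCKET"
) -> str:
    """Return the explicit bucket if given, else the environment value, else fail."""
    if bucket:
        return bucket  # an explicit argument always wins
    from_env = os.environ.get(env_var, "").strip()
    if from_env:
        return from_env
    # No hard-coded fallback: a missing bucket is reported instead of guessed.
    raise ValueError(f"No S3 bucket configured: pass bucket=... or set {env_var}.")
```

Failing loudly when nothing is configured keeps the code free of environment-specific defaults, at the cost of requiring one explicit setting per deployment.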
AWS Authentication
Use the CLI helper for AWS SSO:
ml-analytics-auth
You can also call it from Python:
from ml_analytics import ensure_aws_authenticated
ensure_aws_authenticated()
See AWS Authentication and CLI Commands for details.
Quick Examples
Query Redshift
from ml_analytics import DataConnector
dc = DataConnector()
df = dc.sql("SELECT * FROM analytics.customer_features LIMIT 100")
df_polars = dc.sql("queries/features.sql", format="polars", country="es")
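The second call above passes a file path plus a keyword argument, which suggests that the SQL file is treated as a template. The source does not state the placeholder syntax, so the sketch below assumes `$name` placeholders handled by the standard library's string.Template. The helper names looks_like_sql_file and render_sql are hypothetical and are not part of ml_analytics.

```python
from pathlib import Path
from string import Template


def looks_like_sql_file(query: str) -> bool:
    """Treat a single-line string ending in .sql as a file path, not SQL text."""
    return "\n" not in query and query.strip().lower().endswith(".sql")


def render_sql(query: str, base_dir: Path = Path("."), **params: object) -> str:
    """Load SQL from a file when `query` is a path, then fill $name placeholders."""
    if looks_like_sql_file(query):
        text = (base_dir / query).read_text(encoding="utf-8")
    else:
        text = query
    # Template.substitute raises KeyError when a placeholder has no value,
    # so a missing parameter is caught before the query reaches the database.
    return Template(text).substitute(
        {name: str(value) for name, value in params.items()}
    )
```

Plain text substitution like this is only appropriate for trusted values such as a fixed country code. Values that come from users should go through the database driver's parameter binding instead.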
Create A Redshift Table From A DataFrame
dc.create_table_from_dataframe(
    df,
    table="model_scores",
    schema="analytics",
    drop_existing_table=True,
)
Work With S3
from ml_analytics import S3Connector
s3 = S3Connector(bucket="my-analytics-bucket", s3_root="projects/churn")
s3.save_dataframe(df, directory="outputs", file_name="scores")
summary = s3.query(
    """
    SELECT segment, count(*) AS rows
    FROM read_parquet('s3://my-analytics-bucket/projects/churn/outputs/*.parquet')
    GROUP BY segment
    """
)
Read And Write Google Sheets
from ml_analytics import GSheet
gsheet = GSheet(credentials_path="gsheet_credentials.json")
df = gsheet.read_sheet(spreadsheet_id="...", sheet_name="Input")
gsheet.write_sheet(df, spreadsheet_id="...", sheet_name="Results")
Log To MLflow
from ml_analytics import ModelManager
manager = ModelManager(model_name="churn-model", user="user@example.com")
manager.start_run("training")
manager.log_metric("auc", 0.91)
manager.end_run()
Send A Slack Message
from ml_analytics import SlackConnector
slack = SlackConnector()
slack.send_message(channel="#ml-alerts", text="Training finished")
Detailed Guides
| Guide | Use It For |
|---|---|
| AWS Authentication | AWS SSO setup and Python helpers |
| CLI Commands | Available console commands |
| Google Sheets | Sheets setup, sharing, exports, and examples |
| Slack | Slack token setup and message/file examples |
| Tunnel Manager | SSH tunnel configuration and CLI usage |
Development
Run the standard checks before opening a PR:
uv run ruff check
uv run pytest
CI runs Ruff and pytest on Python 3.11 and 3.12.
Releases
This repository uses Release Please. Conventional commits on main create or
update a release PR with the next version and changelog. When that PR is merged,
the release workflow builds the package and publishes it to PyPI through Trusted
Publishing using the pypi GitHub environment.
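As a rough illustration of how Conventional Commit messages translate into the next version, the sketch below applies the usual mapping: fix gives a patch bump, feat gives a minor bump, and a `!` marker or a BREAKING CHANGE footer gives a major bump. It is a simplification written for this README, not Release Please's implementation, and it ignores Release Please configuration options. The function names bump_for and next_version are hypothetical.

```python
import re

# Header shape: type, optional (scope), optional "!", then ": description"
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<breaking>!)?: .+")
BREAKING_FOOTERS = ("BREAKING CHANGE:", "BREAKING-CHANGE:")


def bump_for(commit_message: str) -> str:
    """Return 'major', 'minor', 'patch', or 'none' for one commit message."""
    lines = commit_message.strip().splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        return "none"  # not a Conventional Commit header
    has_footer = any(line.startswith(BREAKING_FOOTERS) for line in lines[1:])
    if match["breaking"] or has_footer:
        return "major"
    return {"feat": "minor", "fix": "patch"}.get(match["type"], "none")


def next_version(current: str, commit_messages: list[str]) -> str:
    """Apply the largest bump found in the commits to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in current.split("."))
    bumps = {bump_for(message) for message in commit_messages}
    if "major" in bumps:
        return f"{major + 1}.0.0"
    if "minor" in bumps:
        return f"{major}.{minor + 1}.0"
    if "patch" in bumps:
        return f"{major}.{minor}.{patch + 1}"
    return current  # only non-releasing commits such as docs or chore
```

In practice this means a commit titled `fix: ...` or `feat: ...` is what moves the release PR forward, while `docs:` or `chore:` commits alone do not change the version in this simplified model.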
Contributing
Keep changes small, covered by tests when behavior changes, and free of environment-specific defaults. Prefer explicit configuration over hidden infrastructure assumptions.