Reusable analytics and ML utilities for transport risk detection and streaming dashboards.
srx-lib-ml
Reusable, production-grade pandas and scikit-learn utilities extracted from multiple SRX analytics apps. The library is domain-agnostic: pass your column names, not ours. Focus areas:
- fast data loading and feature engineering for trip/order/event data
- anomaly detection and clustering using robust defaults
- risk profiling across vehicles/assets, routes, and temporal dimensions
- optional Streamlit-friendly helpers for dashboards
Quick start
pip install -e ./srx-lib-ml
import pandas as pd
from srx_lib_ml import features, anomaly, geo, risk, routes
# 1) Normalize your columns to a canonical schema
mapping = {
    "Start Time": "start_time",
    "Stop Time": "end_time",
    "Distance (Km)": "distance_km",
    "Avg Speed": "avg_speed_kmh",
    "StartLat": "start_lat",
    "StartLon": "start_lon",
    "StopLat": "stop_lat",
    "StopLon": "stop_lon",
}
df = features.standardize_columns(pd.read_csv("journeys.csv"), mapping)
# 2) Feature engineering + anomaly scoring
df = features.enrich_journey_frame(df)
df = anomaly.detect_isolation_forest(df)
# 3) Geo clustering (DBSCAN in degrees)
df = geo.assign_dbscan_clusters(df, lat_col="start_lat", lon_col="start_lon", label_col="start_cluster")
# 4) Route/vehicle rollups with your chosen ids
vehicle_profile = risk.vehicle_risk_profile(df, vehicle_id_col="VID", vehicle_name_col="VName", distance_col="distance_km")
route_profile = risk.route_risk_analysis(df, start_label_col="Start Location", stop_label_col="Stop Location", vehicle_id_col="VID", distance_col="distance_km")
# Procurement-style routing (haversine) with fully custom columns
orders = routes.parse_latlon_column(pd.read_csv("orders.csv"), source_col="Pickup Location", lat_col="pickup_lat", lon_col="pickup_lon")
orders["pickup_zone"] = routes.dbscan_haversine(orders.dropna(subset=["pickup_lat", "pickup_lon"]), lat_col="pickup_lat", lon_col="pickup_lon")
# Assumes a "dropoff_zone" column was built the same way from the drop-off coordinates
orders = routes.add_route_pairs(orders, origin_col="pickup_zone", dest_col="dropoff_zone", route_col="route_id")
perf = routes.actor_route_performance(
    orders,
    route_col="route_id",
    actor_col="Partner",
    id_col="External Id",
    success_flag_col="Pickup Actual Time",
    distance_col="Distance (KM)",
)
The modules stay parameterized so they can be reused across transport, procurement, logistics, or other journey/order/event datasets: pass your own column names instead of recoding.
Module guide
features
- standardize_columns(df, mapping): rename columns into a canonical schema.
- ensure_columns(df, required): add missing columns as NaN.
- enrich_journey_frame(df, ...): derive durations, speed deviation, rule-based flags (long stop, slow, zero distance, high deviation).
- time_category(hour): shared time bucketer.
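To make the first two helpers concrete, here is a minimal sketch of the same idea in plain pandas. The function names `rename_to_canonical` and `add_missing_columns` are hypothetical stand-ins, not the library's implementation; the real helpers may differ in behavior and edge-case handling.

```python
import numpy as np
import pandas as pd


def rename_to_canonical(df, mapping):
    """Rename source columns to canonical names; unmapped columns pass through."""
    return df.rename(columns=mapping)


def add_missing_columns(df, required):
    """Return a copy of df where every name in `required` exists (NaN-filled if new)."""
    out = df.copy()
    for col in required:
        if col not in out.columns:
            out[col] = np.nan
    return out


raw = pd.DataFrame({"Start Time": ["2024-01-01 08:00"], "Distance (Km)": [12.5]})
mapping = {"Start Time": "start_time", "Distance (Km)": "distance_km"}
df = add_missing_columns(
    rename_to_canonical(raw, mapping),
    ["start_time", "distance_km", "avg_speed_kmh"],
)
```

Downstream code can then rely on the canonical names being present, even when a given export lacks some of them.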
anomaly
- detect_isolation_forest(df, config=None, use_enhanced_features=False, feature_override=None): add anomaly scores/flags.
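For readers unfamiliar with the technique, the following sketch shows isolation-forest scoring directly with scikit-learn. It is an illustration under assumptions, not the library's code: the function `score_anomalies`, the output column names, the NaN fill, and the `contamination` default are all chosen here for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest


def score_anomalies(df, feature_cols, contamination=0.05, random_state=42):
    """Return a copy of df with anomaly_score and is_anomaly columns added."""
    out = df.copy()
    X = out[feature_cols].fillna(0.0).to_numpy()  # assumption: NaN treated as 0
    model = IsolationForest(contamination=contamination, random_state=random_state)
    model.fit(X)
    # score_samples is lower for abnormal rows, so negate it: higher = more anomalous
    out["anomaly_score"] = -model.score_samples(X)
    out["is_anomaly"] = model.predict(X) == -1
    return out


rng = np.random.default_rng(0)
trips = pd.DataFrame({
    "distance_km": rng.normal(20, 2, 100),
    "avg_speed_kmh": rng.normal(50, 5, 100),
})
trips.loc[len(trips)] = [400.0, 5.0]  # one implausible trip appended as row 100
scored = score_anomalies(trips, ["distance_km", "avg_speed_kmh"])
```

The appended trip is far from the rest of the data in both features, so it is isolated in very few splits and receives the highest score.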
geo
- haversine_distance(lat1, lon1, lat2, lon2): km distance.
- assign_dbscan_clusters(df, lat_col, lon_col, label_col="cluster", config=None).
- assign_kmeans_zones(df, lat_col, lon_col, label_col="zone", config=None).
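As a reference for what a haversine distance in kilometres computes, here is a standard-library sketch. The name `haversine_km` and the Earth radius constant are assumptions for this example; the library's `haversine_distance` may use a different radius or accept array inputs.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
```

With this radius, one degree of longitude at the equator comes out at roughly 111.2 km, which is a convenient sanity check when converting between degree-based and kilometre-based thresholds.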
location_zones
- apply_location_risk_zones(df, location_sheets, lat_col="latitude_deg", lon_col="longitude_deg", radius_km=0.5, risk_col="risk_score", start_lat_col="start_lat", start_lon_col="start_lon", stop_lat_col="stop_lat", stop_lon_col="stop_lon", zone_score_mapping=None).
risk
- vehicle_risk_profile(df, vehicle_id_col="vehicle_id", vehicle_name_col="vehicle_name", risk_col="risk_score", anomaly_flag_col="is_anomaly", anomaly_score_col="anomaly_score_normalized", distance_col="distance_km", ...).
- route_risk_analysis(df, start_label_col="start_label", stop_label_col="stop_label", journey_id_col="journey_id", vehicle_id_col="vehicle_id", risk_col="risk_score", anomaly_flag_col="is_anomaly", distance_col=None, min_journeys=3).
temporal
- temporal_breakdown(df, risk_col="risk_score", anomaly_flag_col="is_anomaly", journey_id_col="journey_id"): hourly/daily/time-category aggregates.
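The hourly part of such a breakdown can be sketched with a pandas groupby. This is an illustration only: `hourly_breakdown`, the `time_col` parameter, and the output column names are hypothetical, and the library's `temporal_breakdown` may return a different structure.

```python
import pandas as pd


def hourly_breakdown(df, time_col="start_time", risk_col="risk_score",
                     anomaly_flag_col="is_anomaly", journey_id_col="journey_id"):
    """Aggregate journey count, mean risk, and anomaly rate per hour of day."""
    hours = pd.to_datetime(df[time_col]).dt.hour.rename("hour")
    return (
        df.groupby(hours)
        .agg(
            journeys=(journey_id_col, "nunique"),
            avg_risk=(risk_col, "mean"),
            anomaly_rate=(anomaly_flag_col, "mean"),  # mean of booleans = share flagged
        )
        .reset_index()
    )


journeys = pd.DataFrame({
    "journey_id": [1, 2, 3, 4],
    "start_time": ["2024-01-01 08:10", "2024-01-01 08:50",
                   "2024-01-01 23:05", "2024-01-02 23:30"],
    "risk_score": [10.0, 30.0, 80.0, 60.0],
    "is_anomaly": [False, False, True, False],
})
result = hourly_breakdown(journeys)
```

The same pattern extends to daily or time-category aggregates by swapping the grouping key.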
routes
- parse_latlon_column(df, source_col, lat_col, lon_col).
- dbscan_haversine(df, lat_col, lon_col, eps_km=5.0, min_samples=5).
- add_route_pairs(df, origin_col, dest_col, route_col="route_id").
- actor_route_performance(df, route_col, actor_col, id_col, success_flag_col, distance_col, ontime_flag_col=None, min_orders=5).
- route_complexity_breakdown(df, distance_col, origin_zone_col, dest_zone_col, id_col, ontime_flag_col=None).
- reallocation_recommendations(perf_df, route_col="route_id", actor_col="actor", success_col="success_rate_pct", total_orders_col="total_orders", min_orders=10, min_gap=15.0).
- describe_clusters(labels): basic cluster stats.
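Clustering with a kilometre radius, as the `eps_km` parameter suggests, is commonly done by running scikit-learn's DBSCAN with the haversine metric on coordinates in radians and dividing the radius by the Earth's radius. The sketch below shows that approach; `cluster_points_km` and the radius constant are assumptions for illustration, and it is not confirmed that `dbscan_haversine` works exactly this way.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def cluster_points_km(lats, lons, eps_km=5.0, min_samples=5):
    """Label points with DBSCAN using a neighbourhood radius given in kilometres."""
    # The haversine metric expects [lat, lon] pairs in radians
    coords = np.radians(np.column_stack([lats, lons]))
    model = DBSCAN(
        eps=eps_km / EARTH_RADIUS_KM,  # km converted to an angular distance
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    )
    return model.fit_predict(coords)


# Two tight groups of five points each, plus one isolated point
lats = [1.000, 1.001, 1.002, 1.003, 1.004, 2.000, 2.001, 2.002, 2.003, 2.004, 10.0]
lons = [103.0] * 5 + [104.0] * 5 + [110.0]
labels = cluster_points_km(lats, lons)
```

Points that do not belong to any dense group get the label -1, so callers usually treat that value as "no zone" rather than as a cluster id.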
viz (install with the viz extra: pip install -e ./srx-lib-ml[viz])
- hero_metric(label, value, delta=None, help_text=None, cols=3, col_idx=0).
- hero_card(label, value, subtext=None, help_text=None, background="#0f172a", text_color="#e2e8f0", cols=3, col_idx=0).
- card_container(header, help_text=None, background="#0f172a", text_color="#e2e8f0", border_color="#1f2937", padding="16px", radius="12px", body=None): returns a body container for children.
- badge(label, color="#2563eb", text_color="#ffffff", padding="4px 10px"): returns HTML string.
- alert_box(message, tone="info"): info/success/warning/danger banner.
- section_header(title, description=None, divider=True).
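Since `badge` is documented as returning an HTML string, a helper of that shape might look like the sketch below. The function `badge_html` and its exact markup (a styled `span` with a rounded border) are assumptions for illustration; the library's actual output is not shown in this README.

```python
from html import escape


def badge_html(label, color="#2563eb", text_color="#ffffff", padding="4px 10px"):
    """Build an inline HTML badge; the label is escaped so it cannot inject markup."""
    style = (
        f"background:{color};color:{text_color};padding:{padding};"
        "border-radius:999px;font-size:0.8rem;"
    )
    return f'<span style="{style}">{escape(label)}</span>'


snippet = badge_html("High risk <24h>")
```

In a Streamlit app, a string like this would typically be rendered with `st.markdown(snippet, unsafe_allow_html=True)`.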
File details
Details for the file srx_lib_ml-0.1.1.tar.gz.
File metadata
- Download URL: srx_lib_ml-0.1.1.tar.gz
- Size: 14.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7268a0d7411d003a6c3b322ecfd72a62fdffa43c3ab2d1470fb63816df01972e |
| MD5 | f5fd51878d576fb1d14aaced06b5407c |
| BLAKE2b-256 | 21d9610a1db53c797d8fbcced259f00ef49a3e3cfc2fd3f3ea51a8b767457508 |
File details
Details for the file srx_lib_ml-0.1.1-py3-none-any.whl.
File metadata
- Download URL: srx_lib_ml-0.1.1-py3-none-any.whl
- Size: 16.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 25f36444dcf04a543fef790f04ae707230469f21fee74b9d596fcedb71dd5af6 |
| MD5 | 1111ec0bf495536be0a378c863ae961b |
| BLAKE2b-256 | f0da96c4e44406d222388a43867a43b11d65f84b817765cf6d55b144a9e6d7e3 |