Xelytics-Core
Pure analytics engine for statistical analysis and insight generation.
Current Version: 0.2.0-alpha.1 (In Development)
Status: Alpha. Phases 1, 2, and 3 are complete.
What's New in v0.2.0
✅ Phase 1 (Foundation): Complete
- Extended schemas for v0.2.0 features
- Backward compatibility guaranteed
✅ Phase 2 (Time Series Analysis): Complete
- Time series detection and validation
- Trend and seasonality decomposition (STL, classical)
- ARIMA and Exponential Smoothing forecasting
- Anomaly detection (Z-score, IQR, MAD, Isolation Forest)
- Change point detection
✅ Phase 3 (Clustering): Complete
- K-Means clustering with optimal K selection
- DBSCAN density-based clustering
- Hierarchical/Agglomerative clustering
- Cluster profiling and characterization
🚧 Coming Soon (Phases 4-7):
- Performance optimization (parallel processing, caching)
- Enhanced LLM providers (Anthropic, Azure, Gemini)
- Database connectors (PostgreSQL, MySQL, BigQuery)
- Interactive HTML report generation
Installation
# Editable install, run from the root of a local checkout of the source
pip install -e .
Quick Start
from xelytics import analyze, AnalysisConfig
import pandas as pd
# Load your data
df = pd.read_csv("data.csv")
# Run automated analysis
result = analyze(df, mode="automated")
# Access results
print(f"Analyzed {result.metadata.row_count} rows")
print(f"Found {len(result.statistics)} statistical tests")
print(f"Generated {len(result.visualizations)} visualizations")
print(f"Produced {len(result.insights)} insights")
# Export to JSON
json_output = result.to_json()
API Contract
from xelytics import analyze, AnalysisConfig, AnalysisResult
result = analyze(
    data=df,
    mode="automated",  # or "semi-automated"
    config=AnalysisConfig(
        significance_level=0.05,
        enable_llm_insights=True,
        max_visualizations=10,
    )
)
Output Schema
AnalysisResult(
    summary=DatasetSummary(...),
    statistics=[StatisticalTestResult(...), ...],
    visualizations=[VisualizationSpec(...), ...],
    insights=[Insight(...), ...],
    metadata=RunMetadata(...),
)
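To illustrate the shape above, here is a minimal sketch of how a result built from typed dataclasses can be serialized to JSON. The classes `MiniInsight`, `MiniMetadata`, and `MiniResult` are hypothetical stand-ins written for this example; they are not the xelytics classes, and the real field names may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass(frozen=True)
class MiniInsight:
    """Hypothetical stand-in for an insight record."""
    title: str
    severity: str


@dataclass(frozen=True)
class MiniMetadata:
    """Hypothetical stand-in for run metadata."""
    row_count: int
    column_count: int


@dataclass(frozen=True)
class MiniResult:
    """Hypothetical stand-in for a typed analysis result."""
    metadata: MiniMetadata
    insights: List[MiniInsight] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists of dataclasses;
        # sort_keys keeps the serialized text stable for identical inputs.
        return json.dumps(asdict(self), sort_keys=True)


result = MiniResult(
    metadata=MiniMetadata(row_count=100, column_count=4),
    insights=[MiniInsight(title="Sales rise on weekends", severity="info")],
)
payload = json.loads(result.to_json())
print(payload["metadata"]["row_count"])
```

Freezing the dataclasses and sorting keys is one simple way to get the same serialized text for the same input, which fits the determinism goal stated under Design Principles.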
v0.2.0 Features
Time Series Analysis
from xelytics import analyze, AnalysisConfig
config = AnalysisConfig(
    enable_time_series=True,
    datetime_column='date',
    forecast_periods=30
)
result = analyze(df, config=config)
# Access time series results
for ts in result.time_series_analysis:
    print(f"Trend: {ts.has_trend}")
    print(f"Seasonality: {ts.has_seasonality}")
    print(f"Period: {ts.seasonal_period}")
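For readers unfamiliar with what the trend and seasonality fields describe, the following is a minimal standard-library sketch of classical additive decomposition (a centered moving average for the trend, then per-position averages for the seasonal pattern). It is an illustration of the general technique only; the function names are made up for this example and it does not show how xelytics implements decomposition internally.

```python
from statistics import mean
from typing import List, Optional, Tuple


def centered_moving_average(values: List[float], period: int) -> List[Optional[float]]:
    """Trend estimate; None where the averaging window does not fit."""
    n = len(values)
    half = period // 2
    trend: List[Optional[float]] = [None] * n
    for t in range(half, n - half):
        if period % 2 == 1:
            window = values[t - half : t + half + 1]
            trend[t] = sum(window) / period
        else:
            # Even period: half-weight the two end points (a 2 x period average).
            inner = values[t - half + 1 : t + half]
            trend[t] = (0.5 * values[t - half] + sum(inner) + 0.5 * values[t + half]) / period
    return trend


def decompose_additive(
    values: List[float], period: int
) -> Tuple[List[Optional[float]], List[float], List[Optional[float]]]:
    """Split a series into trend + seasonal + residual components."""
    if period < 2 or len(values) < 2 * period:
        raise ValueError("need period >= 2 and at least two full periods of data")
    trend = centered_moving_average(values, period)
    # Average the detrended values at each position within the cycle.
    buckets: List[List[float]] = [[] for _ in range(period)]
    for t, (x, tr) in enumerate(zip(values, trend)):
        if tr is not None:
            buckets[t % period].append(x - tr)
    raw = [mean(b) for b in buckets]
    offset = mean(raw)  # center the pattern so it sums to zero over one cycle
    pattern = [s - offset for s in raw]
    seasonal = [pattern[t % period] for t in range(len(values))]
    residual = [
        None if tr is None else x - tr - s
        for x, tr, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


# Synthetic series: a linear trend plus a repeating pattern of length 4.
cycle = [1.0, -1.0, 2.0, -2.0]
values = [float(t) + cycle[t % 4] for t in range(12)]
trend, seasonal, residual = decompose_additive(values, period=4)
print(seasonal[:4])
```

On this synthetic input the recovered seasonal pattern matches the cycle that was added, and the residual is zero wherever the trend is defined.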
Clustering
from xelytics import analyze, AnalysisConfig
config = AnalysisConfig(
    enable_clustering=True,
    clustering_algorithm='kmeans',  # or 'dbscan', 'hierarchical'
    max_clusters=5
)
result = analyze(df, config=config)
# Access cluster results
for cluster in result.clusters:
    print(f"Cluster {cluster.cluster_id}: {cluster.size} samples")
    print(f"Silhouette score: {cluster.silhouette_score}")
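As background on the silhouette score printed above, here is a small standard-library sketch of the standard definition: for each point, compare the mean distance to its own cluster (`a`) with the mean distance to the nearest other cluster (`b`). This sketch returns the mean over all points and is an assumption-level illustration; how xelytics computes or aggregates the value per cluster is not shown here.

```python
from math import dist
from statistics import mean
from typing import Dict, List, Sequence


def silhouette_score(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters: Dict[int, List[int]] = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores: List[float] = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the members of the closest other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return mean(scores)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups the nearby points together
bad = silhouette_score(points, [0, 1, 0, 1])   # splits each nearby pair apart
print(good > bad)
```

Values close to 1 indicate compact, well-separated clusters; values near or below 0 suggest points sit closer to another cluster than to their own. Comparing the score across candidate cluster counts is one common way to pick K.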
Standalone Modules
# Time Series
from xelytics.timeseries import decompose_time_series, forecast_time_series, detect_anomalies
decomposition = decompose_time_series(df, 'value', datetime_column='date', period=12)
forecast = forecast_time_series(df, 'value', periods=30, method='arima')
anomalies = detect_anomalies(df, 'value', method='zscore', threshold=3.0)
# Clustering
from xelytics.clustering import cluster_kmeans, cluster_dbscan, profile_clusters
kmeans_result, df_clustered = cluster_kmeans(df, n_clusters=5)
dbscan_result, df_clustered = cluster_dbscan(df, eps=0.5, min_samples=5)
# cluster_labels is not defined in this snippet; it stands for one cluster label per row of df
profiles = profile_clusters(df, cluster_labels)
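The `method` argument to `detect_anomalies` selects among the detectors listed under Phase 2. As a rough guide to how the three simplest ones differ, here is a standard-library sketch of Z-score, IQR, and MAD outlier rules. The function names and the defaults `k=1.5` and `threshold=3.5` are conventional choices assumed for this example, not values taken from xelytics; Isolation Forest is omitted because it needs a model-based implementation.

```python
from statistics import mean, median, pstdev, quantiles
from typing import List, Sequence


def zscore_outliers(values: Sequence[float], threshold: float = 3.0) -> List[bool]:
    """Flag points more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_outliers(values: Sequence[float], k: float = 1.5) -> List[bool]:
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def mad_outliers(values: Sequence[float], threshold: float = 3.5) -> List[bool]:
    """Flag points using the median absolute deviation (robust to outliers)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    # 0.6745 rescales MAD so the score is comparable to a z-score for normal data.
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


# Nineteen ordinary readings followed by one obvious spike.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11,
            10, 9, 11, 10, 12, 10, 9, 11, 10, 50]
z_flags = zscore_outliers(readings)
iqr_flags = iqr_outliers(readings)
mad_flags = mad_outliers(readings)
print(z_flags.index(True), iqr_flags.index(True), mad_flags.index(True))
```

The Z-score rule uses the mean and standard deviation, which a large outlier inflates, so on short series it can miss the very point it is meant to catch. The IQR and MAD rules are based on quantiles and medians and are less affected by this.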
Design Principles
- Pure analytics engine - No HTTP, no database, no auth
- Deterministic - Same input = same output
- LLM is optional - Rule-based insights work without LLM
- Type-safe - All inputs/outputs are typed dataclasses
- Backward compatible - v0.1.0 code works in v0.2.0
License
MIT