From raw data to model insights — for Earth and beyond.

🌌 cosmicml

A Python toolkit for ML practitioners and space data enthusiasts. cosmicml handles the full pipeline — data loading, preprocessing, model benchmarking, and SHAP explainability — in clean, importable modules.

Built by @Deepali-07 — ML Engineer & Astrophysics enthusiast.


✨ Features

| Module | What it does |
| --- | --- |
| DataLoader | Load CSV, JSON, HDF5, and FITS (astronomy) files |
| DataSplitter | Stratified train/val/test splitting |
| DataCleaner | Imputation, outlier clipping, label encoding |
| SmartScaler | Standard/MinMax/Robust scaling with .revert() |
| DataBalancer | SMOTE, ADASYN, undersampling, SMOTEENN |
| ModelBenchmarker | Run N models → ranked comparison table |
| HyperparamTuner | GridSearch / RandomSearch wrapper |
| SHAPExplainer | One-line SHAP summary, beeswarm, waterfall |
| ModelReporter | Auto-generate clean model performance report |
| timer | Decorator to time any function |
| TimeIt | Context manager for timing code blocks |
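The two timing helpers are not shown in use anywhere in this README, so here is a minimal sketch of the general pattern they describe: a decorator that times a function and a context manager that times a block. The names `timer_sketch` and `TimeItSketch` are hypothetical stand-ins written with the standard library only; cosmicml's own `timer` and `TimeIt` may differ in signature and output.

```python
import functools
import time


def timer_sketch(func):
    """Print how long each call to func takes (hypothetical stand-in)."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.4f}s")
    return wrapper


class TimeItSketch:
    """Context manager that stores elapsed seconds in .elapsed (hypothetical)."""

    def __init__(self, label="block"):
        self.label = label
        self.elapsed = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self._start
        print(f"{self.label} took {self.elapsed:.4f}s")
        return False  # do not swallow exceptions raised inside the block


@timer_sketch
def slow_sum(n):
    return sum(range(n))


total = slow_sum(100_000)

with TimeItSketch("list build") as t:
    squares = [i * i for i in range(10_000)]
```

Keeping the elapsed time on the context manager object, rather than only printing it, makes the measurement available to later code such as a benchmark table.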

🚀 Installation

pip install cosmicml

# With all optional extras (quoted so shells such as zsh do not expand the brackets)
pip install "cosmicml[all]"

# Astronomy FITS support only
pip install "cosmicml[astronomy]"
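After installing, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption made for this example and is not part of cosmicml.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if cosmicml is installed, otherwise None.
print(installed_version("cosmicml"))
```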

⚡ Quick Start

from cosmicml import (
    DataCleaner, SmartScaler, DataBalancer,
    ModelBenchmarker, SHAPExplainer, ModelReporter
)
from cosmicml.data.splitter import DataSplitter

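# 1. Split (X is the feature table and y the labels, loaded beforehand)
splitter = DataSplitter(test_size=0.2, val_size=0.1, stratify=True)
X_train, X_val, X_test, y_train, y_val, y_test = splitter.split(X, y)

# 2. Clean (fit on the training split only, then reuse on val and test)
cleaner = DataCleaner(strategy="median", outlier_method="iqr")
X_train = cleaner.fit_transform(X_train)
X_val   = cleaner.transform(X_val)
X_test  = cleaner.transform(X_test)

# 3. Scale (same rule: fit on train, transform the rest)
scaler = SmartScaler(method="standard")
X_train = scaler.fit_transform(X_train)
X_val   = scaler.transform(X_val)
X_test  = scaler.transform(X_test)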

# 4. Balance
balancer = DataBalancer(strategy="smote")
X_train, y_train = balancer.fit_resample(X_train, y_train)

# 5. Benchmark
bench = ModelBenchmarker(task="classification")
print(bench.run(X_train, y_train, X_test, y_test))

# 6. Explain
explainer = SHAPExplainer(bench.best_model_, X_train)
explainer.summary(X_test)

# 7. Report
reporter = ModelReporter(bench.best_model_, task="classification")
reporter.report(X_test, y_test)
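The cleaning and scaling steps above call `fit_transform` on the training split and plain `transform` on the test split, and the feature list mentions that SmartScaler can undo its scaling with `.revert()`. To make that pattern concrete, here is a small pure-Python sketch of standard scaling with an inverse step. `StandardScalerSketch` is a hypothetical illustration, not cosmicml's implementation, and it assumes `revert` takes the scaled rows as its argument, which the README does not confirm.

```python
from statistics import fmean, pstdev


class StandardScalerSketch:
    """Column-wise standard scaling with an inverse step (illustrative only)."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.means_ = [fmean(col) for col in columns]
        # Fall back to 1.0 so constant columns do not cause division by zero.
        self.stds_ = [pstdev(col) or 1.0 for col in columns]
        return self

    def transform(self, rows):
        return [
            [(v - m) / s for v, m, s in zip(row, self.means_, self.stds_)]
            for row in rows
        ]

    def fit_transform(self, rows):
        return self.fit(rows).transform(rows)

    def revert(self, rows):
        # Inverse of transform: multiply by the std, then add the mean back.
        return [
            [v * s + m for v, m, s in zip(row, self.means_, self.stds_)]
            for row in rows
        ]


train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[4.0, 40.0]]

sketch = StandardScalerSketch()
train_scaled = sketch.fit_transform(train)  # statistics come from train only
test_scaled = sketch.transform(test)        # reuse them, never refit on test
test_restored = sketch.revert(test_scaled)  # back to the original units
```

Fitting on the training split alone keeps information from the held-out data out of the preprocessing statistics, which is why the test rows are only ever transformed.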

🔭 Astronomy / FITS Support

from cosmicml import DataLoader

loader = DataLoader("observations.fits")
df = loader.load()  # Returns a clean pandas DataFrame
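For readers curious what reading a FITS table into a DataFrame involves, the following is a rough sketch using astropy directly. It is not taken from cosmicml's source: the function name `load_fits_table` and the choice of HDU 1 are assumptions for this example, and it needs `astropy` and `pandas` installed.

```python
def load_fits_table(path, hdu=1):
    """Read one FITS table extension into a pandas DataFrame (sketch)."""
    # Imported lazily so this snippet can be defined without astropy installed.
    from astropy.table import Table

    table = Table.read(path, format="fits", hdu=hdu)
    # to_pandas() only supports one-dimensional columns, so keep just those.
    flat_columns = [name for name in table.colnames if table[name].ndim == 1]
    return table[flat_columns].to_pandas()
```

Calling `load_fits_table("observations.fits")` would return the tabular data from the first extension. Multidimensional columns (for example, per-row spectra) are dropped here and would need separate handling.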

📁 Project Structure

cosmicml/
├── data/           # DataLoader, DataSplitter
├── preprocess/     # DataCleaner, SmartScaler, DataBalancer
├── training/       # ModelBenchmarker, HyperparamTuner
├── explainability/ # SHAPExplainer, ModelReporter
└── utils/          # logger, timer

🤝 Contributing

Pull requests are welcome! Please open an issue first to discuss what you'd like to change.

git clone https://github.com/Deepali-07/cosmicml
cd cosmicml
pip install -e ".[dev]"
pytest tests/

📄 License

MIT © Deepali
