
Ghost Compute

Intelligent Serverless Orchestration for Data Platforms

Ghost eliminates the idle-cluster waste and cold start latency that cost enterprise data infrastructure an estimated $44.5B per year. Drop-in optimization for Databricks, EMR, Synapse, and Spark workloads.

License: Apache 2.0 | Python 3.9+ | Code style: black

The Problem

Enterprises face an impossible tradeoff:

| Option | Problem |
|---|---|
| Keep clusters warm | Pay for 24/7 idle compute (~30% waste) |
| Cold start on-demand | 5-35 minute startup delays, missed SLAs |
| Vendor serverless | Premium pricing, vendor lock-in, limited control |

Ghost solves this by providing intelligent cluster orchestration that delivers sub-second start times while eliminating idle waste.

Key Features

🔮 Ghost Predict

ML-powered predictive provisioning that pre-warms resources before you need them.

💤 Ghost Hibernate

State preservation that snapshots clusters to object storage for instant resume.

๐ŸŠ Ghost Pool

Cross-workload resource sharing that maximizes utilization across teams.

⚡ Ghost Spot

Autonomous spot/preemptible instance management with graceful failover.

📊 Ghost Insight

Real-time cost attribution and optimization recommendations.

Quick Start

Installation

One command to install Ghost Compute with all platforms:

pip install ghost-compute

This single install includes support for:

  • Databricks (Azure, AWS, GCP)
  • Amazon EMR
  • Azure Synapse Analytics
  • Google Cloud Dataproc

Install from source:

git clone https://github.com/ghost-ai-dev/ghost-compute.git
cd ghost-compute
pip install -e .

Basic Usage

from ghost import GhostClient

# Initialize Ghost
ghost = GhostClient(
    platform="databricks",
    credentials_path="~/.ghost/credentials.json"
)

# Enable intelligent orchestration
ghost.optimize(
    workspace_id="your-workspace",
    strategies=["predict", "hibernate", "spot"],
    target_savings=0.40  # 40% cost reduction target
)

# Monitor savings
stats = ghost.get_stats()
print(f"Monthly savings: ${stats.savings_usd:,.2f}")
print(f"Cold starts eliminated: {stats.cold_starts_prevented}")

CLI Usage

# View supported platforms
ghost platforms

# Connect to your platform
# Databricks
ghost connect databricks --workspace-url https://xxx.cloud.databricks.com --token YOUR_TOKEN

# AWS EMR
ghost connect emr --profile default --region us-east-1

# Azure Synapse
ghost connect synapse --subscription-id YOUR_SUB_ID --resource-group YOUR_RG

# Google Dataproc
ghost connect dataproc --project-id YOUR_PROJECT --region us-central1

# Analyze current waste
ghost analyze --output report.json

# Enable optimization
ghost optimize --strategies predict,hibernate,spot

# List clusters
ghost clusters

# View optimization insights
ghost insights

Supported Platforms

All platforms are included in the single pip install ghost-compute package.

| Platform | Status | Features |
|---|---|---|
| Databricks | ✅ GA | Predict, Hibernate, Pool, Spot, Insight |
| Amazon EMR | ✅ GA | Predict, Hibernate*, Spot, Pool, Insight |
| Azure Synapse | ✅ GA | Predict, Hibernate (auto-pause), Pool, Insight |
| Google Dataproc | ✅ GA | Predict, Hibernate*, Preemptible VMs, Pool, Insight |
| Cloudera CDP | 🚧 Alpha | Insight only (coming soon) |
| Self-managed Spark | 🚧 Alpha | Pool, Spot (coming soon) |

*EMR and Dataproc hibernation works via cluster termination with state preservation for fast recreation.

Architecture

┌──────────────────────────────────────────────────────┐
│                   YOUR APPLICATION                   │
│       (Databricks / EMR / Synapse / Dataproc)        │
└──────────────────────────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────┐
│                     GHOST LAYER                      │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐  │
│  │   Predict    │ │  Hibernate   │ │     Spot     │  │
│  │  Scheduler   │ │   Manager    │ │ Orchestrator │  │
│  └──────────────┘ └──────────────┘ └──────────────┘  │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐  │
│  │     Pool     │ │   Insight    │ │ Multi-Cloud  │  │
│  │   Manager    │ │    Engine    │ │ Abstraction  │  │
│  └──────────────┘ └──────────────┘ └──────────────┘  │
└──────────────────────────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────┐
│                 CLOUD INFRASTRUCTURE                 │
│            (AWS / Azure / GCP / On-Prem)             │
└──────────────────────────────────────────────────────┘

How It Works

1. Predictive Provisioning

Ghost analyzes your workload patterns to predict when clusters will be needed:

# Ghost learns from historical patterns
# - Scheduled jobs (cron patterns)
# - User activity (login times, query patterns)
# - Data arrival (streaming triggers)
# - Seasonal trends (end of month, quarterly)

# Pre-warms clusters 30-60 seconds before needed
# Result: Sub-second perceived start time

2. State Hibernation

Instead of terminating clusters, Ghost preserves state:

# Traditional approach:
# Terminate → Cold start (5-35 min) → Re-initialize

# Ghost approach:
# Hibernate → Snapshot to S3 → Resume in <5 sec
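As a rough illustration of the hibernate/resume round trip, here is a sketch in which a plain dict stands in for the S3/Blob/GCS bucket; `hibernate`, `resume`, and the state layout are invented for the example:

```python
import json

OBJECT_STORE = {}   # stand-in for your S3/Blob/GCS bucket

def hibernate(cluster_id, state):
    """Snapshot cluster state (cached tables, executor layout, and so
    on) to object storage so the compute can be released."""
    OBJECT_STORE[f"ghost/{cluster_id}.json"] = json.dumps(state)

def resume(cluster_id):
    """Rehydrate from the snapshot instead of re-initializing from
    scratch; restoring state is what avoids the 5-35 minute cold start."""
    return json.loads(OBJECT_STORE[f"ghost/{cluster_id}.json"])
```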

3. Intelligent Pooling

Share warm resources across workloads:

# Team A finishes job at 2:00 PM
# Team B starts job at 2:05 PM
# Ghost transfers warm instances → Zero cold start for Team B
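A minimal sketch of that handoff, assuming a pool that caps idle capacity the way the `max_idle_instances` setting in the configuration file does; the `WarmPool` class itself is illustrative:

```python
class WarmPool:
    """Toy cross-team pool: finished workloads return warm instances,
    and new workloads try to take one before provisioning cold."""

    def __init__(self, max_idle=10):
        self.idle = []
        self.max_idle = max_idle

    def release(self, instance):
        """Team A is done: keep the instance warm if the pool has room."""
        if len(self.idle) < self.max_idle:
            self.idle.append(instance)
            return True
        return False                     # pool full: let it terminate

    def acquire(self):
        """Team B starts: a warm instance if one exists, else None
        (meaning a cold provision is unavoidable)."""
        return self.idle.pop() if self.idle else None
```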

4. Spot Orchestration

Maximize savings with automatic spot management:

# Ghost automatically:
# - Uses spot instances for interruptible workloads
# - Monitors interruption signals
# - Checkpoints state before termination
# - Fails over to on-demand gracefully
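The decision at interruption time can be sketched as below, assuming a notice buffer like the `interruption_buffer_seconds: 120` default from the configuration section; `handle_interruption` and its callbacks are illustrative, not the real orchestrator:

```python
def handle_interruption(seconds_to_termination, checkpoint, fallback,
                        buffer_seconds=120):
    """On a spot interruption signal: checkpoint if the notice window
    allows it, then fail the workload over to on-demand capacity."""
    checkpointed = False
    if seconds_to_termination >= buffer_seconds:
        checkpoint()        # persist progress before the VM disappears
        checkpointed = True
    fallback()              # always move the work to on-demand
    return checkpointed
```

With the typical two-minute cloud notice, the checkpoint fits inside the window; with a shorter one, the sketch skips it and relies on the last checkpoint.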

Configuration

Environment Variables

GHOST_API_KEY=your-api-key
GHOST_PLATFORM=databricks
GHOST_WORKSPACE_URL=https://xxx.cloud.databricks.com
GHOST_LOG_LEVEL=INFO

Configuration File

# ghost.yaml
platform: databricks
workspace_url: https://xxx.cloud.databricks.com

strategies:
  predict:
    enabled: true
    lookahead_minutes: 60
    confidence_threshold: 0.8

  hibernate:
    enabled: true
    idle_timeout_minutes: 10
    storage_backend: s3
    storage_bucket: ghost-hibernate-states

  spot:
    enabled: true
    max_spot_percentage: 80
    fallback_to_ondemand: true
    interruption_buffer_seconds: 120

  pool:
    enabled: true
    cross_team_sharing: true
    max_idle_instances: 10

  insight:
    enabled: true
    cost_alerts: true
    alert_threshold_usd: 1000

exclusions:
  - cluster_name: "production-critical-*"
  - tag: "ghost:exclude"
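The exclusion rules read naturally as glob-plus-tag matching. The sketch below assumes that semantics; `is_excluded` and the cluster dict shape are inventions for the example, though the patterns come straight from the sample ghost.yaml:

```python
from fnmatch import fnmatch

def is_excluded(cluster, exclusions):
    """Apply ghost.yaml exclusion rules: shell-style glob patterns on
    the cluster name, or an opt-out tag on the cluster itself."""
    for rule in exclusions:
        name_pattern = rule.get("cluster_name")
        if name_pattern and fnmatch(cluster["name"], name_pattern):
            return True
        tag = rule.get("tag")
        if tag and tag in cluster.get("tags", []):
            return True
    return False
```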

Pricing

Ghost operates on a savings-share model:

| Tier | Monthly Compute Spend | Ghost Fee |
|---|---|---|
| Starter | < $50K | 25% of savings |
| Growth | $50K - $250K | 20% of savings |
| Enterprise | > $250K | Custom |

No savings = No payment. We only charge when we deliver results.
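A worked example of the fee math, assuming the rate applies to realized monthly savings and the tier is chosen by gross monthly spend (the table does not pin down either detail); `ghost_fee` is illustrative, not billing code:

```python
def ghost_fee(monthly_spend, monthly_savings):
    """Savings-share fee: a tier-dependent cut of realized savings.
    No savings means no fee."""
    if monthly_savings <= 0:
        return 0.0                  # no savings = no payment
    if monthly_spend < 50_000:
        rate = 0.25                 # Starter
    elif monthly_spend <= 250_000:
        rate = 0.20                 # Growth
    else:
        return None                 # Enterprise: custom pricing
    return round(monthly_savings * rate, 2)
```

Under these assumptions, the $100K-baseline benchmark with $42K of monthly savings would pay a Growth-tier fee of $8,400 and keep $33,600.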

Benchmarks

| Metric | Before Ghost | After Ghost | Improvement |
|---|---|---|---|
| Average cold start | 8.5 min | 0.8 sec | 99.8% faster |
| Idle compute waste | 32% | 4% | 87% reduction |
| Monthly spend ($100K baseline) | $100,000 | $58,000 | 42% savings |
| SLA misses (5-min threshold) | 23/month | 0/month | 100% eliminated |

Documentation

Examples

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

# Development setup
git clone https://github.com/ghost-ai-dev/ghost-compute.git
cd ghost-compute
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
pytest

Security

  • SOC 2 Type II certified
  • No data leaves your cloud environment
  • All state stored in your own S3/Blob/GCS buckets
  • Role-based access control
  • Audit logging

Report security issues to security@ghost-compute.io

License

Apache License 2.0 - see LICENSE

Support


Built by Ghost AI | Eliminating waste in enterprise data infrastructure
