Skip to main content

Simple Python components for launching and managing servers on a running Spark cluster

Project description

# Spark Partition Server

spark-partition-server is a set of light-weight Python components to launch servers on the executors of a Spark cluster.

## Overview

Spark is designed for manipulating and distributing data within the cluster, but not for allowing clients to interact with the data directly. spark-partition-server provides primitives for launching arbitrary servers on partitions of an RDD, registering and managing the partitions servers on the driver, and collecting any resulting RDD after the partition servers are shutdown.

There are many use-cases such as building ad hoc search clusters to query data more quickly by skipping Spark’s job planning, allowing external services to interact directly with in-memory data on Spark as part of a computing pipeline, and enabling distributed computations amongst executors involving direct communication. Spark Partition Server itself provides building blocks for these use cases.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spark-partition-server-0.1.5.tar.gz (8.0 kB view hashes)

Uploaded source

Built Distribution

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor NVIDIA NVIDIA PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page