Simple Python components for launching and managing servers on a running Spark cluster
# Spark Partition Server
spark-partition-server is a set of lightweight Python components for launching servers on the executors of a Spark cluster.
Spark is designed for manipulating and distributing data within the cluster, not for letting clients interact with that data directly. spark-partition-server provides primitives for launching arbitrary servers on the partitions of an RDD, registering and managing those partition servers on the driver, and collecting any resulting RDD after the partition servers are shut down.
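The launch, register, and shut-down pattern described above can be illustrated without Spark. The following is a minimal sketch using only the Python standard library; it does not use the spark-partition-server API, and the names `start_partition_server`, `query`, and `registry` are hypothetical. Plain lists stand in for RDD partitions, and on a real cluster a function like `start_partition_server` would run on the executors (for example via PySpark's `RDD.mapPartitions`) rather than in a local loop.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


def start_partition_server(partition):
    """Hold one partition in memory and serve it over HTTP (hypothetical helper)."""
    rows = list(partition)  # materialize the partition iterator

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Return the whole partition as JSON, whatever the path is.
            body = json.dumps(rows).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # silence per-request logging

    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(address):
    """Fetch a partition's rows directly from its server, bypassing any job planning."""
    host, port = address
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()


# Plain lists stand in for the partitions of an RDD (assumption for this sketch).
partitions = [[1, 2, 3], [4, 5]]

# "Driver-side" registry: partition index -> server object.
registry = {i: start_partition_server(p) for i, p in enumerate(partitions)}

# A client talks to each partition server directly.
results = [query(registry[i].server_address) for i in sorted(registry)]

# Shut the partition servers down once the clients are done.
for server in registry.values():
    server.shutdown()
    server.server_close()
```

In this sketch the registry is an in-process dictionary; in a distributed setting each server would instead have to report its host and port back to the driver over the network.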
Use cases include building ad hoc search clusters that query data more quickly by skipping Spark’s job planning, letting external services interact directly with in-memory data on Spark as part of a computing pipeline, and enabling distributed computations in which executors communicate with each other directly. Spark Partition Server itself provides only the building blocks for these use cases.