R clusters on AWS

rcluster makes launching and accessing an R cluster on AWS simple and
accessible.

This repository will:

* Create a connection to your AWS account
* Create an R cluster AMI saved to your AWS registry
* Allow you to launch a master and a stated number of worker nodes, automating
  the network connections between them and hosting a common NFS-based home
  folder under the default `cluster` user:
    * `/home/cluster` is shared between the master and all workers
    * `/home/cluster/hostfile` contains the IPs of accessible worker nodes and
      spare master nodes, repeated based on the number of available worker cores
    * The `cluster` user's `.Rprofile` defines an R function (`defaultCluster()`)
      which references the hostfile to create a PSOCK-based cluster

After that, log in to RStudio Server on the master as usual, run
`defaultCluster()`, and pass the returned parallel cluster object to
`parLapply()` and its peers.
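
The hostfile layout described above (one IP per line, repeated once per available core) can be illustrated with a minimal sketch. This is not rcluster's own code: the function name `build_hostfile_lines`, the input mapping, and the example addresses are all hypothetical, and the real file may differ in detail.

```python
from typing import Dict, List


def build_hostfile_lines(cores_by_ip: Dict[str, int]) -> List[str]:
    """Return one hostfile line per core, repeating each node's IP.

    `cores_by_ip` is a hypothetical mapping of node IP -> available cores.
    """
    lines: List[str] = []
    for ip, cores in cores_by_ip.items():
        # A node with N available cores appears N times, so a PSOCK
        # cluster built from the file starts N R worker processes on it.
        lines.extend([ip] * cores)
    return lines


# Example: a two-core node and a one-core node (made-up addresses).
example_lines = build_hostfile_lines({"10.0.0.11": 2, "10.0.0.12": 1})
print("\n".join(example_lines))
```

Repeating an IP is a common way to tell a socket-based cluster how many worker processes to start on each host.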


# Getting Started

First, create your AWS access key ID and secret access key and save them
locally ([instructions](http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSGettingStartedGuide/AWSCredentials.html)).

Next, run `rcluster-config` from your command line. Note that this command
will, by default, write your AWS access key ID and secret access key to a
hidden folder in your user directory.

There are currently three commands to launch and manage an R cluster:

* `rcluster` - Launch an R cluster on AWS using the default configuration file.
  This command opens your default browser to the RStudio Server login
  page on the master instance.
* `rcluster-open` - Access an active R cluster (opens a new tab in your web
  browser to the RStudio Server instance, if available).
* `rcluster-terminate` - Terminate all instances associated with your `rcluster`
  configuration.
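
As a rough illustration of what "opening a browser tab to RStudio Server" involves, here is a minimal sketch using Python's standard `webbrowser` module. It is not taken from rcluster: the helper names `rstudio_url` and `open_rstudio` are hypothetical, and it assumes the master serves RStudio Server on its default port, 8787, which rcluster may or may not use.

```python
import webbrowser

# RStudio Server listens on port 8787 by default; whether rcluster keeps
# that default is an assumption of this sketch.
RSTUDIO_PORT = 8787


def rstudio_url(master_host: str, port: int = RSTUDIO_PORT) -> str:
    """Build the login URL for RStudio Server on the master instance."""
    return f"http://{master_host}:{port}"


def open_rstudio(master_host: str) -> bool:
    """Open the login page in a new browser tab; True if a browser launched."""
    return webbrowser.open_new_tab(rstudio_url(master_host))
```

Calling `open_rstudio("203.0.113.10")` (a placeholder address) would open `http://203.0.113.10:8787` in the default browser.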
