StarCluster is a utility for creating and managing computing clusters hosted on Amazon's Elastic Compute Cloud (EC2).

Project description

Author: Justin Riley (justin.t.riley@gmail.com)
Team: Software Tools for Academics and Researchers (http://web.mit.edu/star)

Description:

StarCluster is a utility for creating and managing computing clusters hosted on Amazon’s Elastic Compute Cloud (EC2). StarCluster utilizes Amazon’s EC2 web service to create and destroy clusters of Linux virtual machines on demand.

To get started, the user creates a simple configuration file with their AWS account details and a few cluster preferences (e.g. number of machines, machine type, ssh keypairs, etc.). After creating the configuration file and running StarCluster’s “start” command, a cluster of Linux machines configured with the Sun Grid Engine queuing system, an NFS-shared /home directory, and OpenMPI with password-less ssh is created and ready to go out-of-the-box. Running StarCluster’s “stop” command shuts down the cluster so that you stop paying for the service. This allows the user to pay only for what they use.

StarCluster can also utilize Amazon’s Elastic Block Store (EBS) volumes to provide persistent data storage for a cluster. EBS volumes allow you to store large amounts of data in the Amazon cloud and are also easy to back up and replicate in the cloud. StarCluster will mount and NFS-share any volumes specified in the config. StarCluster’s “createvolume” command provides the ability to automatically create, format, and partition new EBS volumes for use with StarCluster.

StarCluster provides an Ubuntu-based Amazon Machine Image (AMI) in 32-bit and 64-bit architectures. The AMI contains an optimized NumPy/SciPy/Atlas/Blas/Lapack installation compiled for the larger Amazon EC2 instance types. The AMI also comes with Sun Grid Engine (SGE) and OpenMPI compiled with SGE support. The public AMI can easily be customized by launching a single instance of the public AMI, installing additional software on the instance, and then using StarCluster’s “createimage” command to completely automate the process of creating a new AMI from an EC2 instance.

Getting Started:

Install StarCluster using easy_install:

$ sudo easy_install StarCluster

or to install StarCluster manually:

$ (Download StarCluster from http://web.mit.edu/starcluster)
$ tar xvzf starcluster-X.X.X.tar.gz  (where X.X.X is the version number)
$ cd starcluster-X.X.X
$ sudo python setup.py install

After the software has been installed, the next step is to set up the configuration file:

$ starcluster help
StarCluster - (http://web.mit.edu/starcluster) (v. 0.92rc2)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

cli.py:87 - ERROR - config file /home/user/.starcluster/config does not exist

Options:
--------
[1] Show the StarCluster config template
[2] Write config template to /home/user/.starcluster/config
[q] Quit

Please enter your selection:

Select the second option by typing 2 and pressing enter. This writes a template to ~/.starcluster/config that you can use to create a configuration file containing your AWS credentials, cluster settings, etc. The next step is to customize this file using your favorite text editor:

$ vi ~/.starcluster/config

This file is commented with example “cluster templates”. A cluster template defines a set of configuration settings used to start a new cluster. The example config provides a ‘smallcluster’ template that is ready to go out-of-the-box. However, first, you must fill in your AWS credentials and keypair info:

[aws info]
aws_access_key_id = #your aws access key id here
aws_secret_access_key = #your secret aws access key here
aws_user_id = #your 12-digit aws user id here
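Before starting a cluster it can help to confirm that none of these settings still holds the template placeholder. The following is a minimal sketch using Python 3's standard configparser module; it is not part of StarCluster, and the function name missing_aws_settings and the sample values are made up for illustration. It assumes a placeholder is any value that is empty or still begins with "#", as in the template above.

```python
import configparser

# The settings shown in the [aws info] section of the config template.
REQUIRED_AWS_KEYS = ("aws_access_key_id", "aws_secret_access_key", "aws_user_id")


def missing_aws_settings(config_text):
    """Return the [aws info] settings that are absent or still placeholders.

    Hypothetical helper for illustration; StarCluster does its own validation.
    """
    # Inline comments are not stripped by default, so a template value such as
    # "#your aws access key id here" is read back as plain text.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section("aws info"):
        return list(REQUIRED_AWS_KEYS)
    missing = []
    for key in REQUIRED_AWS_KEYS:
        value = parser.get("aws info", key, fallback="").strip()
        # Assumption: an empty value or one starting with "#" is unfilled.
        if not value or value.startswith("#"):
            missing.append(key)
    return missing


# Made-up sample: one setting is still the template placeholder.
sample_config = """
[aws info]
aws_access_key_id = #your aws access key id here
aws_secret_access_key = EXAMPLESECRET
aws_user_id = 123456789012
"""

print(missing_aws_settings(sample_config))
```

To check a real file, read ~/.starcluster/config into a string and pass it to the function; an empty list means all three settings have been filled in.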

The next step is to fill in your keypair information. If you don’t already have a keypair you can create one from StarCluster using:

$ starcluster createkey mykey -o ~/.ssh/mykey.rsa

This will create a keypair called ‘mykey’ on Amazon EC2 and save the private key to ~/.ssh/mykey.rsa. Once you have a key, the next step is to fill in your keypair info in the StarCluster config file:

[key key-name-here]
key_location = /path/to/your/keypair.rsa

For example, the section for the keypair created above using the createkey command would look like:

[key mykey]
key_location = ~/.ssh/mykey.rsa

After defining your keypair in the config, the next step is to update the default cluster template ‘smallcluster’ with the name of your keypair on EC2:

[cluster smallcluster]
keyname = key-name-here

For example, the ‘smallcluster’ template would be updated to look like:

[cluster smallcluster]
keyname = mykey
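The keyname in a cluster template has to match the name of a [key ...] section, and a mismatch between the two is an easy mistake to make. As a rough sketch (not StarCluster's own code), the check could look like this with Python 3's configparser; resolve_key_path is a hypothetical helper, and expanding "~" with os.path.expanduser is this sketch's choice rather than a statement about how StarCluster resolves paths.

```python
import configparser
import os


def resolve_key_path(config_text, template_name):
    """Return the expanded private key path used by a cluster template.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    # Section names contain a space, e.g. "cluster smallcluster".
    keyname = parser.get("cluster " + template_name, "keyname")

    key_section = "key " + keyname
    if not parser.has_section(key_section):
        raise ValueError(
            "keyname %r has no matching [%s] section" % (keyname, key_section)
        )

    # Assumption: "~" is expanded here by the sketch itself.
    return os.path.expanduser(parser.get(key_section, "key_location"))


# The example sections from above, joined into one config string.
sample_config = """
[key mykey]
key_location = ~/.ssh/mykey.rsa

[cluster smallcluster]
keyname = mykey
"""

print(resolve_key_path(sample_config, "smallcluster"))
```

If the template still says keyname = key-name-here, the function raises a ValueError that names the missing section.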

Now that the config file has been set up we’re ready to start using StarCluster. Next we start a 2-node cluster named mycluster using the default cluster template smallcluster in the example config:

$ starcluster start mycluster

The default_template setting in the [global] section of the config specifies the default cluster template and is automatically set to smallcluster in the example config. You can customize the smallcluster template to change the number of nodes, specify EBS volumes to attach, enable additional plugins, etc.
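To make the lookup concrete, here is a small sketch of how a default template could be resolved from the config text. It uses Python 3's configparser and is only an illustration of the idea, not StarCluster's implementation; pick_template and its requested argument are hypothetical names.

```python
import configparser


def pick_template(config_text, requested=None):
    """Return (name, settings) for the requested or default cluster template.

    Hypothetical helper; 'requested' stands in for naming a template explicitly.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    # Fall back to default_template from [global] when nothing is requested.
    name = requested or parser.get("global", "default_template")

    section = "cluster " + name
    if not parser.has_section(section):
        raise ValueError("no [%s] section in the config" % section)
    return name, dict(parser.items(section))


# Made-up sample limited to settings that appear in this README.
sample_config = """
[global]
default_template = smallcluster

[cluster smallcluster]
keyname = mykey
"""

name, settings = pick_template(sample_config)
print(name, settings)
```

With the sample above the call returns the smallcluster template and its keyname setting, since no other template was requested.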

After the start command completes you should have a working cluster. You can log in to the master node as root by running:

$ starcluster sshmaster mycluster

Once you’ve finished using the cluster and wish to stop paying for it, terminate it:

$ starcluster terminate mycluster

Have a look at the rest of StarCluster’s commands:

$ starcluster --help

Dependencies:

  • Amazon AWS Account

  • Python 2.5+

  • Boto 2.0

  • Paramiko 1.7.7.1

  • WorkerPool 0.9.2

  • Jinja2 2.5.5

  • decorator 3.3.1

Learn more…

Watch an ~8-minute screencast at http://web.mit.edu/stardev/cluster

To learn more have a look at the rest of the documentation: http://web.mit.edu/stardev/cluster/docs/0.92rc2

The docs explain the configuration file in detail, how to create and use EBS volumes with StarCluster, and how to use the Sun Grid Engine queuing system to submit jobs on the cluster.
