Elasticluster

Elasticluster aims to provide a user-friendly command-line tool to create, manage and set up computing clusters hosted on cloud infrastructures such as Amazon's Elastic Compute Cloud (EC2), Google Compute Engine, or a private OpenStack cloud. Its main goal is to get your compute cluster up and running with just a few commands.

How it works

The architecture of elasticluster is quite simple: a configuration file defines a set of cluster configurations and the information needed to access a specific cloud web service.
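As a sketch, the configuration file is organized into four kinds of sections; the complete examples below all follow this layout (the names after the slash are freely chosen labels):

[cloud/<name>]     # endpoint and credentials of a cloud provider
[login/<name>]     # remote user and SSH keys used to log in to the VMs
[setup/<name>]     # Ansible groups deciding what software each node class gets
[cluster/<name>]   # ties it together: cloud, login, setup and number of nodes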

Using the command line (or, very soon, a simple API), you can start a cluster. Elasticluster will connect to the desired cloud, start the virtual machines, and wait until they are reachable via SSH.
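For instance, a typical lifecycle looks like this, assuming the configuration file defines a cluster template named `slurm` (as in the first example below):

$ elasticluster start slurm -n mycluster    # boot the VMs and set them up
$ elasticluster ssh mycluster               # log in to the frontend node
$ elasticluster stop mycluster              # destroy the cluster again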

After all the virtual machines are up and running, elasticluster uses Ansible to configure them.
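Under the hood, elasticluster groups the nodes by their node class and hands them to Ansible. A hypothetical inventory for the SLURM example below might look like the following (host names and addresses are made up for illustration):

[slurm_master]
frontend001 ansible_ssh_host=172.16.0.10 ansible_ssh_user=ubuntu

[slurm_clients]
compute001 ansible_ssh_host=172.16.0.11 ansible_ssh_user=ubuntu
compute002 ansible_ssh_host=172.16.0.12 ansible_ssh_user=ubuntu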

Features

At its current state of development, elasticluster offers the following features:

  • Simple configuration file to define cluster templates
  • Can start and manage multiple independent clusters at the same time
  • Automated setup

Sample Usage

The following samples show how a cluster is configured with elasticluster, along with the basic commands to interact with it through the command-line interface. For a full description of the configuration and of the command-line interface, see the documentation. All of the examples below can also be combined into a single configuration file.

First, a SLURM cluster on a private OpenStack cloud:

[cloud/hobbes]
provider=openstack
auth_url=http://hobbes.gc3.uzh.ch:5000/v2.0
username=****REPLACE WITH YOUR USERNAME****
password=****REPLACE WITH YOUR PASSWORD****
project_name=****REPLACE WITH YOUR PROJECT/TENANT NAME****

[login/ubuntu]
image_user=ubuntu
image_user_sudo=root
image_sudo=True
user_key_name=elasticluster
user_key_private=~/.ssh/id_rsa
user_key_public=~/.ssh/id_rsa.pub

[setup/ansible-slurm]
provider=ansible
frontend_groups=slurm_master
compute_groups=slurm_clients

[cluster/slurm]
cloud=hobbes
login=ubuntu
setup_provider=ansible-slurm
security_group=default
# Ubuntu image
image_id=16618a82-92fd-4615-86e6-d354f9f66af5
flavor=m1.small
frontend_nodes=1
compute_nodes=2
ssh_to=frontend

Start a SLURM cluster named `mycluster`:
$ elasticluster start slurm -n mycluster

List the cluster's nodes:
$ elasticluster list-nodes mycluster

Grow the cluster by 10 compute nodes:
$ elasticluster resize mycluster -a 10:compute

SSH into the frontend node:
$ elasticluster ssh mycluster

Open an SFTP connection to the frontend node:
$ elasticluster sftp mycluster

Destroy the cluster:
$ elasticluster stop mycluster

The same workflow applies to a Hadoop cluster on Amazon EC2:
[cloud/amazon-us-east-1]
provider=ec2_boto
ec2_url=https://ec2.us-east-1.amazonaws.com
ec2_access_key=****REPLACE WITH YOUR ACCESS ID****
ec2_secret_key=****REPLACE WITH YOUR SECRET KEY****
ec2_region=us-east-1

[login/ubuntu]
image_user=ubuntu

[setup/ansible]
provider=ansible
hadoop-name_groups=hadoop_namenode
hadoop-jobtracker_groups=hadoop_jobtracker
hadoop-task-data_groups=hadoop_tasktracker,hadoop_datanode

[cluster/hadoop]
cloud=amazon-us-east-1
login=ubuntu
setup_provider=ansible
security_group=all_tcp_ports
image_id=ami-00000048
flavor=m1.small
hadoop-name_nodes=1
hadoop-jobtracker_nodes=1
hadoop-task-data_nodes=4
ssh_to=hadoop-name

Start a Hadoop cluster named `mycluster`:
$ elasticluster start hadoop -n mycluster

Start another Hadoop cluster, just for the fun of it:
$ elasticluster start hadoop -n mysecondcluster

List all clusters:
$ elasticluster list

List the cluster's nodes:
$ elasticluster list-nodes mycluster

Grow the cluster by 10 data nodes:
$ elasticluster resize mycluster -a 10:hadoop-task-data

SSH into the frontend node:
$ elasticluster ssh mycluster

Open an SFTP connection to the frontend node:
$ elasticluster sftp mycluster

Destroy the cluster:
$ elasticluster stop mycluster

Finally, an IPython cluster on Google Compute Engine:
[cloud/google]
provider=google
gce_project_id=****REPLACE WITH YOUR PROJECT ID****
gce_client_id=****REPLACE WITH YOUR CLIENT ID****
gce_client_secret=****REPLACE WITH YOUR SECRET KEY****

[login/google]
image_user=****REPLACE WITH YOUR GOOGLE ID****
image_user_sudo=root
image_sudo=True
user_key_name=elasticluster
user_key_private=~/.ssh/id_rsa
user_key_public=~/.ssh/id_rsa.pub

[setup/ipython]
provider=ansible
controller_groups=ipython_controller,ipython_engine
worker_groups=ipython_engine

[cluster/ipython]
cloud=google
login=google
setup_provider=ipython
security_group=all_tcp_ports
image_id=debian-7-wheezy-v20130723
flavor=n1-standard-1
controller_nodes=1
worker_nodes=10
ssh_to=controller
image_userdata=

Start an IPython cluster named `mycluster`:
$ elasticluster start ipython -n mycluster

SSH into the controller node:
$ elasticluster ssh mycluster

Run some Python code in parallel:
In [1]: import os

In [2]: from IPython.parallel import Client

In [3]: rc = Client()

In [4]: rc.ids
Out[4]: [0, 1, 2, 3, 4]

In [5]: dview = rc[:]

In [6]: dview.apply_sync(os.getpid)
Out[6]: [18988, 19583, 19003, 20128, 20444]

In [7]: dview.map_sync(lambda x: x**10, range(32))
Out[7]: 
[0,
 1,
 1024,
 59049,
 1048576,
 9765625,
 60466176,
 282475249,
 1073741824,
 3486784401,
 10000000000,
 25937424601,
 61917364224,
 137858491849,
 289254654976,
 576650390625,
 1099511627776,
 2015993900449,
 3570467226624,
 6131066257801,
 10240000000000,
 16679880978201,
 26559922791424,
 41426511213649,
 63403380965376,
 95367431640625,
 141167095653376,
 205891132094649,
 296196766695424,
 420707233300201,
 590490000000000,
 819628286980801]

Grow the cluster by 10 new worker nodes:
$ elasticluster resize mycluster -a 10:worker

Destroy the cluster:
$ elasticluster stop mycluster

This project is an effort of S3IT (Services and Support for Science IT) at the University of Zurich, and is licensed under the GNU General Public License version 3.
