SequenceIQ Cloudbreak Hadoop Cluster Deployments

I saw activity on Twitter talking about Cloudbreak from SequenceIQ as a method for deploying Hadoop clusters into several public cloud providers. My interest was piqued and I started reading through the Cloudbreak documentation on Saturday night. About 20 minutes later, I had an account set up, tied it to my AWS account, and started deploying a cluster. Approximately 15 minutes after that, the cluster was deployed, Ambari was up and running, and I could log into the UI.

The interface is truly brilliant! I am pretty impressed with the functionality it provides, the ease of setup, and the blueprinting capability. The project is only in a public beta state right now, but I plan to keep my eyes on it and see how it develops. If you are interested in reading more about my experience with the very first cluster and seeing some screenshots of the interface, keep reading!

Cloudbreak UI & Cluster Deployment

Upon deployment of the cluster, the Dashboard UI immediately began showing me status updates on the ribbon and a small square widget. When the widget was clicked, it expanded to show further details of the cluster and the status.

sequenceiq cloudbreak

cloudbreak-screen-02


I logged into Ambari after the cluster reported that it was deployed and Ambari was accessible. The Cloudbreak dashboard had a link sending me to the Ambari UI. Once logged in, I was pleased to see some basic status items displayed, along with a list of services available in the cluster.

Note: The login credentials for the Ambari UI are the defaults (admin/admin). I did not see this mentioned anywhere in the Cloudbreak UI, so I had to Google what the defaults were.

cloudbreak-screen-03

Poking around the Ambari interface, it is easy to see which nodes the Cloudbreak service had deployed inside AWS and their relevant information.

cloudbreak-screen-04

Noticing that the metric widgets were reporting they were unable to collect data because the Ganglia service was not deployed, I started through the process of adding services to the running cluster. The Ambari interface walked me through a multi-step process, allowing me to select the services I wanted and checking the prerequisites for those services. In my case, the cluster also needed the Tez service in order to deploy the services I selected.

cloudbreak-screen-05

cloudbreak-screen-06

cloudbreak-screen-07

Here is where things went a little sideways — the deployment of the new services stalled at 2/10.

cloudbreak-screen-08

After about 30 minutes and several attempts to refresh the page, I closed the page and went back into Ambari through the Cloudbreak dashboard. When Ambari loaded, I was presented with the following screen and no way to interact with Ambari — it appeared wedged.

cloudbreak-screen-09

This gave me a chance to test the termination functionality of Cloudbreak and see how it removes all of the objects from my AWS account.

cloudbreak-screen-10

cloudbreak-screen-11

All in all, a great first experience with Cloudbreak. I am going to continue to play with it, especially since I am interested in how they are using Docker containers in the service.


Upgrading Apache Mesos & Marathon on VMware Big Data Extensions


There are new versions out for both Apache Mesos and Mesosphere Marathon. One of the great things about using VMware Big Data Extensions to handle the deployment and management of clusters is the ability to control the packages and versions within the SDLC, so a few changes were needed within Chef to support Mesos 0.22.0 and Marathon 0.8.0* on the Big Data Extensions management server.

First, you need to grab the new RPM files and place them on your management server under the /opt/serengeti/www/yum/repos/mesos/0.21.0 directory. For my own installation, I created a new directory called mesos/0.22.0 and copied the packages that had not been updated into it, but that decision is up to you. Once the files are there, you can choose whether to increment the version number in the /opt/serengeti/www/distros/manifest file:

{
  "name": "mesos",
  "vendor": "MESOS",
  "version": "0.22.0",
  "packages": [
    {
      "package_repos": [
        "https://bde.localdomain/yum/mesos.repo"
      ],
      "roles": [
        "zookeeper",
        "mesos_master",
        "mesos_slave",
        "mesos_docker",
        "mesos_chronos",
        "mesos_marathon"
      ]
    }
  ]
},
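
For reference, staging the new RPMs looks roughly like the following shell commands. The file names are illustrative, and the createrepo run is an assumption about a standard yum repository layout, so adjust to your environment:

# Stage the new RPMs alongside the existing Mesos repo (file names are examples)
mkdir -p /opt/serengeti/www/yum/repos/mesos/0.22.0
cp mesos-0.22.0-*.rpm marathon-0.8.0-*.rpm /opt/serengeti/www/yum/repos/mesos/0.22.0/

# Rebuild the yum metadata for whichever directory the mesos.repo baseurl points at
createrepo /opt/serengeti/www/yum/repos/mesos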

If you choose to update the manifest file, make sure you restart tomcat on the BDE management server. Normally you would be done at this point, but there were changes in where Mesos and Marathon install two files that we need to account for: the old files lived in subdirectories under /usr/local and are now placed in subdirectories of /usr.

/opt/serengeti/chef/cookbooks/mesos/templates/default/marathon.conf.erb:

exec /usr/bin/marathon

/opt/serengeti/chef/cookbooks/mesos/recipes/slave.rb:

template '/usr/etc/mesos/mesos-slave-env.sh' do
  source 'mesos-slave-env.sh.erb'
  variables(
    zookeeper_server_list: zk_server_list,
    zookeeper_port: zk_port,
    zookeeper_path: zk_path,
    logs_dir: node['mesos']['common']['logs'],
    work_dir: node['mesos']['slave']['work_dir'],
    isolation: node['mesos']['isolation'],
  )
  notifies :run, 'bash[restart-mesos-slave]', :delayed
end
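
If you want to confirm the new locations before touching the cookbook, a quick query against the installed packages on a deployed node should show the moved files; the package names here are assumptions based on the Mesosphere RPMs:

# The old builds placed these under /usr/local; the new RPMs should report
# /usr/etc/mesos/mesos-slave-env.sh and /usr/bin/marathon
rpm -ql mesos | grep mesos-slave-env.sh
rpm -ql marathon | grep bin/marathon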

Update the files on the Chef server by running 'knife cookbook upload -a' and you are all set.

Go forth and deploy your updated Apache Mesos 0.22.0 and Marathon 0.8.0 clusters!

*Marathon 0.8.1 has been released, but there is not an RPM file from Mesosphere for it yet.

HAProxy support for Mesos in vSphere Big Data Extensions

I realized late last night that the current vSphere Big Data Extensions fling does not have HAProxy built into it for the Mesos cluster deployments. After a bit of reading and testing new pieces inside the Chef recipes, I have added support so that HAProxy runs on all of the Mesos nodes. The first thing is to add the HAProxy package to the /opt/serengeti/chef/cookbooks/mesos/recipes/install.rb file:

%w( unzip libcurl haproxy ).each do |pkg|
  yum_package pkg do
    action :install
  end
end

There is also a script that Mesosphere provides to modify the HAProxy configuration file and reload the rules when changes occur. You can find instructions for the script and how to incorporate it on the Mesosphere page.

Note: I had to edit ‘sudo’ out of the lines inside the script in order for Chef to execute it properly.
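
A one-liner like the following handles that edit before the script goes into the cookbook, assuming the downloaded haproxy-marathon-bridge script is sitting in your working directory:

# Remove the sudo prefixes so Chef, which already runs as root, can execute the script
sed -i 's/sudo //g' haproxy-marathon-bridge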

After copying the file haproxy-marathon-bridge into my Chef server, I added the following code to the same install.rb file to get things all set up and configured properly:

 82   directory "/etc/haproxy-marathon-bridge" do
 83     owner 'root'
 84     group 'root'
 85     mode '0755'
 86     action :create
 87   end
 88 
 89   template '/usr/local/bin/haproxy-marathon-bridge' do
 90     source 'haproxy-marathon-bridge.erb'
 91     action :create
 92   end
 93 
 94   master_ips = mesos_masters_ip
 95   slave_ips = mesos_slaves_ip
 96 
 97   all_ips = master_ips
 98   all_ips += slave_ips
 99   
100   template '/etc/haproxy-marathon-bridge/marathons' do
101     source 'marathons.erb'
102     variables(
103       haproxy_server_list: all_ips
104     )
105     action :create
106   end
107   
108   execute 'configure haproxy' do
109     command 'chkconfig haproxy on; service haproxy start'
110   end
111   
112   execute 'setup haproxy-marathon-bridge' do
113     command 'chmod 755 /usr/local/bin/haproxy-marathon-bridge; /usr/local/bin/haproxy-marathon-bridge install_cronjob'
114   end

There is also a bit of supporting code needed for lines 94-98 above, which was added to /opt/serengeti/chef/cookbooks/mesos/libraries/default.rb:

module Mesosphere

  def mesos_masters_ip
    servers = all_providers_fqdn_for_role("mesos_master")
    Chef::Log.info("Mesos master nodes in cluster #{node[:cluster_name]} are: #{servers.inspect}")
    servers
  end

  def mesos_slaves_ip
    servers = all_providers_fqdn_for_role("mesos_slave")
    Chef::Log.info("Mesos slave nodes in cluster #{node[:cluster_name]} are: #{servers.inspect}")
    servers
  end

end

class Chef::Recipe; include Mesosphere; end

The last thing needed is a new template file for the /etc/haproxy-marathon-bridge/marathons file that is needed by the script provided by Mesosphere. I created the file /opt/serengeti/chef/cookbooks/mesos/templates/default/marathons.erb:

# Configuration file for haproxy-marathon-bridge script
<%
  ha_url_list = []
  @haproxy_server_list.each do |ha_server|
    ha_url_list << "#{ha_server}"
  end
%>
<%= ha_url_list.join(":8080\n") + ":8080" %>
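
For illustration, on a deployed node the rendered /etc/haproxy-marathon-bridge/marathons file ends up with one cluster node per line, each suffixed with Marathon's default port; the hostnames below are just examples:

cat /etc/haproxy-marathon-bridge/marathons
# Example output (hostnames are hypothetical):
#   hadoopvm388.localdomain:8080
#   hadoopvm382.localdomain:8080
#   hadoopvm390.localdomain:8080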

At this point, all of the modifications can be uploaded to the Chef server with the command knife cookbook upload -a and a new cluster can be deployed with HAProxy support.
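
For anyone following along, a comparable nginx workload can be submitted through Marathon's REST API. This is only a sketch: the Marathon host, app id, resource sizing, and servicePort below are assumptions chosen to line up with the haproxy.cfg output that follows rather than an exact recipe:

# Submit a Docker-based nginx app to Marathon (host and values are examples);
# hostPort 0 lets Mesos assign ports like the 31000/31001 seen below, and
# servicePort 80 yields the nginx-80 frontend that haproxy-marathon-bridge builds
curl -X POST http://hadoopvm388.localdomain:8080/v2/apps \
  -H 'Content-Type: application/json' \
  -d '{
        "id": "nginx",
        "cpus": 0.25,
        "mem": 128,
        "instances": 10,
        "container": {
          "type": "DOCKER",
          "docker": {
            "image": "nginx",
            "network": "BRIDGE",
            "portMappings": [
              { "containerPort": 80, "hostPort": 0, "servicePort": 80, "protocol": "tcp" }
            ]
          }
        }
      }'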

After deploying an nginx workload and scaling it out, you can check the /etc/haproxy/haproxy.cfg file on a master node and see entries like:

[root@hadoopvm388 haproxy]# cat haproxy.cfg
global
  daemon
  log 127.0.0.1 local0
  log 127.0.0.1 local1 notice
  maxconn 4096
defaults
  log            global
  retries             3
  maxconn          2000
  timeout connect  5000
  timeout client  50000
  timeout server  50000
listen stats
  bind 127.0.0.1:9090
  balance
  mode http
  stats enable
  stats auth admin:admin
listen nginx-80
  bind 0.0.0.0:80
  mode tcp
  option tcplog
  balance leastconn
  server nginx-10 hadoopvm382.localdomain:31000 check
  server nginx-9 hadoopvm390.localdomain:31000 check
  server nginx-8 hadoopvm387.localdomain:31000 check
  server nginx-7 hadoopvm389.localdomain:31000 check
  server nginx-6 hadoopvm386.localdomain:31000 check
  server nginx-5 hadoopvm383.localdomain:31000 check
  server nginx-4 hadoopvm378.localdomain:31001 check
  server nginx-3 hadoopvm381.localdomain:31000 check
  server nginx-2 hadoopvm385.localdomain:31000 check
  server nginx-1 hadoopvm378.localdomain:31000 check
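
Since the generated configuration also exposes the HAProxy statistics listener on 127.0.0.1:9090 with the admin/admin credentials shown above, a quick check from the node itself confirms the backends are registered. The /haproxy?stats path is HAProxy's default stats URI and is an assumption here:

# Query the stats page locally in CSV form; each nginx-N backend should report UP
curl -su admin:admin 'http://127.0.0.1:9090/haproxy?stats;csv' | grep nginx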

Enjoy!