Exposing Apache Mesos on VMware Big Data Extensions v2.2

The VMware Big Data Extensions v2.2 release included the cookbooks for Apache Mesos and Kubernetes from the Fling released this past spring. However, those cookbooks are not exposed when you deploy the new version. Fortunately, unlocking them only takes a few minutes. Below, I will cover exactly what is needed in order to begin using these Cloud Native App cluster deployments.

If you jump onto your v2.2 management server and look in the /opt/serengeti/chef/cookbooks directory, you will see all of the Cloud Native App additions.


A quick check confirms that the Chef roles are still defined.



They even did us the favor of including the JSON spec files in the /opt/serengeti/www/specs/Ironfan directory.


The missing pieces are the entries in the /opt/serengeti/www/specs/map and /opt/serengeti/www/distros/manifest files. Those are easy enough to copy out of the VMware Fling itself or re-create manually. If you prefer to edit the files yourself, here is what needs to be added.


  {
    "vendor" : "Kubernetes",
    "version" : "^(\\d)+(\\.\\w+)*",
    "type" : "Basic Kubernetes Cluster",
    "appManager" : "Default",
    "path" : "Ironfan/kubernetes/basic/spec.json"
  },
  {
    "vendor" : "Mesos",
    "version" : "^(\\d)+(\\.\\w+)*",
    "type" : "Basic Mesos Cluster",
    "appManager" : "Default",
    "path" : "Ironfan/mesos/basic/spec.json"
  }
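The "version" field in these map entries is a regular expression that BDE matches against the distro version string from the manifest. A quick sanity check of that pattern (Ruby; the sample version strings are just illustrative):

```ruby
# The version pattern from the map entries above, unescaped from JSON.
pattern = Regexp.new('^(\d)+(\.\w+)*')

# Illustrative version strings; each should match from the start.
%w[0.21.0 0.5.4 0.22.0].each do |v|
  puts format('%-8s matches: %s', v, pattern.match?(v))
end
```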


  {
    "name" : "kubernetes",
    "vendor" : "KUBERNETES",
    "version" : "0.5.4",
    "packages" : [
      {
        "tarball": "kubernetes/kubernetes-0.5.4.tar.gz",
        "roles": [ ... ]
      }
    ]
  },
  {
    "name" : "mesos",
    "vendor" : "MESOS",
    "version" : "0.21.0",
    "packages" : [
      {
        "package_repos": [ "https://bde.localdomain/yum/mesos.repo" ],
        "roles" : [
          "zookeeper",
          "mesos_master",
          "mesos_slave",
          "mesos_docker",
          "mesos_chronos",
          "mesos_marathon"
        ]
      }
    ]
  }
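A stray comma or unbalanced brace in either file will keep BDE from loading the distros, so it is worth checking that your edits still parse before restarting Tomcat. A small Ruby helper for that (run it however you prefer; the manifest path in the comment is the one mentioned above):

```ruby
require 'json'

# Return the number of top-level entries if the text parses as JSON, nil otherwise.
def json_entry_count(text)
  JSON.parse(text).length
rescue JSON::ParserError
  nil
end

# Against the live file on the management server you would use:
#   json_entry_count(File.read('/opt/serengeti/www/distros/manifest'))
puts json_entry_count('[ { "name" : "mesos" }, { "name" : "kubernetes" } ]')
puts json_entry_count('[ { "name" : "mesos", } ]').inspect  # a trailing comma fails to parse
```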

The repos built into the Fling are unfortunately not present on the management server. This was the only tedious portion of the entire process. The easiest method is to grab the files out of an existing BDE Fling management server and copy them into the new one. The other option is to find the latest RPMs on the Internet and add them to the management server manually. In either case, you'll need to run the standard CentOS commands for creating the repository.

Create local repo for Apache Mesos

# su - serengeti
$ cd /opt/serengeti/www/yum
$ vim mesos.repo

A minimal repo definition looks like the following (the baseurl is an assumption based on the default BDE layout and the repos/mesos/current directory created below; adjust it for your environment):

[mesos]
name=Apache Mesos
baseurl=https://bde.localdomain/yum/repos/mesos/current
enabled=1
gpgcheck=0

$ mkdir -p repos/mesos/current/RPMS
$ cd repos/mesos/current

The Fling included the following files:
- bigtop-utils-
- chronos-2.3.0-0.1.20141121000021.x86_64.rpm
- docker-io-1.3.1-2.el6.x86_64.rpm
- marathon-0.7.5-1.0.x86_64.rpm
- mesos-0.21.0-1.0.centos65.x86_64.rpm
- subversion-1.6.11-10.el6_5.x86_64.rpm
- zookeeper-
- zookeeper-server-

$ createrepo .

After a restart of Tomcat, you will be able to start deploying Apache Mesos and Kubernetes clusters through BDE v2.2.

If you want to take advantage of the Instant Clone functionality, you will need to be running vSphere 6.0 and BDE v2.2. There are also a couple of adjustments to the /opt/serengeti/conf/serengeti.properties file that will need to be made. I will be going over those in a future post discussing how to use Photon OS as the template for BDE deployments.

Cluster Service Discovery through Chef

Cluster service discovery is an integral part of running any distributed system, and it can be solved through a number of different methods. Until recently, a common approach incorporated manual intervention on the part of a developer or systems administrator. The larger the cluster or distributed application, the greater the lift became for configuring all of the nodes with the proper IP and port information for the running services. At scale (hundreds or thousands of nodes), this methodology simply does not work.

So how can a developer or system engineer automate the process of service discovery in a manner that is not limited to a single application, but instead works for any distributed application? Here are a few options for solving service discovery:

  • Mesosphere offers a solution that runs HAProxy on the Mesos slaves, periodically checks which services are running, and generates a new HAProxy configuration. They offer instructions on how to implement this method here.
  • Zookeeper can be used for cluster service discovery. There are some who believe it isn't a viable solution due to its eventually consistent state.
  • Eureka from Netflix is another option for service discovery.
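The first approach is worth a closer look: a script periodically asks Mesos (or Marathon) which tasks are running and regenerates the HAProxy configuration from the answer. A simplified sketch of that generation step in Ruby (the service names, ports, and task addresses are invented for illustration; Mesosphere's actual script works differently):

```ruby
# Each running service is known by name, a stable frontend service port,
# and the host:port pairs of its running tasks (invented sample data).
services = {
  'web' => { service_port: 10_000, tasks: ['10.0.0.11:31100', '10.0.0.12:31205'] }
}

# Build an HAProxy config fragment: one listener per service,
# one backend server line per running task.
def haproxy_config(services)
  services.map do |name, svc|
    lines = ["listen #{name} *:#{svc[:service_port]}", '  mode tcp', '  balance roundrobin']
    svc[:tasks].each_with_index do |addr, i|
      lines << "  server #{name}-#{i} #{addr} check"
    end
    lines.join("\n")
  end.join("\n\n")
end

puts haproxy_config(services)
```

Rerunning the generator on a timer and reloading HAProxy is what turns this into discovery: clients always talk to the stable service port, while the backend list tracks wherever Mesos happens to have scheduled the tasks.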

Any of these can be viable solutions for solving the service discovery issue. The option that I have been using, from an operations perspective, is accomplished through a Chef library. It also happens that the Chef recipe is included in VMware Big Data Extensions and was one of the early reasons why I became so engrossed in BDE.

The library itself includes multiple methods that solve the service discovery problem in a unique way. Essentially, when a set of nodes connect to the Chef server and run their respective recipes based on their role, they register themselves with the Chef server. The registration allows them to identify which cluster, or application deployment, they are being deployed for. In addition to registering which cluster they are a part of, the nodes also register which role they are running. The library can then build out configuration files based on those roles using IP addresses, short hostnames, FQDNs, and port information.
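Conceptually, that lookup is a scoped search: filter the registered nodes down to the current cluster, then down to a role, and return the addressing details. A stripped-down illustration in plain Ruby (an in-memory list stands in for the Chef server, and the cluster and node names are invented; the real library issues Chef search queries):

```ruby
# In-memory stand-in for nodes registered with the Chef server.
# Each node records its cluster, role, and addressing details.
NODES = [
  { cluster: 'cassandra01', role: 'cassandra_seed', fqdn: 'seed0.bde.local',   ip: '10.0.0.10' },
  { cluster: 'cassandra01', role: 'cassandra_node', fqdn: 'node0.bde.local',   ip: '10.0.0.20' },
  { cluster: 'mesos01',     role: 'mesos_master',   fqdn: 'master0.bde.local', ip: '10.0.1.10' },
]

# Return the FQDNs of every node in a cluster that registered a given role,
# mirroring the shape of a call like all_providers_fqdn_for_role.
def fqdns_for_role(cluster, role, nodes = NODES)
  nodes.select { |n| n[:cluster] == cluster && n[:role] == role }
       .map { |n| n[:fqdn] }
end

puts fqdns_for_role('cassandra01', 'cassandra_seed').inspect
```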

For example, when I was building out the Apache Cassandra option within my BDE environment, the cluster service discovery library helped me automate the configuration of the cassandra.yaml file. The configuration file needed to know which nodes were registered as the seeds within the cluster. I accomplished the configuration by writing a Chef library that utilized the cluster-service-discovery.rb library. The code included the following:

module Cassandra

  def is_seed
    node.role?("cassandra_seed")
  end

  def cassandra_seeds_ip
    servers = all_providers_fqdn_for_role("cassandra_seed")
    Chef::Log.info("Cassandra seed nodes in cluster #{node[:cluster_name]} are: #{servers.inspect}")
    servers
  end

  def cassandra_nodes_ip
    servers = all_providers_fqdn_for_role("cassandra_node")
    Chef::Log.info("Cassandra worker nodes in cluster #{node[:cluster_name]} are: #{servers.inspect}")
    servers
  end

  def wait_for_cassandra_seeds(in_ruby_block = true)
    return if is_seed

    run_in_ruby_block __method__, in_ruby_block do
      set_action(HadoopCluster::ACTION_WAIT_FOR_SERVICE, node[:cassandra][:seed_service_name])
      seed_count = all_nodes_count({"role" => "cassandra_seed"})
      all_providers_for_service(node[:cassandra][:seed_service_name], true, seed_count)
      clear_action
    end
  end

end

class Chef::Recipe; include Cassandra; end

Using the Chef library, I was able to access three methods: all_providers_fqdn_for_role, all_nodes_count, and all_providers_for_service. The first gathers the fully qualified domain name (FQDN) of each node registered with the seed role on the Chef server. The second returns an integer for the total number of seed nodes. The third waits for the total number of nodes to register themselves with the Chef server before proceeding with the configuration. The last bit is key: you don't want to write a configuration file with only a portion of the nodes.
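The waiting behavior boils down to a poll loop: keep asking how many nodes with the role have checked in, and only proceed once the count reaches the expected total. A minimal sketch of that pattern in plain Ruby (the real logic lives in cluster-service-discovery.rb and runs inside a Chef ruby_block; the polling callable here is a stub):

```ruby
# Poll until the registered node count reaches the expected count.
# `registered` is any callable returning the current count (stubbed below).
def wait_for_nodes(expected, registered, attempts: 10, delay: 0)
  attempts.times do
    return true if registered.call >= expected
    sleep delay
  end
  false
end

# Stub: each poll "sees" one more seed node registered with the server.
counts = [1, 2, 3]
poll = -> { counts.shift || 3 }
puts wait_for_nodes(3, poll)
```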

The inclusion of the Chef library on the BDE management server also allows for clusters deployed through it to be automatically updated when nodes are added or removed from the deployed cluster. This helps with dynamic scaling — you can monitor for a trigger and then scale the cluster up/down and not have to worry about manually updating the configuration files for the application service. Ultimately the goal is to have service discovery happen automatically.

The question then becomes how you can leverage a Chef library, running on a Chef server, to assist in service discovery for applications running within a container. The team at Chef has made available tools for managing containers, including Docker, using a Chef server. More information can be seen here.

There is no magic bullet for solving service discovery within a distributed application. Depending on what you are trying to do, the existing mindset and skill set within the organization, and the preferred technologies, the problem can be solved in various ways. The Chef library described above is one method I am using today to assist in the automated deployment of distributed applications.


Upgrading Apache Mesos & Marathon on VMware Big Data Extensions


There are new versions of both Apache Mesos and Mesosphere Marathon. One of the great things about using VMware Big Data Extensions to handle the deployment and management of clusters is the ability to control the packages and versions within the SDLC. A few changes are needed within Chef to support Mesos v0.22.0 and Marathon 0.8.0* on the Big Data Extensions management server.

First, you need to grab the new RPM files and place them on your management server under the /opt/serengeti/www/yum/repos/mesos/0.21.0 directory. For my own installation, I created a new directory called mesos/0.22.0 and copied the packages that had not been updated into it, but that decision is up to you. Once the files are there, you can choose whether to increment the version number in the /opt/serengeti/www/distros/manifest file:

  {
    "name": "mesos",
    "vendor": "MESOS",
    "version": "0.22.0",
    "packages": [
      {
        "package_repos": [
          "https://bde.localdomain/yum/mesos.repo"
        ],
        "roles": [
          "zookeeper",
          "mesos_master",
          "mesos_slave",
          "mesos_docker",
          "mesos_chronos",
          "mesos_marathon"
        ]
      }
    ]
  },

If you choose to update the manifest file, make sure you restart Tomcat on the BDE management server. Normally you would be done at this point, but there were changes to where Mesos and Marathon install two files that we'll need to account for. The old files lived in subdirectories under /usr/local and are now placed in subdirectories of /usr.


exec /usr/bin/marathon


template '/usr/etc/mesos/mesos-slave-env.sh' do
  source 'mesos-slave-env.sh.erb'
  variables(
    zookeeper_server_list: zk_server_list,
    zookeeper_port: zk_port,
    zookeeper_path: zk_path,
    logs_dir: node['mesos']['common']['logs'],
    work_dir: node['mesos']['slave']['work_dir'],
    isolation: node['mesos']['isolation'],
  )
  notifies :run, 'bash[restart-mesos-slave]', :delayed
end
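If you have a mixed fleet during the transition, a small guard in the recipe can prefer the new layout and fall back to the old one. This is only a sketch, not from the BDE cookbooks, and defaulting to the new path when neither file exists is an assumption:

```ruby
# Prefer the new /usr layout; fall back to the old /usr/local layout if
# that is where the file actually exists on this node.
def mesos_slave_env_file
  candidates = ['/usr/etc/mesos/mesos-slave-env.sh',
                '/usr/local/etc/mesos/mesos-slave-env.sh']
  candidates.find { |p| File.exist?(p) } || candidates.first
end

puts mesos_slave_env_file
```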

Update the files on the Chef server by running 'knife cookbook upload -a' and you are all set.

Go forth and deploy your updated Apache Mesos 0.22.0 and Marathon 0.8.0 clusters!

*Marathon 0.8.1 has been released, but Mesosphere has not yet created an RPM file for it.

Apache Kafka Installation Guide in vSphere Big Data Extensions


Let me start off by saying that adding Apache Kafka into the framework of VMware vSphere Big Data Extensions (BDE) has been the most challenging addition of them all. Not from a framework perspective, but from a Chef cookbook and configuration one. There were a few resources for me to rely on for the overall configuration of Kafka, but many of them contradicted one another. It took a solid eight hours of testing and re-testing the recipes before I was able to get a working multi-node Kafka cluster online.

All that being said, it was important for me to get a standardized method for deploying Apache Kafka clusters within the BDE framework. I am aware of several teams that are manually configuring Kafka within an environment today, each with their own insights on how that should be accomplished, and few of them are sharing their methods with one another. Frankly, I feel the lack of collaboration between teams is the biggest challenge for any large-scale organization to overcome. Very rarely is a problem too difficult to solve with technology; it is generally difficult because of a lack of knowledge-sharing between teams and/or organizations.

As I hope all of my readers have come to expect, what follows includes the JSON files necessary to add Apache Kafka into BDE, the Chef recipes, templates, and libraries, and a link to the Virtual Elephant GitHub repository where you can download all of the pieces to add within your own deployments of BDE to further expand your service catalog.

Continue reading “Apache Kafka Installation Guide in vSphere Big Data Extensions”

Apache Cassandra Tutorial for VMware vSphere Big Data Extensions

Apache Cassandra

Apache Cassandra support has been one of the additional clustered software projects I have wanted to add into the vSphere Big Data Extensions framework. In an effort to build out a robust and diverse service catalog for our private cloud environment (which is currently in production use), adding Cassandra is one more service we can make available to our customers.

For those who are unaware of Apache Cassandra, it is a distributed database management system and the Apache Cassandra Project website states that it is:

“…the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra’s support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.”

Below you will find the tutorial for how to implement Apache Cassandra, with the associated JSON files, Chef cookbooks and Chef roles in vSphere Big Data Extensions. All of the files I will be going over are available in GitHub here.

Continue reading “Apache Cassandra Tutorial for VMware vSphere Big Data Extensions”