Flocker Data Volumes for Docker in VMware vSphere

flocker

At VMworld 2015 in San Francisco, support for Flocker data volumes inside a VMware vSphere environment was announced. The announcement was one of the items I was most excited about hearing during the conference. The challenge of data persistence when Dockerizing workloads is prevalent in many organizations today. There are a few projects like Flocker and Rexray from EMC {Code} that are working to address this challenge. As I am working on building my own Cloud Native Application stack for a personal project, being able to maintain persistent data across the stack is key.

For those unfamiliar with Flocker, let me provide a quick overview. In short, it provides data volumes that can be attached to a Docker container, allowing the container to be moved across hosts without losing its data. Flocker describes the solution in a rather neat graphic.

flocker

The way it works is rather simple too. There is a controller node — referred to as the Flocker Control Service — and agents that get installed on the compute nodes running the Docker containers. The vSphere driver for Flocker enables the use of a shared datastore as the place where the Flocker data volumes are provisioned, allowing you to utilize a familiar virtual storage construct within your environment to provide the data persistence necessary within a Cloud Native Application.
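
To make that concrete, the most recognizable way to consume a Flocker volume is through the Docker volume-driver mechanism. A minimal sketch, assuming the separate Flocker plugin for Docker is installed on the host; the volume and container names are my own:

# Create (or re-attach) a Flocker-backed volume and mount it into a container
# -- the volume and container names here are purely illustrative
docker run -d --name web \
  --volume-driver=flocker \
  -v webdata:/var/lib/data \
  nginx

# Re-running the same command on another host re-attaches the same dataset,
# which is what makes the container portable without losing its data.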

VMware has made a driver for integrating Flocker into a vSphere environment available on GitHub. The page includes basic instructions on how to load the driver into a Flocker Agent node and begin utilizing it within a Docker container. As I looked into Flocker and how to run it within my vSphere environment, my specific use-case called for it to become part of my VMware Big Data Extensions framework, so that it could be tied to an Apache Mesos cluster with Marathon.

This project, with Flocker + Apache Mesos, is the reason I spent a good part of the past weekend working on building out a CentOS 7 template for BDE to use. I needed to be able to support running a newer version of Docker in order to support Flocker. The details on my effort to build a CentOS 7 template VM for BDE are covered in this post. Between that effort and adding Flocker support, the pieces are all starting to come together.

Flocker Support in VMware Big Data Extensions

When you look at the architecture slide on the ClusterHQ Flocker site, it becomes rather clear that adding Flocker to an Apache Mesos cluster deployment is a natural next step. The controller node could be deployed within the same VM running the Apache Mesos master or as a standalone VM. The agents can be installed and configured on the Apache Mesos slaves with Docker.

Side-Note: A few weeks back, I tried using the Debian 7 template that was released alongside the BDE Fling almost a year ago. Unfortunately it failed with a large number of Chef configuration errors when it was used to deploy an Apache Mesos cluster. Rather than fight it, I went back to using a combination of Photon and CentOS 6 for my deployments within my lab environment until I got the new CentOS 7 template working.

The ClusterHQ Flocker documentation provides directions on how to get Flocker running on an Ubuntu or CentOS 7 node. I used that documentation to help me construct the Chef recipes I needed for BDE. The first decision I made, at least for now, is to install the Flocker Control Service on a dedicated VM. By doing so, it allowed me to create a new cluster definition for a Mesos cluster with Flocker support, while leaving the original cluster specification file untouched.

/opt/serengeti/www/specs/Ironfan/mesos/flocker/spec.json

  1 {
  2   "nodeGroups":[
  3     {
  4       "name": "Zookeeper",
  5       "roles": [
  6         "zookeeper"
  7       ],
  8       "groupType": "zookeeper",
  9       "instanceNum": "[3,3,3]",
 10       "instanceType": "[SMALL]",
 11       "cpuNum": "[1,1,64]",
 12       "memCapacityMB": "[7500,3748,min]",
 13       "storage": {
 14         "type": "[SHARED,LOCAL]",
 15         "sizeGB": "[2,2,min]"
 16       },
 17       "haFlag": "on"
 18     },
 19     {
 20       "name": "Master",
 21       "description": "The Mesos master node",
 22       "roles": [
 23         "mesos_master",
 24         "mesos_chronos",
 25         "mesos_marathon"
 26       ],
 27       "groupType": "master",
 28       "instanceNum": "[2,1,2]",
 29       "instanceType": "[MEDIUM,SMALL,LARGE,EXTRA_LARGE]",
 30       "cpuNum": "[1,1,64]",
 31       "memCapacityMB": "[7500,3748,max]",
 32       "storage": {
 33         "type": "[SHARED,LOCAL]",
 34         "sizeGB": "[1,1,min]"
 35       },
 36       "haFlag": "on"
 37     },
 38     {
 39       "name": "Flocker Control",
 40       "description": "Flocker control node",
 41       "roles": [
 42         "flocker_control"
 43       ],
 44       "groupType": "master",
 45       "instanceNum": "[1,1,1]",
 46       "instanceType": "[SMALL,MEDIUM]",
 47       "cpuNum": "[1,1,16]",
 48       "memCapacityMB": "[3748,3748,max]",
 49       "storage": {
 50         "type": "[SHARED,LOCAL]",
 51         "sizeGB": "[1,1,min]"
 52       },
 53       "haFlag": "on"
 54     },
 55     {
 56       "name": "Slave",
 57       "description": "The Mesos slave node",
 58       "roles": [
 59         "mesos_slave",
 60         "mesos_docker",
 61         "flocker_agent"
 62       ],
 63       "instanceType": "[MEDIUM,SMALL,LARGE,EXTRA_LARGE]",
 64       "groupType": "worker",
 65       "instanceNum": "[3,1,max]",
 66       "cpuNum": "[1,1,64]",
 67       "memCapacityMB": "[7500,3748,max]",
 68       "storage": {
 69         "type": "[SHARED,LOCAL]",
 70         "sizeGB": "[1,1,min]"
 71       },
 72       "haFlag": "off"
 73     }
 74   ]
 75 }

Note: I enabled HA support for the Flocker Control node within the specification file since only a single VM will be deployed for that role within the cluster.

The corresponding MAP entry included the following lines.

/opt/serengeti/www/specs/map

 30   {
 31     "vendor" : "Mesos",
 32     "version" : "^(\\d)+(\\.\\w+)*",
 33     "type" : "Mesos with Flocker Cluster",
 34     "appManager" : "Default",
 35     "path" : "Ironfan/mesos/flocker/spec.json"
 36   },

With the JSON cluster specification file created and the MAP entry added, the next step is to create two new Chef roles — flocker_control and flocker_agent.

/opt/serengeti/chef/roles/flocker_control.rb

 1 name        'flocker_control'
 2 description 'Deploy the Flocker Control Node to support an Apache Mesos cluster.'
 3 
 4 run_list *%w[
 5   role[basic]
 6   flocker::default
 7   flocker::control
 8 ]
 9

/opt/serengeti/chef/roles/flocker_agent.rb

 1 name        'flocker_agent'
 2 description 'Deploy the Flocker agent on Apache Mesos worker node.'
 3 
 4 run_list *%w[
 5   role[basic]
 6   flocker::default
 7   flocker::agent
 8 ]
 9

The flocker_control Chef role will be used to install and configure the standalone VM running the Flocker Control Service. The flocker_agent role will be used to configure the Flocker agent on each of the Apache Mesos worker nodes.

Chef Recipes for Flocker

I broke the Chef recipes into two primary ones, one for the Control node and one for the Agents, along with a shared default recipe that handles the package installation. In addition to those, I created a library recipe, an attributes file and several templates. I have included the recipes below; the remaining files can be downloaded from the GitHub repository.
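
For reference, the cookbook layout I ended up with looks roughly like this. The template names match the sources referenced in the recipes below; the attributes and library filenames are simply the defaults I chose:

flocker/attributes/default.rb
flocker/libraries/default.rb
flocker/recipes/default.rb
flocker/recipes/control.rb
flocker/recipes/agent.rb
flocker/templates/default/clusterhq.repo.erb
flocker/templates/default/cluster.crt.erb
flocker/templates/default/cluster.key.erb
flocker/templates/default/control-service.crt.erb
flocker/templates/default/control-service.key.erb
flocker/templates/default/agent.yml.erb
flocker/templates/default/node.crt.erb
flocker/templates/default/node.key.erb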

/opt/serengeti/chef/cookbooks/flocker/recipes/default.rb

  1 # Cookbook Name:: flocker
  2 # Recipe:: default
  3 
  4 include_recipe "java::sun"
  5 include_recipe "hadoop_common::pre_run"
  6 include_recipe "hadoop_common::mount_disk"
  7 include_recipe "hadoop_cluster::update_attributes"
  8 
  9 set_bootstrap_action(ACTION_INSTALL_PACKAGE, 'flocker', true)
 10 
 11 # Setup the new repositories
 12 template '/etc/yum.repos.d/clusterhq.repo' do
 13   source 'clusterhq.repo.erb'
 14   action :create
 15 end
 16 
 17 # Dependency packages
 18 %w{clusterhq-flocker-node clusterhq-flocker-cli}.each do |pkg|
 19   package pkg do
 20     action :install
 21   end
 22 end
 23 
 24 clear_bootstrap_action

/opt/serengeti/chef/cookbooks/flocker/recipes/control.rb

  1 #
  2 # Cookbook Name:: flocker
  3 # Recipe:: control
  4 #
  5 # Licensed under the Apache License, Version 2.0 (the "License");
  6 # you may not use this file except in compliance with the License.
  7 # You may obtain a copy of the License at
  8 #
  9 #       http://www.apache.org/licenses/LICENSE-2.0
 10 #
 11 # Unless required by applicable law or agreed to in writing, software
 12 # distributed under the License is distributed on an "AS IS" BASIS,
 13 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 14 # See the License for the specific language governing permissions and
 15 # limitations under the License
 16
 17 include_recipe 'flocker::install'
 18
 19 set_bootstrap_action(ACTION_INSTALL_PACKAGE, 'flocker_control', true)
 20
 21 conf_dir = node[:flocker][:conf_dir]
 22 temp_dir = node[:flocker][:temp_dir]
 23 directory conf_dir do
 24   owner 'root'
 25   group 'root'
 26   mode '0700'
 27   action :create
 28 end
 29
 30 template '/etc/flocker/cluster.key' do
 31   source 'cluster.key.erb'
 32   mode '0600'
 33   owner 'root'
 34   group 'root'
 35   action :create
 36 end
 37
 38 template '/etc/flocker/cluster.crt' do
 39   source 'cluster.crt.erb'
 40   mode '0600'
 41   owner 'root'
 42   group 'root'
 43   action :create
 44 end
 45
 46 template '/etc/flocker/control-service.crt' do
 47   source 'control-service.crt.erb'
 48   mode '0600'
 49   owner 'root'
 50   group 'root'
 51   action :create
 52 end
 53
 54 template '/etc/flocker/control-service.key' do
 55   source 'control-service.key.erb'
 56   mode '0600'
 57   owner 'root'
 58   group 'root'
 59   action :create
 60 end
 61
 62 # Generate authentication certificates
 63 #execute 'generate_system_certs' do
 64 #  cwd conf_dir
 65 #  command 'flocker-ca initialize $CLUSTERNAME'
100 # Start the Flocker Control service
101 is_control_running = system("systemctl status #{node[:flocker][:control_service_name]}")
102 service "restart-#{node[:flocker][:control_service_name]}" do
103   service_name node[:flocker][:control_service_name]
104   supports :status => true, :restart => true
105
106 end if is_control_running
107
108 service "start-#{node[:flocker][:control_service_name]}" do
109   service_name node[:flocker][:control_service_name]
110   action [ :enable, :start ]
111   supports :status => true, :restart => true
112
113 end
114
115 # Register with cluster_service_discovery
116 provide_service(node[:flocker][:control_service_name])
117 clear_bootstrap_action
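
The commented-out section above hints at the other piece: the certificate templates assume the cluster, control-service and node certificates already exist. I generated mine ahead of time with the flocker-ca tool that ships in the clusterhq-flocker-cli package and dropped the results into the cookbook templates. A rough sketch of that one-time step; the cluster and host names are placeholders from my lab:

# Run on any machine with clusterhq-flocker-cli installed; initialize creates
# cluster.crt and cluster.key in the working directory
flocker-ca initialize mesos-flocker-cluster
# Create the control-service certificate for the control node's hostname
flocker-ca create-control-certificate flocker-control.local.domain
# Create a node certificate for each Flocker agent node
flocker-ca create-node-certificate
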

/opt/serengeti/chef/cookbooks/flocker/recipes/agent.rb

  1 #
  2 # Cookbook Name:: flocker
  3 # Recipe:: agent
  4 #
  5 # Licensed under the Apache License, Version 2.0 (the "License");
  6 # you may not use this file except in compliance with the License.
  7 # You may obtain a copy of the License at
  8 #
  9 #       http://www.apache.org/licenses/LICENSE-2.0
 10 #
 11 # Unless required by applicable law or agreed to in writing, software
 12 # distributed under the License is distributed on an "AS IS" BASIS,
 13 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 14 # See the License for the specific language governing permissions and
 15 # limitations under the License
 16
 17 include_recipe 'flocker::install'
 18
 19 set_bootstrap_action(ACTION_INSTALL_PACKAGE, 'flocker_agent', true)
 20
 21 # Wait for the Flocker control node to be setup and registered
 22 wait_for_flocker_control
 23
 24 conf_dir = node[:flocker][:conf_dir]
 25 directory conf_dir do
 26   owner 'root'
 27   group 'root'
 28   mode '0700'
 29   action :create
 30 end
 31
 32 controls_ip = flocker_control_ip
 33
 34 template '/etc/flocker/agent.yml' do
 35   source 'agent.yml.erb'
 36   action :create
 37   variables(
 38     control_node: controls_ip
 39   )
 40 end
 41
 42 template '/etc/flocker/node.crt' do
 43   source 'node.crt.erb'
 44   action :create
 45   mode '0600'
 46   owner 'root'
 47   group 'root'
 48 end
 49
 50 template '/etc/flocker/node.key' do
 51   source 'node.key.erb'
 52   action :create
 53   mode '0600'
 54   owner 'root'
 55   group 'root'
 56 end
 57
 58 template '/etc/flocker/cluster.crt' do
 59   source 'cluster.crt.erb'
 60   action :create
 61   mode '0600'
 62   owner 'root'
 63   group 'root'
 64 end
 65
 66 # Fix agent.yml hostname string
 67 #   hostname: ["blah.local.domain"]
 68 execute 'fix agent.yml hostname' do
 69   cwd conf_dir
 70   command 'sed -i \'s/\["/"/g\' /etc/flocker/agent.yml && sed -i \'s/"\]/"/g\' /etc/flocker/agent.yml'
 71 end
 72
 73 # Install additional packages
 74 %w{git}.each do |pkg|
 75   package pkg do
 76     action :install
 77   end
 78 end
 79
 80 execute "install-python-pip" do
 81   command 'curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py" && python get-pip.py'
 82 end
 83
 84 # Install the VMware vsphere-flocker-driver
 85 execute 'install vmware-flocker-driver' do
 86   command 'pip install git+https://github.com/vmware/vsphere-flocker-driver.git'
 87 end
 88
 89 # Start the two Flocker agent services
 90 service 'flocker-dataset-agent' do
 91   supports :status => true, :restart => true, :reload => false
 92   action [ :enable, :start ]
 93 end
 94
 95 service 'flocker-container-agent' do
 96   supports :status => true, :restart => true, :reload => false
 97   action [ :enable, :start ]
 98 end
 99
100 provide_service(node[:flocker][:agent_service_name])
101 clear_bootstrap_action

VMware vSphere Flocker Driver

The necessary bits for utilizing the Flocker driver are built into the Chef recipes themselves. However, since the GitHub page does a good job of providing an overview, I would like to highlight the specific bits of the Chef recipes that coincide with that documentation.

I started off by modifying my CentOS 7 VM template for BDE to include the advanced setting in the VMX file.

disk.EnableUUID = "TRUE"

Screen Shot 2015-10-28 at 8.49.24 AM
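
If you would rather set the flag from the command line than through the vSphere Web Client, something along these lines should also work; this assumes the govc CLI is available and the template VM name is a placeholder:

# Add the advanced setting to the powered-off template VM
export GOVC_URL='administrator@vsphere.local:password@vcenter.local.domain'
export GOVC_INSECURE=1
govc vm.change -vm centos7-bde-template -e disk.EnableUUID=TRUE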

The next step was to mark a datastore for Flocker to use for the shared volumes. Because I already had BDE in my environment and it was pointing at three datastores, I modified my implementation a bit. The three datastores BDE was utilizing were actually part of a single Storage DRS cluster, so I removed the Storage DRS cluster and left the three datastores alone. From there, I selected a single datastore for BDE to use going forward and created the necessary Flocker folder.
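
With a datastore selected, the agent.yml file that the flocker_agent recipe templates out needs to point the dataset backend at the vSphere driver and that datastore. Conceptually the rendered file looks something like the following sketch; the backend key names come from the vsphere-flocker-driver README and should be verified against the version you install:

# Conceptual /etc/flocker/agent.yml for the vSphere backend (illustrative values)
cat > /etc/flocker/agent.yml <<'EOF'
version: 1
control-service:
  hostname: "flocker-control.local.domain"
  port: 4524
dataset:
  backend: "vsphere_flocker_plugin"
  vc_ip: "vcenter.local.domain"
  username: "administrator@vsphere.local"
  password: "********"
  datacenter_name: "Lab"
  datastore_name: "flocker-datastore"
EOF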

Deploying an Apache Mesos cluster with Flocker support

With all of the pieces in place, the environment is ready for a new Apache Mesos cluster to be deployed through the BDE framework. Using the vSphere Web Client, I deployed a new cluster for testing. Once the cluster was deployed through BDE and fully configured using the new Chef recipes, I verified that the Apache Mesos, Mesosphere Marathon and Chronos interfaces were all online.
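
Beyond the web interfaces, a quick check on the nodes themselves confirms the Flocker pieces came up as expected. A simple sketch of what I look for, with hostnames from my lab:

# On the Flocker Control Service node
systemctl status flocker-control

# On each Mesos slave / Flocker agent node
systemctl status flocker-dataset-agent flocker-container-agent

# Mesos and Marathon endpoints on the BDE-deployed masters
curl -s http://mesos-master.local.domain:5050/master/state.json | head
curl -s http://mesos-master.local.domain:8080/v2/info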

flocker

flocker

 

mesos-flocker-bde-1

marathon-flocker-1

The cluster is ready for a workload to be deployed into it.
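
As a first smoke test of the data volume piece, a container can be requested through Marathon with the Flocker volume driver passed down to Docker. This is only a sketch: the app id, volume name and image are placeholders, and it assumes the Flocker plugin for Docker is installed on the slaves alongside the agents and a Marathon version that supports the docker parameters pass-through.

cat > redis-flocker.json <<'EOF'
{
  "id": "redis-flocker",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "redis",
      "parameters": [
        { "key": "volume-driver", "value": "flocker" },
        { "key": "volume", "value": "redisdata:/data" }
      ]
    }
  }
}
EOF

curl -X POST http://mesos.local.domain:8080/v2/apps \
  -d @redis-flocker.json -H "Content-Type: application/json"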

Conclusion

VMware is making significant strides into the Cloud Native Apps space with the ability to support Docker, Apache Mesos, Kubernetes and now Flocker. In my opinion, being able to provide a simple deployment framework for these applications through VMware Big Data Extensions is a key success factor. Tying these pieces together tells a very compelling story for organizations already using the VMware SDDC within their environments and transitioning to the Cloud Native Apps arena. Future posts will revolve around my own work in the space to build a Cloud Native distributed application running entirely on a vSphere environment.

I am very excited about this project and getting Flocker in place was a critical factor for my success. As always, all of the necessary files to add support for Flocker into VMware Big Data Extensions are available on the Virtual Elephant GitHub repository.

Stay tuned and if you have any questions about what I’ve covered in these posts, please reach out to me over Twitter, LinkedIn or email. Enjoy!

 

 

Docker Minecraft Containers to the Rescue!

docker

My 11-year old son has been after me to get a Minecraft server set up locally that he could connect to and play on from any of the computers in the house. Fortunately for him, my home lab environment is running a few different Docker container hosts, and I figured loading up a couple of servers via Docker would be the simplest solution. A quick search on Docker Hub yielded several results for Minecraft and, after looking through the most popular options, I settled on the itzg/minecraft-server image.

I deployed a new Photon VM to act as the container host and after a slow 180 second install, I was pulling the image down to the host. I wanted to run a couple different options for him, so that he could choose what type of server to log into. I created a couple simple shell scripts to load the different server types.

cre-minecraft.sh

  1 docker run -d -it \
  2 -e EULA=TRUE -e DIFFICULTY=normal -e VERSION=LATEST \
  3 -e MODE=creative -e PVP=false -e LEVEL_TYPE=LARGEBIOMES \
  4 -e 'JVM_OPTS=-Xmx4096M -Xms4096M' \
  5 -p 25566:25565 --name cre-minecraft \
  6 -v /opt/minecraft2:/data itzg/minecraft-server

adv-minecraft.sh

  1 docker run -d -it \
  2 -e EULA=TRUE -e DIFFICULTY=normal -e VERSION=LATEST \
  3 -e MODE=adventure -e PVP=false -e LEVEL_TYPE=AMPLIFIED \
  4 -e 'JVM_OPTS=-Xmx8192M -Xms8192M' \
  5 -p 25565:25565 --name adv-minecraft \
  6 -v /opt/minecraft:/data itzg/minecraft-server

After that it was just a matter of starting both containers and checking that they were listening on the correct host ports.
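
The check itself is nothing fancy: list the running containers and confirm the published ports are listening on the host.

# On the Photon container host
docker ps
# Matches both 25565 and 25566
ss -lnt | grep 2556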

docker-minecraft-1

docker-minecraft-2

The Docker Minecraft containers worked wonderfully this evening. He was excited to see that I was able to spin up as many servers as he wanted to run and that he could easily hop between them in his Minecraft client. It was a fun little experiment with Docker volumes for me as well — I had not used them yet — resulting in a very happy kid!

CentOS 7 Template for VMware Big Data Extensions

cantos-logo-long

The CentOS 6 template within VMware Big Data Extensions was becoming a bit long in the tooth and needed to be updated to CentOS 7 for a variety of reasons. As I began to look at using some of the newer features in Docker, it became apparent CentOS 6 was no longer going to be a useful template VM. I tried using the Debian 7 template included as part of the VMware Fling for BDE released last year, however it had several problems with the Chef recipes. The effort required to get a CentOS 7 template built and working with BDE took a bit of trial-and-error; this post will simplify it so others can get going in a more timely manner.

The documentation for building an alternate VM template for BDE is a bit outdated, specifically referring to building a CentOS 6 template back when BDE was still using the CentOS 5 branch. I started with those directions and pieced them together with some of the work I had done previously getting Photon to support Apache Mesos. Let's get started with a base CentOS 7 VM.

CentOS 7 VM Installation

Start by downloading the CentOS 7 minimal ISO file from a local mirror. Once you have the ISO, go ahead and create a new VM in your vCenter environment. Make sure you provide a 40GB disk drive — I allocated a single vCPU and 1GB of memory to my template VM. Mount the ISO file and power on the VM. Once you've gone through the installation, reboot the VM and SSH into it.

The first thing I like to do is make sure the OS is up-to-date, followed by installing VMware Tools.

# yum -y update
# yum -y install open-vm-tools
# cat <<-EOF >> /etc/yum.repos.d/vmware-tools.repo
> [vmware-tools]
> name = VMware Tools
> baseurl = http://packages.vmware.com/packages/rhel7/x86_64/
> enabled = 1
> gpgcheck = 1
> EOF
# curl -o vmware-dsa.pub http://packages.vmware.com/tools/keys/VMWARE-PACKAGING-GPG-DSA-KEY.pub
# curl -o vmware-rsa.pub http://packages.vmware.com/tools/keys/VMWARE-PACKAGING-GPG-RSA-KEY.pub
# rpm --import vmware-dsa.pub
# rpm --import vmware-rsa.pub
# yum -y install open-vm-tools-deploypkg
# systemctl restart vmtoolsd
# echo localhost.localdomain > /etc/hostname
# shutdown -h now

The above steps will create a basic CentOS 7 VM that can be cloned to a template for use within your environment going forward. Once you are satisfied with the VM, the next step is to install the BDE specific bits for it to function properly.

Configure CentOS 7 Template for BDE

Power on the VM and SSH back into it. The next things to do are to install the JDK and several customization packages that are provided on the BDE management server. There are a few minor modifications that have to be made in order to get it working.

# systemctl disable firewalld
# systemctl stop firewalld
# mkdir os && cd os
# curl --insecure -o custos.tar.gz https://BDE_MGMT_SERVER/custos/custos.tar.gz
# tar zxf custos.tar.gz
# vim installer.sh
 48 #reduce grub boot waiting time
 49 #sed -i 's|^timeout=.*$|timeout=0|' /boot/grub/grub.conf
...
143 #stop firewall
144 #service iptables stop
145 #chkconfig iptables off
# ./installer.sh

When the installer completes, the terminal screen should look like the following:

centos-7-template

 

One of the packages installed is chef-client. I prefer to run a quick check to make sure the binary is installed properly.

# which chef-client
/usr/bin/chef-client
# sed -i 's/enforcing/disabled/g' /etc/selinux/config
# shutdown -h now

Notice that I turned off SELinux, as it was interfering with the latest version of Docker when the Docker service tried to start.

The template can now be placed in the BDE vApp, and Tomcat on the management server can be restarted to pick up the new template.

Chef Recipe Modifications

Simply building a new CentOS 7 template VM and throwing it into the vApp is not all that is required. The next steps took me through quite a bit of trial-and-error before I had a cluster deploying properly again. Many of the Chef recipes need to be modified to account for newer package versions, configuration changes and other configuration performed within the recipes. I had to step through each service role one-by-one and make sure they were all working properly.

Rather than go through each and every recipe included on the BDE Management server, I will merely say there were a myriad of changes and you can download my updated Chef recipes from the GitHub repo for Virtual Elephant.

I would strongly encourage you to take a backup of the entire BDE Management VM, snapshot the VM or create an off-site copy of the /opt/serengeti/chef/cookbooks directory before pulling my changes into your environment.

Once all of the recipes are updated, be sure to run the ‘knife cookbook upload -a’ command on the BDE Management server. Then the template will be fully ready to be utilized within your environment.
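
On my management server, that update step boils down to two commands; the Tomcat service name is what my BDE appliance uses, so adjust if yours differs:

# Push the updated cookbooks to the Chef server, then restart the BDE web service
knife cookbook upload -a
service tomcat restart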

Getting a CentOS 7 VM template was a necessity for me with some of the work I am doing in my lab environments. The next few posts on the site will be focused around these efforts and they would not have been possible if I had not done this work up-front. When my wife asked what I had been working on for the past few nights, I had to explain to her that I had gotten into a bit of a rabbit-hole and I’ve finally come back out…just to start on the work I wanted to begin several days ago.

Docker Container for IO Benchmarking

docker

I had a need this week for a quick-and-easy IO benchmarking tool and decided to create a Docker container to achieve my goals. The container itself is rather simple and is available for use at your leisure. It is based on ubuntu:latest, installs the FIO package and grabs a simple test file from my website for generating load and benchmarking results.

Dockerfile

  1 # FIO benchmark on Ubuntu:latest
  2 
  3 FROM ubuntu:latest
  4 MAINTAINER Chris Mutchler <chris@virtualelephant.com>
  5 
  6 RUN apt-get update
  7 RUN apt-get -y install fio wget
  8 RUN wget http://virtualelephant.com/wp-content/uploads/2015/10/threads.txt
  9 
 10 CMD [ "/bin/bash" ]
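
Building the image and pushing it up to Docker Hub is the standard workflow:

# Build the image from the Dockerfile above and push it to my Docker Hub repo
docker build -t chrismutchler/iobench .
docker push chrismutchler/iobench
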

Once I had the Docker image uploaded to my repo, I created a simple JSON file for deploying the container into my Mesos cluster.

iobench.json

{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "chrismutchler/iobench"
    }    
  },
  "id": "iobench",
  "instances": 1,
  "cpus": 0.25,
  "mem": 256,
  "uris": [],
  "cmd": "while true; do date; /usr/bin/fio /threads.txt; sleep 10; done"
}

As noted, the Docker container will continue to run the FIO test indefinitely until the app is destroyed in Marathon. The app can also be scaled to run across multiple Apache Mesos nodes, each running the FIO test independently. Be careful when running any sort of load or benchmarking test in your environment, as it may have adverse effects.

Launching the application in Marathon was simple enough from the command line.

# curl -X POST http://mesos.local.domain:8080/v2/apps -d @iobench.json -H "Content-Type: application/json"
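
Scaling the benchmark out across the cluster, as mentioned above, is just another call against the same app; a sketch that bumps it to four instances:

# Scale the iobench app out to four instances across the Mesos slaves
curl -X PUT http://mesos.local.domain:8080/v2/apps/iobench \
  -d '{ "instances": 4 }' -H "Content-Type: application/json"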

I used the test to create a noisy-neighbor issue in my environment to test out VMware Storage IO Control (SIOC) settings, and it worked adequately in this role. Once the application has been launched, the results are available for downloading or viewing in the Mesos UI by selecting ‘Sandbox’ and then the STDOUT log file. To understand what FIO performs for the IO benchmark, please read the blog post by Ben Martin, from which I copied the FIO test file.

stdout

--container="mesos-20151009-234723-2583865536-5050-2901-S1.c0052620-2322-4d12-aa49-1e3b24402f50" --docker="docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/20151009-234723-2583865536-5050-2901-S1/frameworks/20151009-234723-2583865536-5050-2901-0001/executors/iobench.cb93c618-6eea-11e5-9209-0050569a4da8/runs/c0052620-2322-4d12-aa49-1e3b24402f50" --stop_timeout="0ns"
--container="mesos-20151009-234723-2583865536-5050-2901-S1.c0052620-2322-4d12-aa49-1e3b24402f50" --docker="docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/20151009-234723-2583865536-5050-2901-S1/frameworks/20151009-234723-2583865536-5050-2901-0001/executors/iobench.cb93c618-6eea-11e5-9209-0050569a4da8/runs/c0052620-2322-4d12-aa49-1e3b24402f50" --stop_timeout="0ns"
Registered docker executor on dhcp2-157.local.domain
Starting task iobench.cb93c618-6eea-11e5-9209-0050569a4da8
Sat Oct 10 01:04:40 UTC 2015
bgwriter: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
queryA: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
queryB: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
bgupdater: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=16
fio-2.1.3
Starting 4 processes
bgwriter: Laying out IO file(s) (1 file(s) / 256MB)
queryA: Laying out IO file(s) (1 file(s) / 256MB)
queryB: Laying out IO file(s) (1 file(s) / 256MB)
bgupdater: Laying out IO file(s) (1 file(s) / 32MB)

bgwriter: (groupid=0, jobs=1): err= 0: pid=8: Sat Oct 10 01:05:32 2015
  write: io=262144KB, bw=11578KB/s, iops=2894, runt= 22641msec
    slat (usec): min=16, max=4713.1K, avg=334.93, stdev=20011.69
    clat (usec): min=9, max=4750.2K, avg=10711.15, stdev=112034.10
     lat (usec): min=75, max=4750.6K, avg=11048.59, stdev=113823.09
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    6],
     | 30.00th=[    6], 40.00th=[    7], 50.00th=[    7], 60.00th=[    7],
     | 70.00th=[    8], 80.00th=[    8], 90.00th=[    9], 95.00th=[    9],
     | 99.00th=[   16], 99.50th=[   31], 99.90th=[  947], 99.95th=[ 1582],
     | 99.99th=[ 4752]
    bw (KB  /s): min=    5, max=39208, per=100.00%, avg=15073.13, stdev=7536.85
    lat (usec) : 10=0.01%, 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.05%, 4=12.47%, 10=85.23%, 20=1.46%, 50=0.34%
    lat (msec) : 250=0.05%, 500=0.19%, 750=0.05%, 1000=0.05%, 2000=0.05%
    lat (msec) : >=2000=0.05%
  cpu          : usr=2.60%, sys=8.83%, ctx=126969, majf=0, minf=26
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=65536/d=0, short=r=0/w=0/d=0
queryA: (groupid=0, jobs=1): err= 0: pid=9: Sat Oct 10 01:05:32 2015
  read : io=262144KB, bw=12106KB/s, iops=3026, runt= 21654msec
    clat (usec): min=26, max=4713.2K, avg=322.55, stdev=20018.65
     lat (usec): min=26, max=4713.2K, avg=322.77, stdev=20018.66
    clat percentiles (usec):
     |  1.00th=[   31],  5.00th=[   71], 10.00th=[   77], 20.00th=[  115],
     | 30.00th=[  126], 40.00th=[  155], 50.00th=[  159], 60.00th=[  165],
     | 70.00th=[  183], 80.00th=[  207], 90.00th=[  253], 95.00th=[  314],
     | 99.00th=[  724], 99.50th=[ 1012], 99.90th=[ 2576], 99.95th=[ 4384],
     | 99.99th=[272384]
    bw (KB  /s): min=    7, max=24840, per=67.92%, avg=16686.43, stdev=7338.70
    lat (usec) : 50=4.04%, 100=11.27%, 250=74.04%, 500=8.29%, 750=1.42%
    lat (usec) : 1000=0.45%
    lat (msec) : 2=0.36%, 4=0.09%, 10=0.03%, 20=0.01%, 50=0.01%
    lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%, >=2000=0.01%
  cpu          : usr=2.93%, sys=7.36%, ctx=131487, majf=65536, minf=31
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=65536/w=0/d=0, short=r=0/w=0/d=0
queryB: (groupid=0, jobs=1): err= 0: pid=10: Sat Oct 10 01:05:32 2015
  read : io=262144KB, bw=11910KB/s, iops=2977, runt= 22010msec
    clat (usec): min=27, max=4714.3K, avg=325.85, stdev=20020.01
     lat (usec): min=27, max=4714.3K, avg=326.11, stdev=20020.01
    clat percentiles (usec):
     |  1.00th=[   32],  5.00th=[   72], 10.00th=[   82], 20.00th=[  117],
     | 30.00th=[  139], 40.00th=[  155], 50.00th=[  159], 60.00th=[  169],
     | 70.00th=[  187], 80.00th=[  213], 90.00th=[  262], 95.00th=[  318],
     | 99.00th=[  700], 99.50th=[  980], 99.90th=[ 2608], 99.95th=[ 4768],
     | 99.99th=[272384]
    bw (KB  /s): min=    2, max=23384, per=65.75%, avg=16152.90, stdev=6882.55
    lat (usec) : 50=2.90%, 100=9.19%, 250=76.15%, 500=9.50%, 750=1.34%
    lat (usec) : 1000=0.43%
    lat (msec) : 2=0.32%, 4=0.09%, 10=0.03%, 20=0.01%, 50=0.01%
    lat (msec) : 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%, 2000=0.01%
    lat (msec) : >=2000=0.01%
  cpu          : usr=3.81%, sys=6.74%, ctx=131493, majf=65536, minf=30
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=65536/w=0/d=0, short=r=0/w=0/d=0
bgupdater: (groupid=0, jobs=1): err= 0: pid=11: Sat Oct 10 01:05:32 2015
  read : io=16416KB, bw=3611.1KB/s, iops=902, runt=  4545msec
    slat (usec): min=81, max=285023, avg=520.41, stdev=5536.65
    clat (usec): min=1, max=2841, avg= 5.40, stdev=80.88
     lat (usec): min=115, max=285031, avg=528.39, stdev=5537.20
    clat percentiles (usec):
     |  1.00th=[    2],  5.00th=[    2], 10.00th=[    2], 20.00th=[    2],
     | 30.00th=[    2], 40.00th=[    2], 50.00th=[    2], 60.00th=[    2],
     | 70.00th=[    2], 80.00th=[    3], 90.00th=[    3], 95.00th=[    3],
     | 99.00th=[    3], 99.50th=[    6], 99.90th=[ 1448], 99.95th=[ 2384],
     | 99.99th=[ 2832]
    bw (KB  /s): min= 2096, max= 4472, per=14.60%, avg=3587.00, stdev=1065.00
  write: io=16352KB, bw=3597.9KB/s, iops=899, runt=  4545msec
    slat (usec): min=74, max=254089, avg=489.47, stdev=3971.19
    clat (usec): min=1, max=2233, avg= 5.13, stdev=66.52
     lat (usec): min=77, max=254096, avg=497.09, stdev=3971.57
    clat percentiles (usec):
     |  1.00th=[    2],  5.00th=[    2], 10.00th=[    2], 20.00th=[    2],
     | 30.00th=[    2], 40.00th=[    2], 50.00th=[    2], 60.00th=[    2],
     | 70.00th=[    2], 80.00th=[    3], 90.00th=[    3], 95.00th=[    3],
     | 99.00th=[    3], 99.50th=[    5], 99.90th=[ 1256], 99.95th=[ 1816],
     | 99.99th=[ 2224]
    bw (KB  /s): min= 2024, max= 4504, per=28.58%, avg=3514.88, stdev=1085.81
    lat (usec) : 2=0.04%, 4=99.26%, 10=0.24%, 50=0.18%, 100=0.04%
    lat (usec) : 250=0.06%, 500=0.02%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.07%, 4=0.06%
  cpu          : usr=8.30%, sys=4.71%, ctx=16460, majf=0, minf=25
  IO depths    : 1=99.8%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=4104/w=4088/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=540704KB, aggrb=24566KB/s, minb=3611KB/s, maxb=12106KB/s, mint=4545msec, maxt=22010msec
  WRITE: io=278496KB, aggrb=12300KB/s, minb=3597KB/s, maxb=11578KB/s, mint=4545msec, maxt=22641msec

Disk stats (read/write):
    dm-2: ios=135188/67188, merge=0/0, ticks=39356/28285, in_queue=67640, util=97.96%, aggrios=135217/70245, aggrmerge=0/0, aggrticks=1492/4673, aggrin_queue=6169, aggrutil=15.06%
    dm-0: ios=135217/70245, merge=0/0, ticks=1492/4673, in_queue=6169, util=15.06%, aggrios=0/0, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
  loop1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
  loop0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

If you would like to grab the Docker image and use it for your own needs, you can pull it from Docker Hub:

# docker pull chrismutchler/iobench

One of the next little Docker containers I will be building is a web page traffic generator — I’ve used all sorts of traffic testing applications in the past, but being able to spin up a test inside a Docker container and then scale it across a large Mesos cluster will simplify things even more!

Enjoy!

VMware Photon TP2 + Mesos Cluster Deployment

photon-dfad9617

VMware Photon TP2 was released on August 27. The new version contains native support for running Mesos and therefore should have allowed the Photon OS to run as a Mesos slave immediately after installation. I would like to think my earlier blog post detailing how to deploy Mesos on top of Photon influenced this functionality.

Download the ISO here.

After conversations with people involved in the project, the idea is for Photon to act only as a Mesos slave, with external Mesos masters and Zookeeper running on Ubuntu/CentOS/Red Hat nodes. Logically, the architecture of a Mesos cluster with Photon OS would look like the following.

 

mesos-photon-cluster

 

In order to deploy the cluster in this fashion, I wanted to find a method for automating as much of it as possible. Currently, one limitation with VMware Big Data Extensions is that it supports only a single template VM. How awesome would it be if you could have multiple template VMs within the vApp and choose which template to deploy based on a pre-defined role? Definitely something to look into.

Regardless, working within the current limitations of BDE, I will describe in detail how I am now deploying Photon OS nodes into a Mesos cluster in as automated a fashion as possible.

Configuring Big Data Extensions

I decided to create a new cluster map for a Mesos cluster that only deployed the Zookeeper and Mesos master nodes. The idea is similar to a Compute-only Hadoop or HDFS-only Hadoop cluster deployment through BDE. All that is required to accomplish this is a JSON file with the new cluster definition and an entry in the /opt/serengeti/www/specs/map file.

/opt/serengeti/www/specs/Ironfan/mesos/master/spec.json

  1 {
  2   "nodeGroups":[
  3     {
  4       "name": "Zookeeper",
  5       "roles": [
  6         "zookeeper"
  7       ],
  8       "groupType": "zookeeper",
  9       "instanceNum": "[3,3,3]",
 10       "instanceType": "[SMALL]",
 11       "cpuNum": "[1,1,64]",
 12       "memCapacityMB": "[7500,3748,min]",
 13       "storage": {
 14         "type": "[SHARED,LOCAL]",
 15         "sizeGB": "[2,2,min]"
 16       },
 17       "haFlag": "on"
 18     },
 19     {
 20       "name": "Master",
 21       "description": "The Mesos master node",
 22       "roles": [
 23         "mesos_master",
 24         "mesos_chronos",
 25         "mesos_marathon"
 26       ],
 27       "groupType": "master",
 28       "instanceNum": "[2,1,2]",
 29       "instanceType": "[MEDIUM,SMALL,LARGE,EXTRA_LARGE]",
 30       "cpuNum": "[1,1,64]",
 31       "memCapacityMB": "[7500,3748,max]",
 32       "storage": {
 33         "type": "[SHARED,LOCAL]",
 34         "sizeGB": "[1,1,min]"
 35       },
 36       "haFlag": "on"
 37     }
 38   ]
 39 }

/opt/serengeti/www/specs/map

 17     "vendor" : "Mesos",
 18     "version" : "^(\\d)+(\\.\\w+)*",
 19     "type" : "Mesos Master-Only Cluster",
 20     "appManager" : "Default",
 21     "path" : "Ironfan/mesos/master/spec.json"
 22   },

Normally, editing the two files would have been all that was required; however, I have modified the Chef cookbooks to include the HAProxy package. I had included it in the install.rb recipe for Mesos, and this causes a problem if there are no slave nodes. I moved the code to the master.rb recipe and updated the Chef server.

/opt/serengeti/chef/cookbooks/mesos/recipes/master.rb

166 directory "/etc/haproxy-marathon-bridge" do
167   owner 'root'
168   group 'root'
169   mode '0755'
170   action :create
171 end
172 
173 template '/usr/local/bin/haproxy-marathon-bridge' do
174   source 'haproxy-marathon-bridge.erb'
175   action :create
176 end
177 
178 all_ips = mesos_masters_ip
179 
180 template '/etc/haproxy-marathon-bridge/marathons' do
181   source 'marathons.erb'
182   variables(
183     haproxy_server_list: all_ips
184   )
185   action :create
186 end
187 
188 execute 'configure haproxy' do
189   command 'chkconfig haproxy on; service haproxy start'
190 end
191 
192 execute 'setup haproxy-marathon-bridge' do
193   command 'chmod 755 /usr/local/bin/haproxy-marathon-bridge; /usr/local/bin/haproxy-marathon-bridge install_cronjob'
194 end
195 
196 template '/usr/local/bin/haproxy-marathon-bridge' do
197   source 'haproxy-marathon-bridge.erb'
198   action :create
199 end

Restart Tomcat on the management server and then the new cluster definition is available for use.

My new cluster, minus the slave nodes, looks like this now.

mesos-no-slaves

Use the new deployment option to deploy the Apache Mesos cluster. Once the cluster is configured and available, note the IP addresses of the two Mesos master nodes. We are going to use those IP addresses within the Photon nodes to pre-populate configuration files so the Photon nodes automatically join the cluster.

Photon Node Configuration

The next step is to configure a Photon node template that will automatically join the Mesos cluster deployed previously. After installing a node with the new TP2 release of Photon, I enabled root login over SSH so that I could quickly configure the node — be sure to turn it back off after you perform the following tasks.

Unfortunately, the version of Mesos that shipped in the released ISO file is 0.22.1, and there is a known conflict with newer versions of Docker. The Photon TP2 ISO included Docker version 1.8.1, and it threw the following error when I tried to start the node as a Mesos slave:

root [ /etc/systemd/system ]# /usr/sbin/mesos-slave --master=zk://192.168.1.126:2181,192.168.1.127:2181,192.168.1.128:2181/mesos_cell --hostname=$(/usr/bin/hostname) --log_dir=/var/log/mesos_slave --containerizers=docker,mesos --docker=/usr/bin/docker --executor_registration_timeout=5mins --ip=$(/usr/sbin/ip -o -4 addr list | grep eno | grep global | awk 'NR==1{print $4}' | cut -d/ -f1)
I0905 18:42:16.588754  4269 logging.cpp:172] INFO level logging started!
I0905 18:42:16.591898  4269 main.cpp:156] Build: 2015-08-20 20:33:22 by 
I0905 18:42:16.592162  4269 main.cpp:158] Version: 0.22.1
Failed to create a containerizer: Could not create DockerContainerizer: Insufficient version of Docker! Please upgrade to >= 1.0.0

The bug was already noted in the updated code on the Photon GitHub repo; however, there is not an updated ISO available. That meant I needed to build my own ISO file from the latest code in the repo.

Note: Make sure the Ubuntu node has plenty of CPU and memory for compiling the ISO image. I was using a 1vCPU and 1GB memory VM in my lab and it took a long time to build the ISO.
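
For reference, the build itself followed the instructions in the Photon repo, roughly the following, assuming the Ubuntu build host already has Docker and the documented build dependencies installed:

# Clone the Photon OS source and build a fresh ISO (the iso target comes from
# the repo's own Makefile/README at the time)
git clone https://github.com/vmware/photon.git
cd photon
sudo make iso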

photon-iso

After successfully building an updated ISO image, I used it to build a new VM. I really enjoy how quickly the Photon OS builds, even in my limited home lab environment.

photon-build-time

I wanted to configure the mesos-slave service to start each time the VM is booted and automatically join the master-only Mesos cluster I deployed above using BDE. That meant I needed to configure the mesos-slave.service file on the Photon node.

/etc/systemd/system/mesos-slave.service

  1 [Unit]
  2 Description=Photon Mesos Slave node
  3 After=network.target docker.service
  4 
  5 [Service]
  6 Restart=on-failure
  7 RestartSec=10
  8 TimeoutStartSec=0
  9 ExecStartPre=/usr/bin/rm -f /tmp/mesos/meta/slaves/latest
 10 ExecStart=/bin/bash -c "/usr/sbin/mesos-slave \
 11 --master=zk://192.168.1.126:2181,192.168.1.127:2181,192.168.1.128:2181/mesos_cell \
 12 --hostname=$(/usr/bin/hostname) \
 13 --log_dir=/var/log/mesos_slave \
 14 --containerizers=docker,mesos \
 15 --docker=/usr/bin/docker \
 16 --executor_registration_timeout=5mins \
 17 --ip=$(/usr/sbin/ip -o -4 addr list | grep eno | grep global | awk 'NR==1{print $4}' | cut -d/ -f1)"
 18 
 19 [Install]
 20 WantedBy=multi-user.target

After creating the service file for systemd, it was then possible to start the service and see it join the Mesos cluster in the UI.
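
Enabling and starting it is standard systemd fare:

# Pick up the new unit file, then enable and start the Mesos slave service
systemctl daemon-reload
systemctl enable mesos-slave
systemctl start mesos-slave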

meson-running

mesos-cluster-1

I shutdown the VM and cloned it to a template for use with the next step.

The final step is to run a workload on the cluster, with Photon providing the hosts for the Docker containers.

Workload Deployment

Launching a container workload on the new cluster was rather straightforward. I used a simple NGINX container and exposed it over port 80.
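
The application definition is about as small as they come. A sketch of that kind of app, exposing the container's port 80 over bridge networking (the host port handling is my own choice here; a host port of 0 lets Marathon pick one from the slave's offered range):

cat > nginx.json <<'EOF'
{
  "id": "nginx",
  "instances": 1,
  "cpus": 0.25,
  "mem": 128,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "nginx",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 80, "hostPort": 0 }
      ]
    }
  }
}
EOF

curl -X POST http://mesos.local.domain:8080/v2/apps \
  -d @nginx.json -H "Content-Type: application/json"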

meson-running-workload

marathon-running-workload

 

A few things, like automatic hostname configuration within Photon based on the DHCP address, are still left to do. But this is a working solution and lets me do some next-level deployment testing using Photon as the mechanism for deploying the Docker containers.

If you have any questions on what I did here, feel free to reach out to me over Twitter.