Two of my team members (@jfvanrooyen and @tgelter) have been heads-down for the past 6+ months architecting and building our OpenStack private cloud environment for the Adobe Digital Marketing Cloud. Their hard work has been rewarded: it has become the official OpenStack reference architecture for VMware Integrated OpenStack!
Congratulations to them both! Reach out to any of us if you have questions around our efforts to implement OpenStack in a robust, large-scale enterprise environment.
Day 1 of the Adobe 2015 Tech Summit wrapped up a few hours ago and it was an outstanding day. Beyond the great speakers, the organizers have provided a room with whiteboards on every surface, walls and tables alike, so that groups can get together and brainstorm new ideas. I had the opportunity today to sit down with several architects and engineers to go over the work I have been doing, and documenting, here on Virtual Elephant over the past few months. I received some great feedback from several people, which has motivated me to keep pushing ahead and get these pieces all sorted out.
Today, I have a working lab environment where I can deploy a persistent HDFS data warehouse layer, Mesosphere clusters, Apache Hadoop (compute-only) clusters and Apache Storm clusters through the VMware Big Data Extensions framework. From there, I have begun creating Docker containers, or using pre-existing ones, and loading them into the Mesosphere cell through the API. These containers are then able to access the persistent HDFS layer and write data down to be consumed by Hadoop clusters accessing the same layer.
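To make the container step a little more concrete, here is a minimal sketch of what submitting a Docker container through the API could look like. It assumes Marathon is the scheduler behind the API and uses Marathon's `/v2/apps` endpoint. The hostnames, the image name and the `HDFS_NAMENODE_URI` environment variable are all placeholders for illustration, not values from my lab. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the address of your own Marathon instance.
MARATHON_URL = "http://marathon.example.internal:8080"


def build_app_definition(app_id, image, namenode_uri, instances=1):
    """Return a Marathon-style app definition for a Docker container."""
    return {
        "id": app_id,
        "cpus": 0.5,
        "mem": 512,
        "instances": instances,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "BRIDGE"},
        },
        # Assumption: the containerized app reads the HDFS location from this variable.
        "env": {"HDFS_NAMENODE_URI": namenode_uri},
    }


def build_request(app):
    """Prepare (but do not send) the POST that would submit the app."""
    return urllib.request.Request(
        url=MARATHON_URL + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


app = build_app_definition(
    "/ingest/writer",
    "example/hdfs-writer:latest",
    "hdfs://namenode.example.internal:8020",
)
request = build_request(app)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually submit the app. Handing the HDFS address in through the environment is just one option; baking it into the image's Hadoop configuration would work as well.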
The logical design looks like this:
I have been focused on the layers above the Cloud Management Layer (CML) for the last six months, while my teammates have been focused on the CML and the layers below it, specifically VIO, VSAN, NSX and vSphere. The best part of the journey has been the opportunity to see our efforts combine to create an end-to-end solution. In order to draw the focus away from the diagram, I have avoided calling out specific solutions for the CML, Monitoring and Cluster Deployment Framework. That being said, the work I have done with VMware Big Data Extensions has been leveraged in this work.
The interesting part that has come out of our combined efforts revolves around the capabilities of Heat templates and the JSON files that Big Data Extensions uses today to define clusters. I tweeted about this briefly a while ago and, after several good discussions, I believe the future roadmap for the work will be to add functionality to Big Data Extensions. I am hoping VMware will take the lead on this effort, as I firmly believe it will add great value to the product itself.
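To illustrate how close the two formats are in spirit, here is a rough sketch that turns a cluster definition modeled on a Big Data Extensions spec file into a Heat-style template. This is purely my own illustration of the overlap, not a feature of either product. The flavor table, the image name and the `roles` metadata key are assumptions, and the spec is trimmed to a handful of fields.

```python
import json

# A trimmed cluster definition modeled on a Big Data Extensions spec file.
BDE_SPEC = json.loads("""
{
  "nodeGroups": [
    {"name": "master", "roles": ["hadoop_namenode"],
     "instanceNum": 1, "cpuNum": 2, "memCapacityMB": 7500},
    {"name": "worker", "roles": ["hadoop_datanode"],
     "instanceNum": 3, "cpuNum": 1, "memCapacityMB": 3748}
  ]
}
""")

# Assumed flavor table (name, vCPUs, RAM in MB), ordered smallest first.
FLAVORS = [
    ("m1.small", 1, 2048),
    ("m1.medium", 2, 4096),
    ("m1.large", 4, 8192),
    ("m1.xlarge", 8, 16384),
]


def pick_flavor(cpu_num, mem_mb):
    """Return the smallest assumed flavor that satisfies the node group."""
    for name, vcpus, ram_mb in FLAVORS:
        if vcpus >= cpu_num and ram_mb >= mem_mb:
            return name
    raise ValueError("no flavor large enough for %d vCPU / %d MB" % (cpu_num, mem_mb))


def to_heat_template(spec, image="hadoop-node-image"):
    """Map each node group to an OS::Heat::ResourceGroup of Nova servers."""
    resources = {}
    for group in spec["nodeGroups"]:
        resources[group["name"]] = {
            "type": "OS::Heat::ResourceGroup",
            "properties": {
                "count": group["instanceNum"],
                "resource_def": {
                    "type": "OS::Nova::Server",
                    "properties": {
                        "image": image,  # hypothetical image name
                        "flavor": pick_flavor(group["cpuNum"], group["memCapacityMB"]),
                        "metadata": {"roles": ",".join(group["roles"])},
                    },
                },
            },
        }
    return {"heat_template_version": "2013-05-23", "resources": resources}


template = to_heat_template(BDE_SPEC)
print(json.dumps(template, indent=2))
```

The mapping leaves out everything that makes a real cluster work (networking, storage, software configuration), but it shows why the two definitions feel like natural candidates to be brought together.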
For a large part of 2014, I was involved in a Proof-of-Concept (POC) at work with EMC and a great Adobe storage engineer, Jason (@jason_farns), working on using Isilon as the HDFS layer for virtualized Hadoop clusters. After many hours, long weekends and serious amounts of trial-and-error, a whitepaper has been published on the work we did. This is the first paper I have seen published where I was involved, and it is really exciting to see it out there for everyone to read.
There is always more work to be done, but this was a great start.
With the growing popularity of the “datacenter as an OS” concept and the newer application-level, or container-level, resource managers (Mesos, YARN, Pivotal CF), what role does VMware vSphere Distributed Resource Scheduler (DRS) play? As I work every day in a multi-tenant cloud environment, being able to answer this question effectively is becoming increasingly important. Engineers want to take advantage of the new application methodology that a Mesos resource manager offers them as they look to re-architect legacy applications.
First off, when we talk about these application clusters, two terms tend to get mixed together, which makes the topic confusing to discuss with others. For that reason, I prefer to rely on these two definitions to help clarify the conversation:
- Cluster – A set of compute resources presented externally as an IaaS layer.
- Cell – An application-level or container-level cluster that consumes resources from the IaaS cluster.
In the most basic sense, DRS exists at the cluster level and Mesos exists at the cell level. The next piece to address is the perception of how a cell operates and the reality of how a cell operates.
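The two definitions can be modeled in a few lines. This is a toy sketch of the relationship only: the class and method names are mine, and nothing here reflects how DRS or Mesos actually schedule. A cell grows by taking nodes from the cluster, and tasks are then placed against the cell's own capacity.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """IaaS-level pool of compute (the level where DRS operates)."""
    name: str
    cpu_cores: int
    mem_gb: int
    allocated_cpu: int = 0
    allocated_mem: int = 0

    def allocate(self, cpu, mem_gb):
        """Reserve capacity for a VM, or fail if the cluster is full."""
        if (self.allocated_cpu + cpu > self.cpu_cores
                or self.allocated_mem + mem_gb > self.mem_gb):
            raise ValueError("cluster capacity exceeded")
        self.allocated_cpu += cpu
        self.allocated_mem += mem_gb


@dataclass
class Cell:
    """Application-level cluster (for example Mesos) consuming a Cluster."""
    name: str
    cluster: Cluster
    cpu_cores: int = 0
    mem_gb: int = 0
    tasks: dict = field(default_factory=dict)

    def add_node(self, cpu, mem_gb):
        """Grow the cell with a VM carved out of the underlying cluster."""
        self.cluster.allocate(cpu, mem_gb)
        self.cpu_cores += cpu
        self.mem_gb += mem_gb

    def launch(self, task, cpu, mem_gb):
        """Place a task against the cell's own capacity, not the cluster's."""
        used_cpu = sum(c for c, _ in self.tasks.values())
        used_mem = sum(m for _, m in self.tasks.values())
        if used_cpu + cpu > self.cpu_cores or used_mem + mem_gb > self.mem_gb:
            raise ValueError("cell capacity exceeded")
        self.tasks[task] = (cpu, mem_gb)


cluster = Cluster("iaas-01", cpu_cores=64, mem_gb=512)
cell = Cell("mesos-01", cluster)
cell.add_node(8, 32)
cell.add_node(8, 32)
cell.launch("web", cpu=4, mem_gb=8)
```

Note that the cell only ever sees the capacity of the nodes it was given. The cluster may have plenty of headroom left, but a task larger than the cell is still rejected.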