Virtual Elephant

Cloud Frameworks - Identifying Similarities and Emphasizing Operational Excellence

The AWS Well-Architected Framework, Microsoft Azure Well-Architected Framework, and TOGAF architecture framework share a common focus on identifying requirements, mitigating risks, and achieving operational excellence. Operational excellence involves designing architectures that are efficient, reliable, and easy to manage, with a focus on implementing automation and monitoring systems to reduce the risk of errors and improve efficiency. By leveraging these frameworks, cloud architects and SREs can design and implement cloud infrastructures that are efficient, reliable, and easy to manage, helping to drive business success.

Cloud Architecture Frameworks

As cloud computing continues to gain popularity, cloud architects and SREs are tasked with designing and managing complex cloud infrastructures that meet the needs of their organizations. To help ensure that these infrastructures are well-designed and efficient, many architects and SREs turn to cloud architecture frameworks such as the AWS Well-Architected Framework, the Microsoft Azure Well-Architected Framework, and the TOGAF framework.

As a VMware Certified Design Expert, I spend much of my time discussing architecture and design principles with many different organizations, so it is important to understand the common architecture frameworks that exist today. While these frameworks differ in some ways, they share a number of commonalities, particularly in the requirement and risk gathering phases of cloud architecture design. Additionally, they all emphasize the importance of operational excellence, which refers to designing and implementing cloud architectures that are efficient, reliable, and easy to manage. In this article, we’ll explore these commonalities in more detail and discuss how the following frameworks can be leveraged to achieve operational excellence.

  • Amazon Web Services (AWS) Well-Architected Framework – This framework is designed to help cloud architects build secure, high-performing, resilient, and efficient infrastructure for their applications and workloads on AWS.
  • Microsoft Azure Well-Architected Framework – This framework is similar to the AWS Well-Architected Framework, but it’s tailored to the Azure cloud platform. It helps architects build and deploy solutions that are scalable, performant, secure, and cost-effective.
  • The Open Group Architecture Framework (TOGAF) – This widely used enterprise architecture framework provides a common language, methodology, and tools for designing and managing complex systems. It includes the Architecture Development Method (ADM), an iterative cycle made up of a Preliminary phase and phases A through H with continuous requirements management, as well as a set of best practices and reference models.

Framework Commonalities

Cloud architecture frameworks, such as the AWS Well-Architected Framework, Microsoft Azure Well-Architected Framework, and TOGAF framework, all share a common goal of guiding cloud architects and SREs in designing and deploying cloud infrastructures that meet business and technical requirements while minimizing risk and maximizing efficiency. While there may be differences in the focus and details of each framework, they share several commonalities.

One such commonality is the emphasis on designing and architecting cloud solutions based on established principles and best practices. These principles and practices include modularity, loose coupling, high availability, security, and performance optimization. By adhering to these principles, cloud architects can ensure that their solutions are designed to be resilient, secure, and efficient.

Another commonality is the focus on building cloud solutions that are scalable, secure, highly available, performant, and cost-effective. These requirements are critical for any cloud-based solution, as they ensure that the solution can grow and adapt to changing business needs while remaining secure and efficient.

Finally, cloud architecture frameworks provide guidance on how to manage and optimize cloud resources and costs. By understanding how to manage and optimize resources, cloud architects can ensure that they are building solutions that are cost-effective and efficient. This helps organizations save money and allocate resources more effectively, ultimately driving better business outcomes. It is also a significant difference between the cloud frameworks from AWS, Azure, and Google Cloud and the VMware Certified Design Expert framework.

Cloud architecture frameworks also provide guidance on how to assess both business and technical requirements and map them to appropriate cloud services and technologies. By understanding the requirements, cloud architects can ensure that they are building solutions that meet the needs of the business while leveraging the right tools and technologies to achieve the desired outcomes.

Requirements Gathering

One of the first steps in designing a cloud architecture is gathering requirements. This involves identifying the business and technical requirements of the organization, as well as any constraints or limitations that may impact the design. Similar to the VMware Certified Design Expert (VCDX) framework, each of the AWS, Azure, and TOGAF frameworks provides guidance on requirements gathering, although they differ in their specific approaches.

Just like the VCDX framework, these cloud frameworks differentiate between technical and non-technical requirements. It is critical to identify the key technical and non-technical requirements and to understand how they relate to one another.

Non-technical requirements provide the business context for technical requirements. They define what the solution is supposed to achieve, how it should support the business, and what stakeholders expect from it. Technical requirements then translate those non-technical requirements into specific technical features and capabilities needed to implement the solution.
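
The relationship described above can be sketched as a simple traceability check. This is only an illustration, not something prescribed by any of the frameworks: the `Requirement` class, the `untraced` helper, and the example requirement IDs and descriptions are all hypothetical. The idea is that every technical requirement should point back to at least one non-technical requirement, and every non-technical requirement should be implemented by at least one technical requirement.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """Hypothetical record for one requirement in a design document."""
    req_id: str
    description: str
    technical: bool
    # IDs of the non-technical requirements this one helps satisfy.
    supports: list[str] = field(default_factory=list)


def untraced(requirements: list[Requirement]) -> dict[str, list[str]]:
    """Return traceability gaps in both directions."""
    business_ids = {r.req_id for r in requirements if not r.technical}
    covered = {parent for r in requirements if r.technical for parent in r.supports}
    return {
        # Non-technical requirements that no technical requirement implements.
        "business_without_technical": sorted(business_ids - covered),
        # Technical requirements that trace back to no business requirement.
        "technical_without_business": sorted(
            r.req_id
            for r in requirements
            if r.technical and not (set(r.supports) & business_ids)
        ),
    }


# Example data, invented for illustration only.
requirements = [
    Requirement("BR-1", "Customers can place orders at any time of day", technical=False),
    Requirement("BR-2", "Monthly cloud spend stays within the approved budget", technical=False),
    Requirement("TR-1", "Deploy the order service across two availability zones",
                technical=True, supports=["BR-1"]),
    Requirement("TR-2", "Use a managed message queue for order events", technical=True),
]

gaps = untraced(requirements)
print(gaps)
```

In this example, BR-2 has no technical requirement behind it and TR-2 has no business justification, which are the kinds of gaps worth raising with stakeholders before the design moves forward.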

Here are the requirements categories in the AWS, Azure, and TOGAF frameworks:

  • AWS Well-Architected Framework: The AWS Well-Architected Framework identifies six pillars of cloud architecture: operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability. Under each of these pillars, the framework provides a set of best practices and questions for gathering and prioritizing both technical and non-technical requirements.
  • Microsoft Azure Well-Architected Framework: The Microsoft Azure Well-Architected Framework identifies five pillars of architecture: cost optimization, operational excellence, performance efficiency, reliability, and security. Under each of these pillars, the framework provides guidance on how to gather and prioritize both technical and non-technical requirements.
  • TOGAF: TOGAF provides a comprehensive approach to gathering and analyzing both technical and non-technical requirements. It defines four architecture domains: business, data, application, and technology. Each domain includes a set of artifacts, such as business goals, data models, application portfolios, and technology standards, that can be used to identify and prioritize requirements.

In general, all three frameworks provide guidance on how to gather and prioritize technical and non-technical requirements, and they emphasize the importance of working closely with stakeholders to ensure that all requirements are captured and understood. They also recognize the importance of a clear and consistent understanding of requirements across different stakeholder groups, and they provide guidance on how to communicate requirements effectively to different audiences.

Operational Excellence

One major area where the AWS and Azure Well-Architected Frameworks differ from the VCDX and TOGAF frameworks is the inclusion of the operational excellence pillar or category. Operational excellence refers to designing and implementing cloud architectures that are efficient, reliable, and easy to manage. This involves automating operational tasks, implementing monitoring and alerting systems, and ensuring that architectures are designed to be easily managed and maintained. The closest comparison in the VCDX framework would be the manageability and reliability categories; however, I feel the AWS and Azure frameworks place additional emphasis on this concept, which should be adopted in our architecture designs and organizations.

AWS and Azure define operational excellence similarly, and both provide recommendations for key performance indicators (KPIs) that help an organization assess its maturity.

The operational excellence pillar of the AWS Well-Architected Framework includes best practices for optimizing operations, monitoring, and continually improving processes and procedures. Some of the KPIs used to measure operational excellence in AWS include:

  • Time to detect and resolve incidents: This measures the time it takes to identify and resolve issues that impact the performance or availability of the workload.
  • Change success rate: This measures the percentage of changes that are successfully deployed without causing any service disruptions or errors.
  • Mean time to recover (MTTR): This measures the average time it takes to restore a service or application after an outage or incident.
  • Percentage of manual tasks: This measures the percentage of operations tasks that are performed manually, which can indicate opportunities for automation and process improvement.
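
As a rough sketch of how the KPIs above could be calculated, the following Python example works from a small set of incident and change records. All of the data, variable names, and helper functions are assumptions made up for illustration; in practice the inputs would come from whatever incident and change management tooling an organization already uses. Recovery time is measured here from the start of the incident, which is one of several common conventions.

```python
from datetime import datetime, timedelta


def mean_duration(durations: list[timedelta]) -> timedelta:
    """Average a list of durations; returns zero for an empty list."""
    if not durations:
        return timedelta(0)
    return sum(durations, timedelta(0)) / len(durations)


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100 * part / whole if whole else 0.0


# Hypothetical incident records: (started, detected, resolved).
incidents = [
    (datetime(2023, 3, 1, 10, 0), datetime(2023, 3, 1, 10, 5), datetime(2023, 3, 1, 11, 0)),
    (datetime(2023, 3, 9, 22, 0), datetime(2023, 3, 9, 22, 15), datetime(2023, 3, 9, 22, 40)),
]

# Time to detect: from the start of the incident until it was noticed.
mean_time_to_detect = mean_duration([detected - started for started, detected, _ in incidents])
# Mean time to recover: from the start of the incident until service was restored.
mean_time_to_recover = mean_duration([resolved - started for started, _, resolved in incidents])

# Hypothetical counts for one reporting period.
changes_deployed = 40
changes_without_disruption = 38
operations_tasks = 48
manual_tasks = 12

change_success_rate = percentage(changes_without_disruption, changes_deployed)
manual_task_percentage = percentage(manual_tasks, operations_tasks)

print(f"Mean time to detect:  {mean_time_to_detect}")
print(f"Mean time to recover: {mean_time_to_recover}")
print(f"Change success rate:  {change_success_rate:.1f}%")
print(f"Manual tasks:         {manual_task_percentage:.1f}%")
```

Tracking these values per reporting period, rather than as a single snapshot, makes it easier to see whether automation and process changes are actually improving operations.
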

The operational excellence pillar of the Microsoft Azure Well-Architected Framework focuses on ensuring that cloud workloads are designed to support efficient and effective operations. Some of the KPIs used to measure operational excellence in Azure include:

  • Change failure rate: This measures the percentage of changes that result in a service outage or error.
  • Time to detect and resolve incidents: This measures the time it takes to identify and resolve issues that impact the performance or availability of the workload.
  • Deployment frequency: This measures the frequency of deployments, which can indicate the efficiency of the deployment process and the ability to deliver new features and updates quickly.
  • Service level agreements (SLAs): This measures the percentage of time that the workload meets the SLAs for availability, response time, and other key performance indicators.
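
The deployment and SLA measures above can be illustrated with a short Python sketch. The deployment log, the four-week reporting window, the downtime figure, and the 99.9% availability target are all hypothetical values chosen for the example, and availability is the only SLA dimension modeled here.

```python
from datetime import date

# Hypothetical deployment log for one workload: (day, caused_outage_or_error).
deployments = [
    (date(2023, 3, 1), False),
    (date(2023, 3, 3), True),
    (date(2023, 3, 8), False),
    (date(2023, 3, 15), False),
    (date(2023, 3, 28), False),
]

period_days = 28  # assumed four-week reporting window

# Change failure rate: share of deployments that caused an outage or error.
failed_changes = sum(1 for _, caused_failure in deployments if caused_failure)
change_failure_rate = 100 * failed_changes / len(deployments)

# Deployment frequency, expressed as deployments per week.
deployments_per_week = len(deployments) / (period_days / 7)

# SLA attainment for availability: measured uptime compared with the target.
minutes_in_period = period_days * 24 * 60
downtime_minutes = 30        # assumed total downtime in the window
sla_target_pct = 99.9        # assumed availability target
availability_pct = 100 * (1 - downtime_minutes / minutes_in_period)
sla_met = availability_pct >= sla_target_pct

print(f"Change failure rate:  {change_failure_rate:.1f}%")
print(f"Deployments per week: {deployments_per_week:.2f}")
print(f"Availability:         {availability_pct:.3f}% (target {sla_target_pct}%)")
print(f"SLA met:              {sla_met}")
```

Note that change failure rate is the complement of the change success rate listed for AWS, so an organization working across both platforms can report one or the other consistently.
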

TOGAF does not have a pillar specifically dedicated to operational excellence like the AWS and Azure Well-Architected Frameworks. However, it does provide guidance on ensuring that enterprise architecture is designed to support efficient and effective operations. This includes considerations such as:

  • Defining and documenting the organization’s business processes and IT operations procedures
  • Establishing performance metrics and key performance indicators (KPIs) to measure the effectiveness of operations
  • Ensuring that systems are designed to be scalable, reliable, and maintainable
  • Implementing automated monitoring and alerting systems to detect and respond to incidents and issues
  • Developing and maintaining disaster recovery and business continuity plans
  • Providing training and support to IT operations teams to ensure that they can effectively manage and maintain the organization’s systems and applications

In general, the operational excellence category of the AWS and Azure Well-Architected Frameworks focuses on ensuring that cloud workloads are designed to support efficient and effective operations, with a particular emphasis on automation, monitoring, and continuous improvement. The KPIs used to measure operational excellence can help architects and SREs identify areas for improvement and optimize the performance, availability, and reliability of their cloud workloads.

Conclusion

Cloud computing has become an essential part of modern enterprise IT, and cloud architects play a critical role in designing, deploying, and managing cloud solutions. To help architects succeed in this role, several cloud architecture frameworks have been developed, including AWS Well-Architected Framework, Azure Well-Architected Framework, and TOGAF. While there are some differences in the focus and details of each framework, they all emphasize designing and architecting cloud solutions based on well-established principles and best practices. They also focus on building cloud solutions that are scalable, secure, highly available, performant, and cost-effective, and they provide guidance on assessing business and technical requirements and mapping them to appropriate cloud services and technologies.

One commonality among these frameworks is the emphasis on achieving “operational excellence” in cloud solutions. This involves designing and operating cloud solutions to deliver business value, focusing on areas such as managing and automating operations, monitoring, and optimization. Achieving operational excellence is critical to maximizing the value of cloud solutions while minimizing costs and risks.

In conclusion, cloud architecture frameworks such as AWS Well-Architected Framework, Azure Well-Architected Framework, and TOGAF provide valuable guidance and best practices for cloud architects. By following these frameworks, architects can design, deploy, and manage cloud solutions that meet the needs of the business and technical stakeholders, while achieving operational excellence.