While I was working on the VMware Private Cloud Architecture team, we regularly made architectural design changes to the many environments we were responsible for. To support those changes, we implemented an Architecture Review Board (ARB) and a Design Decision Process, including a Design Decision Template. Since leaving the Private Cloud team and joining the Advanced Customer Engagement team, I’ve found myself discussing this same process with other enterprise architects to help organizations further mature their processes.
Documenting our design decisions is important for several reasons:
- It allows our operations teams to understand how each environment should be deployed and configured.
- It records changes based on the requirements and constraints at a specific point in time, providing context for the specific challenge we were solving.
- It gives new employees the ability to understand what the environment looks like and the operational challenges faced over time.
The template can vary from organization to organization; however, there are several critical pieces that I believe every design decision should include. Feel free to adapt this template as needed for your organization.
Summarize the problem or issue the design decision is addressing. It can include information about the business use-case, operational issue, or existing risk the design decision addresses.
Describe why the decision is necessary. What is driving the need to fix the problem summarized in the Problem Statement?
Requirements are the most important foundational part of any design decision. Provide clearly defined requirements based on the business use-case you are required to meet. A requirement is a necessity or prerequisite that must be met in order for the decision to have merit.
If there are no requirements specified, then there is no reason to be writing a design decision.
Nearly every design decision will have assumptions, although there are rare cases where there may be none. An assumption is a preconceived thought or belief that you, as the author, are making based on little or no evidence. Assumptions are generally accepted as truths until proven otherwise.
If you are making a large number of assumptions for the design decision, you are encouraged to spend time investigating those assumptions beforehand to determine if they are true or false. Once proven true, they become part of the design decision as constraints or justifications.
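The relationship between requirements, assumptions, and constraints can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DesignDecision` class, its field names, and the sample strings are hypothetical and not part of the template itself, and the sketch simply drops an assumption that turns out to be false.

```python
from dataclasses import dataclass, field


@dataclass
class DesignDecision:
    """Hypothetical record holding a few of the template's sections."""
    problem_statement: str
    requirements: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def check_requirements(self) -> None:
        # Without requirements there is no reason to write the decision.
        if not self.requirements:
            raise ValueError("a design decision needs at least one requirement")

    def resolve_assumption(self, assumption: str, holds: bool) -> None:
        # Once investigated, the statement is no longer an assumption.
        self.assumptions.remove(assumption)
        if holds:
            # Proven true: it now binds the design as a constraint.
            self.constraints.append(assumption)


# Illustrative values only.
decision = DesignDecision(
    problem_statement="Hosts run out of memory during maintenance windows",
    requirements=["Tolerate the loss of one host per cluster"],
    assumptions=["All hosts in a cluster have identical memory"],
)
decision.check_requirements()
decision.resolve_assumption("All hosts in a cluster have identical memory", holds=True)
```

After the last call, the statement has moved from `decision.assumptions` to `decision.constraints`. A fuller version might also record the evidence that settled each assumption.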
Every design decision will be subject to some number of constraints. A constraint is a limitation or restriction the design decision must abide by. These constraints may be environmental, historical, or due to an existing standard that must be adhered to. Most previous design decisions eventually become constraints on future design decisions.
Provide a detailed description of the design changes, including any logical or physical diagrams to support the decision.
Explain the rationale behind each architecture design decision made above. Describe the technical reasons that demonstrate correct or reasonable decisions have been made.
Think about how the architecture decision(s) made in the document will affect other components within the environment. The implications should identify risks, new requirements, and/or constraints introduced by the design decision. If there are new requirements as a result of the decision(s) made, address those here. In addition, if the decision(s) made will make other previously ratified design decisions obsolete, that too should be noted.
Note any risks that have been introduced as a result of the design decision(s) made above. Provide a description of the risk and how it will be mitigated.
Explain how the new design decision will be addressed by the operational teams. Include documentation on how operational procedures will be modified to ensure the design decision is being adhered to.
Explain how the documented design decision(s) will be implemented in new and existing environments. Which teams will be involved, and what steps will need to be taken to implement the decision properly? The section does not need to include every detail of the implementation, but it should demonstrate the broad strokes and show that thought has been put into the process.
Identify areas where automation is possible, including the tools that can be leveraged to do so (PowerCLI, Ansible, APIs, etc.).
Monitoring & Alerting
Describe any new metrics that need to be monitored, including thresholds for each metric. Document what, if any, configuration checks need to be in place to verify the implementation was done properly and/or the system is operating as expected. In addition, describe the mitigation or action plan when an alert threshold is crossed, identifying any tools required to address issues within the environment.
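One way to make this section concrete is to record each metric, its threshold, and its action plan as data that can be checked automatically. The following is a minimal sketch, not a real monitoring integration: the `MetricRule` class, the `triggered_actions` helper, the metric names, and the threshold values are all hypothetical examples rather than recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRule:
    """Hypothetical rule: one monitored metric, its threshold, and its action plan."""
    metric: str
    threshold: float
    action: str


def triggered_actions(rules, readings):
    """Return (metric, action) pairs for every reading above its threshold."""
    actions = []
    for rule in rules:
        value = readings.get(rule.metric)
        # A missing reading is skipped here; a real check might alert on it instead.
        if value is not None and value > rule.threshold:
            actions.append((rule.metric, rule.action))
    return actions


# Illustrative rules and readings only.
rules = [
    MetricRule("datastore_used_pct", 80.0, "expand the datastore or migrate VMs"),
    MetricRule("host_cpu_ready_pct", 5.0, "rebalance workloads across the cluster"),
]
readings = {"datastore_used_pct": 86.5, "host_cpu_ready_pct": 2.1}
print(triggered_actions(rules, readings))
```

With these sample readings, only the datastore rule is triggered. Keeping the action plan next to the threshold means the documented response travels with the alert definition.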
Discuss the alternatives that were considered in place of the decisions made in the document. Document them clearly so that readers can easily understand not only which other options were considered, but also why they were not chosen. Provide details on how the decisions made are a better choice than the alternatives, based on the requirements, constraints, and business needs.
If an alternative should be reconsidered in the future, note what requirements or business use-cases would need to change to invoke reconsideration.
Note any reference materials, official documentation, blog articles, etc. that were leveraged within the document to provide additional context.