Architect is the third step of Continuous Exploration in the SAFe DevOps Health Radar. In this video, I explain how we define the minimal architecture needed to prove a hypothesis and enable continuous delivery of value to customers.
Where Architect Fits in the Pipeline
In the previous steps, we gathered ideas from the business and customer side and transformed them into epics with hypothesis statements (Hypothesize). We then analyzed real customer needs, conducted market research, and updated the business model (Collaborate & Research). Now in the Architect step, we take these epics and define the minimal architecture needed to prove the hypothesis.
Why Minimal Architecture?
The purpose of this step is not to design a complete, final architecture. We want to define just enough architecture to enable continuous delivery of value. When a hypothesis turns out to be true, we want to be ready to deliver value continuously, without having to stop and build a continuous delivery pipeline or fix accumulated technical debt first.
Separation of Deployment and Release
One of the fundamental concepts in DevOps is the separation of deployment and release. A deployment is the act of bringing compiled code into production with the new feature hidden behind a feature toggle that is switched off. A release is the act of switching that feature toggle on in production, which makes the feature visible to users. By separating these two activities, we can deploy code into production continuously without exposing unfinished features, which is a key enabler for continuously delivering value to our customers.
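The separation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FeatureToggles` class, the `loyalty_discount` toggle name, and the `checkout_total` function are hypothetical, and a real system would usually read toggle state from a configuration service instead of memory.

```python
class FeatureToggles:
    """Hypothetical in-memory toggle store (assumption for illustration)."""

    def __init__(self):
        self._enabled = {}

    def is_enabled(self, name):
        # Unknown toggles default to off, so newly deployed code stays dark.
        return self._enabled.get(name, False)

    def release(self, name):
        # Releasing is a runtime switch, not a new deployment.
        self._enabled[name] = True

    def switch_off(self, name):
        self._enabled[name] = False


def checkout_total(prices, toggles):
    total = sum(prices)
    # The new code path is already deployed but only runs once released.
    if toggles.is_enabled("loyalty_discount"):
        total = round(total * 0.9, 2)
    return total


toggles = FeatureToggles()
before_release = checkout_total([10.0, 20.0], toggles)  # deployed, not released
toggles.release("loyalty_discount")
after_release = checkout_total([10.0, 20.0], toggles)   # released
```

The same code is in production in both calls. Only the toggle state differs, which is what lets deployment happen continuously while release remains a separate decision.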
Architect for Testability
As architects and developers, we need to design our systems so they can be tested properly. This means:
- Having automated tests and suitable, well-maintained test data
- Building loosely coupled architectures with small components or services that each serve a single purpose
- Testing through the API layer, which provides stable and reliable test interfaces
- Using the microservice architectural style where small, independent services run in separate containers
Loosely coupled systems are not only easier to test but also easier to change, deploy, and scale independently.
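As a minimal sketch of testing through the API layer, the example below tests a small single-purpose service only through its request and response contract, with an in-memory stand-in for the database. All names here (`AvailabilityService`, `InMemoryStockRepository`, the `sku` and `quantity` fields) are hypothetical and chosen for illustration.

```python
class InMemoryStockRepository:
    """Test double standing in for a real database, loaded with known test data."""

    def __init__(self, stock):
        self._stock = dict(stock)

    def quantity(self, sku):
        return self._stock.get(sku, 0)


class AvailabilityService:
    """Single-purpose service: answers whether an item can be ordered."""

    def __init__(self, repository):
        # The repository is injected, which keeps the service loosely coupled.
        self._repository = repository

    def handle(self, request):
        # The request and response dictionaries play the role of the API contract.
        sku = request["sku"]
        wanted = request["quantity"]
        available = self._repository.quantity(sku) >= wanted
        return {"sku": sku, "available": available}


def test_availability_api():
    service = AvailabilityService(InMemoryStockRepository({"A-1": 5}))
    assert service.handle({"sku": "A-1", "quantity": 3}) == {"sku": "A-1", "available": True}
    assert service.handle({"sku": "A-1", "quantity": 9}) == {"sku": "A-1", "available": False}
    assert service.handle({"sku": "B-2", "quantity": 1}) == {"sku": "B-2", "available": False}


test_availability_api()
```

Because the test only touches the API contract, the internals of the service can change without breaking the test, as long as the contract stays stable.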
Architect for Releasability
In a large system, you typically have multiple components, services, or microservices. Each of these should have its own deployment and release cycle. For example, one component might release every two weeks, another only on demand for security patches, and a third roughly every month. The overall system might have a quarterly release cycle.
By designing for independent deployment and release cycles, teams can deliver value at their own pace without being blocked by other teams or components.
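To make the idea of independent cadences concrete, here is a small sketch that models each component with its own release cadence. The component names, dates, and the `due_for_release` helper are assumptions for illustration and not part of SAFe.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Component:
    name: str
    last_release: date
    cadence_days: Optional[int]  # None means "release on demand only"


def due_for_release(component, today):
    """Return True when a component's own cadence says a release is due."""
    if component.cadence_days is None:
        return False  # on-demand components are released by explicit decision
    return today >= component.last_release + timedelta(days=component.cadence_days)


components = [
    Component("web-frontend", date(2024, 1, 1), 14),    # every two weeks
    Component("auth-service", date(2024, 1, 1), None),  # on demand, e.g. security patches
    Component("reporting", date(2024, 1, 1), 30),       # roughly monthly
]

today = date(2024, 1, 20)
due_today = [c.name for c in components if due_for_release(c, today)]
```

Each component is evaluated on its own, so one component being due for release never forces a release of the others.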
Architect for Operations
When architecting a system, we need to consider the operational needs from the start. This means thinking about how the system will be monitored and operated in production without logging into the production server directly. To achieve this, we need:
- A good telemetry and logging system that extracts all relevant data from the production environment
- Business data in addition to application data, because we need to prove our hypothesis and measure the value we deliver
- Feature toggles that allow us to quickly switch off problematic features, which massively improves operability
Feature toggles also enable us to switch on features for a subset of users first, so we can observe how the system behaves in production before a full rollout.
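One common way to switch a feature on for a subset of users is to hash each user into a stable bucket. The sketch below assumes this approach and combines it with a structured telemetry event that carries business data. The function names, the `new_checkout` feature, and the `order_value` field are hypothetical.

```python
import hashlib
import json


def in_rollout(feature, user_id, percentage):
    """Deterministically place a user in one of 100 buckets for this feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    # percentage=0 switches the feature off for everyone, 100 on for everyone.
    return bucket < percentage


def telemetry_event(feature, user_id, enabled, order_value):
    """Build a structured log line with technical and business data."""
    return json.dumps(
        {
            "feature": feature,
            "user_id": user_id,
            "enabled": enabled,
            "order_value": order_value,  # business data to help evaluate the hypothesis
        },
        sort_keys=True,
    )


users = [f"user-{n}" for n in range(1000)]
exposed = [u for u in users if in_rollout("new_checkout", u, 10)]
event = telemetry_event(
    "new_checkout", "user-1", in_rollout("new_checkout", "user-1", 10), 49.90
)
```

Because the bucket is derived from the user ID, a given user keeps the same experience between requests, and raising the percentage only adds users to the exposed group.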
Architect for Fast Recoverability
We need a plan for when something goes wrong in production. This includes:
- A clear incident response plan for recovering from major incidents
- The ability to analyze incidents quickly to identify root causes
- A way to maintain business continuity even during outages
Designing for fast recoverability means the architecture itself supports quick rollback, isolation of failures, and rapid recovery.
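One pattern often used to isolate failures is a circuit breaker, which stops calling a failing dependency and serves a fallback instead. The following is a deliberately simplified sketch under that assumption. The class and function names are hypothetical, and a production breaker would typically also retry automatically after a timeout.

```python
class CircuitBreaker:
    """Simplified breaker: opens after repeated failures, reset manually."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.failure_threshold

    def call(self, operation, fallback):
        if self.is_open:
            # Failure is isolated: the broken dependency is no longer called.
            return fallback()
        try:
            result = operation()
        except Exception:
            self.failures += 1
            return fallback()
        self.failures = 0  # a success clears the failure count
        return result

    def reset(self):
        self.failures = 0


calls = []


def flaky_recommendations():
    calls.append(1)
    raise ConnectionError("recommendation service unavailable")


def default_recommendations():
    return ["bestsellers"]


breaker = CircuitBreaker(failure_threshold=2)
results = [breaker.call(flaky_recommendations, default_recommendations) for _ in range(5)]
```

In this run the failing service is only called twice. After that the breaker is open and the remaining requests get the fallback immediately, so the rest of the system keeps working while the incident is analyzed.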
Security: Threat Modelling
Security must be taken into account during the architecture phase. We apply threat modelling, which involves:
- Identifying all threats that could affect your application
- Analyzing potential attackers who might target your system
- Mapping attack vectors that these attackers could use
- Addressing the security concerns that emerge from this analysis
This applies to both internet-facing applications and internal applications. It is important to involve security experts from your organization, because they know the threats, attackers, and attack vectors best.
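The result of a threat modelling session can be captured as simple structured data so that open security concerns stay visible. The sketch below is one possible shape. The `Threat` fields and the example entries are made up for illustration and do not follow any specific threat modelling method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threat:
    description: str
    attacker: str
    attack_vector: str
    mitigation: Optional[str] = None  # None means the concern is not yet addressed


def unaddressed(threats):
    """Return the threats that still have no mitigation."""
    return [t for t in threats if not t.mitigation]


# Hypothetical entries for an internet-facing and an internal application.
threat_model = [
    Threat(
        description="Stolen session tokens",
        attacker="external attacker",
        attack_vector="cross-site scripting in a comment field",
        mitigation="output encoding and a content security policy",
    ),
    Threat(
        description="Bulk export of customer data",
        attacker="malicious insider",
        attack_vector="over-privileged internal admin API",
    ),
]

open_items = [t.description for t in unaddressed(threat_model)]
```

A list like `open_items` gives the team and the security experts a shared view of which concerns still need an architectural answer.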
What the Architect Step Produces
The output of this step is:
- A solution intent (also called solution blueprint or solution design): an architectural idea for the minimal architecture needed to prove the hypothesis
- A clear set of non-functional requirements (the “-ilities”): availability, usability, reliability, testability, releasability, and others
These non-functional requirements are constraints that apply to every user story in the backlog. They must be tested and verified alongside the functional requirements.
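As a rough sketch of verifying a non-functional requirement alongside a functional one, the check below asserts both the correct result and a response-time budget for the same story. The budget value, the `find_customer` function, and the test data are assumptions for illustration.

```python
import time

# Hypothetical NFR: a lookup must answer within half a second.
RESPONSE_TIME_BUDGET_SECONDS = 0.5


def find_customer(customers, customer_id):
    return customers.get(customer_id)


def check_story(customers):
    """Check the functional result and the response-time constraint together."""
    start = time.perf_counter()
    result = find_customer(customers, 42)
    elapsed = time.perf_counter() - start

    functional_ok = result == "Ada"                      # functional requirement
    nfr_ok = elapsed < RESPONSE_TIME_BUDGET_SECONDS      # non-functional requirement
    return functional_ok, nfr_ok


functional_ok, nfr_ok = check_story({42: "Ada"})
```

The story only counts as done when both checks pass, which is what treating the non-functional requirements as constraints on every story amounts to in practice.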
The Maturity Levels
The SAFe DevOps Health Radar provides a maturity assessment for the Architect step. You rate your team’s effectiveness at architecting for continuous delivery:
- Sit: Architecture is monolithic and fragile. It is difficult to change and involves managing complex dependencies across many components and systems.
- Crawl: Architecture is predominantly monolithic, but some applications and systems are loosely coupled.
- Walk: Architecture is mostly decoupled but does not allow release on demand.
- Run: Architecture is aligned around value delivery with few dependencies across components and systems.
- Fly: Architecture is built for release on demand and operability.
Key Takeaways
- Define only the minimal architecture. Do not over-architect. Build just enough to prove the hypothesis and enable continuous delivery.
- Separate deployment from release. Feature toggles allow you to deploy continuously while controlling when features become visible to users.
- Design for testability. Loosely coupled architectures with API-level testing and containerized microservices make verification straightforward.
- Design for operations from the start. Telemetry, logging, and feature toggles are not afterthoughts. They are architectural decisions.
- Plan for failure. Architect for fast recoverability so that major incidents do not turn into prolonged outages.
- Apply threat modelling early. Identify threats, attackers, and attack vectors during the architecture phase, and involve your security experts.
