Agile 7 – Commentary on “Agile Architecture in the Digital Age”
Copyright Graham Berrisford. One of several hundred papers at http://avancier.website. Last updated 04/04/2019 17:24
This is a supplement to a series of mostly short papers.
2. On agile software development
3. On agile businesses and systems
4. On systems thinking ideas used in agile
5. What is agile architecture?
6. EA in the world of agile architecture
Some
comments made below presume the ideas in Agile papers
5 and 6 are understood.
This paper
is a commentary on “Agile Architecture in the
Digital Age”, a white paper published by The Open Group at www.opengroup.org.
The quotes in blue below are from that paper.
Executive Summary
The effectiveness of agile processes is too often
jeopardized because the architecture and organizational pre-requisites of
agility are neglected.
Since the 1990s, development teams
have expressed frustration at things that prevent them
using agile methods.
Some frustrations are reasonable,
others are naive.
The fact is,
some agile processes do not suit some development projects.
Projects should be assessed for
suitability before they start (we have a nine-point score chart).
This White Paper proposes a new Architecture Framework (AAF), that meets the needs of the digital enterprise.
It is debatable whether the white
paper is about “agile architecture”, or what that means.
It is certainly about design patterns
that facilitate the “scaling up” of agile software development, particularly in
the business context of a digital enterprise.
It [the white paper] develops a vision that combines in a
unique manner:
• Methods for decomposing the
system, and the organization that designs it, into loosely-coupled services and
autonomous teams
Here
“services” are application components or subsystems (not discretely requestable behaviours as in TOGAF and ArchiMate).
These
subsystems are loosely-coupled so they can work relatively
independently.
But if teams
and subsystems were wholly autonomous, there would be no enterprise
architecture at all.
They must be
integrated to support the wider data integrity and reporting requirements of a
business.
And to
support the longer, higher level, business processes of the enterprise.
• Alignment mechanisms rooted in
business strategy that promote a shared culture
that becomes
the glue that keeps empowered organizations from falling apart
The white
paper draws on some questionable socio-cultural thinking and analogies.
Intervening
in a disorderly (aka complex) situation is one thing; building an orderly (aka
complex) system is another.
• Architecture patterns that
leverage the latest software innovations
in
distributed computing, autonomous systems, data streaming, and artificial
intelligence
Some/many of
the principles and patterns are decades old.
• Validated learnings
from very large enterprises that have started their agile-at-scale journey a
few years ago
Most learnings come from FANGS. See
discussion of FANGS above.
Introduction
We observe that current architecture practices and skills
come under scrutiny because they are typically anti-patterns of a lean and
agile culture.
They are too often perceived to stand in the way of
iterative development, Minimum Viable Products (MVPs), and collaboration.
Some of what follows seems a reaction
against rigid top-down EA practices of (for example) large banks.
But some of those practices may be
needed, e.g. to mitigate the risk of failing to process money properly.
And most of the organisations whose
architects attend our training courses have never had such rigid EA practices.
The Architect today will pull on years of tradition, but
will have to operate in a different way,
creating new
artifacts, learning new skills, and working as members of cross-functional
teams.
Architecture needs to create usable assets that resonate
with engineering and operations teams.
This White Paper aims to address these issues and start to
lay out the new architecture framework that the digital enterprise needs.
A recent McKinsey survey 1 shows that organizational agility
is on the rise: “the need for companies to demonstrate agility is top of
mind”.
Some agile development gurus side with
sociologists in recommending new business organisation structures.
However, the organisation structure is normally the responsibility of
business directors and senior managers.
It is unclear how far enterprise, solution and software architects can
influence this.
In any case, most of the guidance in the white paper is written from a
software development organisation perspective.
Much appears to presume bespoke code rather than COTS packages.
An increasing number of large firms are deploying
agile-at-scale frameworks, such as:
- the Scaled Agile Framework® (SAFe®),2
- Large-Scale Scrum (LeSS™),3 or
- the Spotify® Model.4
Agile software development methods
were initially written for use by a single team.
Agile architecture frameworks usually
include a selection of regular agile software development principles.
What is meant by architecture, and
agile architecture, is not always clear.
Does agile mean the architecture is
flexible? Or one architecture will facilitate all
agile software development?
And what does scaling up mean?
It sometimes means scaling up a system
to handle extreme volumes of transactions/operations.
But it usually means widening the
scope of agile methods to several teams working on related systems/subsystems.
Read this paper for some commentary on SAFe.
A paper published in IEEE Software 5 claims that process
improvement alone cannot fix the root causes of poor agility:
“Agile practitioners have focused intensely on improving
software development processes and not so much on technical health …
We’ve worked with several large organizations in which the
application of lean principles produced underwhelming results …
This is because velocity measurement, planning poker,
attacking defect backlogs, Kanban cards, pair programming,
or
sprint-based planning do little to attack the root cause of problems that are
inherently structural.”
Experiences from the field confirm the aforementioned: root
causes are not limited to technical health,
they also
originate in the organization and culture of the enterprise.
The verbatims below illustrates
this:
The quotes below express frustration
with EA practices and deprecate them.
Some is frustration at what seems bad
EA practice.
Some suggests agile teams can or
should be released from oversight and governance.
Some might be read to undermine the
whole idea of EA.
• “Legacy methods tend to slow
down the initial sprint … Subsequent sprints often change architecture models
defined in previous sprints.”
Even agile
gurus presume at least a little design up front.
Good EAs
don’t prescribe software architecture down to a level likely to be volatile.
They do
accept the need to maintain their higher level models in line with
changes.
• “If we’re going to have to do
a heavy architecture which plans for a year or two or five years into the
future
on every
one of those experiments, we’re in serious trouble. We cannot be agile.”
Good EAs do
not preclude innovative and experimental projects.
They give
waivers from enterprise-wide standards for those purposes.
• “... We want to be able to
put in the smallest, simplest, minimum viable experiment, prove an assumption,
beef it up
if we want to or follow it wherever it goes, pivot and follow it wherever it
goes.
That means our entire architecture
is going to be emergent based on where we want to go.”
“Our entire
architecture” appears to mean the architecture of a single application, rather
than enterprise architecture.
“... Ideally that pool of seniors
in your team act as a kind of proxy architecture committee,
and we don’t
have to go to someone who’s supposedly got the title sitting in an ivory tower,
and has
never actually built that thing in the last ten years because they’ve been
thinking high-level.”
That is
certainly reasonable for low level software architecture decisions.
EA is
supposed to be about core business processes and application portfolio-level
management.
• “At an application level I
think that those architects are a waste of time. I really don’t think they know
what they’re talking about nowadays.”
There ought
to be a two-way collaboration, educating the EA if need be.
Development
teams should explain software architecture decisions, at least those that have
an impact at the application portfolio-level.
Good EA has
to be pragmatic and flexible – principles are guidelines to be used by
experienced architects – not rules.
It requires
collaborative behaviour towards a target that may be continually
re-factored.
• “Though the architecture
discipline is needed, the architect’s role as a squad member is ill-defined …
Squad architects must attend all
agile ceremonies, otherwise they are at risk of
becoming marginalized.”
Using role
titles as in our training courses, it is probably solution architects (rather
than EAs) who ought to attend agile ceremonies.
If decisions
made there have an impact on EA road maps, then somebody in the team should
report that.
It is our firm belief that architecture (the thing) and
architecting (the verb) cannot be treated separately.
Put differently, in an agile context technical health and
the process dimension go hand-in-hand.
Generally, an
architecture is a product of architecting processes.
Here, what is the architecture? The software code? Higher-level design
documentation? A set of general principles and
patterns?
Enterprises too often have neglected the architecture,
organizational, and cultural pre-requisites of agility
(as the primary focus of agile
transformations has been the process dimension.)
This is written as though enabling
agile software development is the enterprise goal.
The truth is – in some projects, the
scope for agile development is limited.
And giving software development teams
their head can create the problems EA was invented to solve.
Generally, we see that:
• Enterprise Architects should
(re)focus their attention on modularizing monolithic systems, because it is the
number one pre-condition for agility
Have
enterprise architects stopped discussing this with solution/software architects?
• Existing architecture practices
and roles need to evolve to remain relevant in an organization that adopts
agile ways of working
Yes to some
extent, though TOGAF (e.g.) is already so iterative and flexible it can be
applied in an agile way.
• The body of knowledge of architects
needs to be completed to meet the needs of the digital enterprise.
Does your
enterprise have the needs of a digital enterprise? See discussion of FANGS
above.
• Classical architecture
governance models lose relevance when shifting from large programs toward
multiple autonomous teams.
Perhaps,
though the teams in a program (given they share a goal and don’t want to
duplicate each other) cannot be wholly autonomous.
This White Paper formulates a vision that has the ambition
of solving these problems.
It is based on the diagnostic below:
• When teams are not autonomous
enough, it slows down continuous delivery which limits agility
Yes, though
autonomous enough varies according to the business need for system
integration and integrity.
• To avoid chaos, team autonomy
must be balanced by alignment mechanisms that cannot rely on a
command-and-control culture that otherwise would get in the way of autonomy.
Yes, though
the top-down command and control view of EA is out of date.
• New software architecture
patterns deeply influence the evolution of Enterprise Architecture.
To be
discussed.
• The digital enterprise needs a
new architecture body of knowledge, new processes, and governance practices;
architecture roles need to be redefined.
Is your
enterprise a digital enterprise? See discussion of FANGS above.
Autonomy and Loose-Coupling
The 2017 State of DevOps Report 6
measured coupling between services and components by capturing whether:
• Respondents could do testing without
requiring an integrated environment
• Applications and services could be
deployed or released independently of other applications and services on which
they depend
They discovered that high-performing teams were more likely
to have loosely-coupled architectures than medium and low-performing teams.
That seems self-evident.
Obviously, the more isolated a system
is, the easier the system is to develop and maintain.
The more inter-system coupling there
is, the harder the systems are to develop and maintain.
One question is whether the coupling
is avoidable or not – a distinction not drawn in the white paper.
Another is whether the coupling is
logical or physical – a distinction not drawn in the white paper.
Our courses cover a dozen ways to be
tightly or loosely coupled – which are significant here?
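To make the first measure concrete, here is a minimal sketch of one way a component can be tested without an integrated environment: it depends on an interface rather than on a live remote system. The names PriceCalculator and FixedTaxRates, and the tax rule itself, are assumptions invented for illustration; they do not come from the white paper or the DevOps report.

```python
# Hypothetical example: all names and the tax rule are illustrative assumptions.

class FixedTaxRates:
    """Stand-in for a remote tax-rate system, so a test needs no integrated environment."""

    def __init__(self, rates):
        self._rates = rates  # e.g. {"GB": 20} meaning 20 percent

    def rate_for(self, country):
        return self._rates[country]


class PriceCalculator:
    """Depends only on an object offering rate_for(), not on a particular remote system."""

    def __init__(self, tax_rates):
        self._tax_rates = tax_rates

    def gross_price(self, net_cents, country):
        rate_percent = self._tax_rates.rate_for(country)
        # Integer arithmetic in cents avoids floating-point rounding.
        return net_cents + net_cents * rate_percent // 100


calculator = PriceCalculator(FixedTaxRates({"GB": 20}))
gross = calculator.gross_price(1000, "GB")
```

Note that this removes a physical dependency at test time only. The calculator remains logically dependent on the tax rates being correct in production.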
The 2017 DevOps report confirmed
this and verified two new hypotheses:
• Teams that can decide which tools
they use do better at continuous delivery – contrast [only tools] mandated by a
central group
• In teams with strong IT and
organizational performance, the architecture of the system is designed so
delivery teams can test, deploy, and change their systems without depending on
other teams for additional work, resources, or approvals, and with less back-and-forth
communication.
Obviously, for productivity, software
teams and subsystems should be relatively decoupled/autonomous.
But physical decoupling is not logical
decoupling, and good EA should optimise the degree of decoupling.
Development team organisation
Academic research by MacCormack et al.8 demonstrates that a relationship exists between
the structure
of an organization and the design of the products that the organization
produces.
A natural experiment shows that loosely-coupled
organizations develop more modular designs than tightly-coupled organizations.
Obviously so; but the optimal degree
of coupling is determined by the business context.
The key takeaway is that to architect a loosely-coupled
system, it is important to pay attention to the organization that will produce
it.
Because the two are congruent, the reverse Conway law 9
suggests that the design of the architecture should influence the design of the
organization.
That relationship being established, we will explore how to
decompose a software system and the organization that will produce it.
Good EA recognises the software dev/ops
organisation structure should reflect the software system structure - and it
usually does.
Modularisation into software layers
Traditionally, architecture has focused on layering software
systems based on technology concerns such as data access, business logic,
application logic, or presentation logic.
Actually, a three-layer
software architecture preceded the introduction of two/three/four client-server
technology tiers.
(I know
because I was teaching it in 1979.)
It was introduced for logical reasons,
to separate concerns.
First, to separate the processing of
user interface (UI) data structures from the processing of persistent data
store structures.
Given such a two-layer
architecture, data-centric business rules are better applied in the data store
layer than in the UI layer.
And for various reasons, those
business rules may even better be separated out into a layer of code sitting
above the data store.
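The logical separation described above can be sketched as follows. This is a minimal illustration only: the class names, the account example and the withdrawal rule are assumptions made up for the purpose, and an in-memory dictionary stands in for a real data store.

```python
# Hypothetical three-layer sketch: names and rules are illustrative assumptions.

class AccountStore:
    """Data store layer: holds and retrieves persistent data, nothing more."""

    def __init__(self):
        self._balances = {}

    def get_balance(self, account_id):
        return self._balances.get(account_id, 0)

    def set_balance(self, account_id, amount):
        self._balances[account_id] = amount


class AccountRules:
    """Business rules layer: sits above the data store and enforces data-centric rules."""

    def __init__(self, store):
        self._store = store

    def withdraw(self, account_id, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        balance = self._store.get_balance(account_id)
        if amount > balance:
            raise ValueError("insufficient funds")
        self._store.set_balance(account_id, balance - amount)
        return balance - amount


def handle_withdraw_form(rules, form):
    """UI layer: parses the UI data structure and formats a reply; it holds no business rules."""
    try:
        new_balance = rules.withdraw(form["account"], int(form["amount"]))
        return f"OK: new balance {new_balance}"
    except ValueError as err:
        return f"Rejected: {err}"


store = AccountStore()
store.set_balance("A1", 100)
rules = AccountRules(store)
reply = handle_withdraw_form(rules, {"account": "A1", "amount": "30"})
```

Because the rule lives in one layer, a second UI (say, a batch file reader) could reuse AccountRules unchanged.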
The main benefits are:
• Standardization which could limit
technology risks and leverage economies of skills
• A form of modularity because
changes in one layer do not impact layers below
Layering has created a generation of architects motivated by
learning technologies to increase their market value, and less interested in
learning the domain.
The proposal is that software teams
should develop vertical slices of an application rather than horizontal layers.
OK, though that doesn’t rule out some
team members wanting to specialise in either
client-side or server-side programming.
In his seminal book,10 Eric Evans
claims that the domain is the main source of software complexity.
In
2002, Martin Fowler observed that Domain-Driven Design is difficult to learn
and best reserved for complex systems with “rich domain models”.
And in January 2017, Wikipedia said: “Microsoft recommends that [domain-driven design]
be applied only to complex domains.”
(Where
I guess rich probably implies substantial use of inheritance.)
Some domains are complex; some are not.
Creating a complex domain model creates a second kind of
complexity.
That is, the structure clash between domain layers and data
store layers.
He has developed a method, Domain-Driven Design (DDD), to address
it.
The white paper goes on to promote DDD,
though perhaps only at a superficial level.
Domain-Driven Design
Domain-Driven Design (DDD) decomposes the domain into
sub-domains and contexts.
If done well, the resulting domain architecture defines a set
of loosely-coupled services [think
subsystems].
DDD focuses the attention on the vertical decomposition of
the system.
Before DDD, there was
business-component-based design.
Systems were partitioned (aka
vertically decomposed) by dividing a persistent data structure so as to
separate loosely-coupled “kernel entities”.
The system is decomposed into services that [each]
encapsulate a set of homogenous [cohesive?] capabilities.
This usually means the service is a
subsystem that acts on a cohesive subset of the persistent data structure.
The same idea underpinned
business-component-based design before DDD, and today, it underpins microservices.
The modularity rule applies, cohesion is high within a
service, inter-service coupling is low, and implementation details are hidden
behind APIs.
Good EA recognises this rule was
advanced by Larry Constantine in 1968.
It also recognises that the rule
applies with different force at different levels of service (subsystem)
granularity.
Each service owns its persistence mechanisms and exposes its
functions and features through well-defined interfaces.
Inter-service communication occurs through synchronous or
asynchronous interactions.
When communication is asynchronous, messages or events link
services through protocols such as the publish-and-subscribe one.
Note: inter-component communication
can be synchronous.
Pub-sub and event-driven messaging are
not mandatory.
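The difference between the two interaction styles can be sketched as follows. This is an illustration under stated assumptions: StockService and EventBus are hypothetical names, and the in-memory queue merely imitates what a real message broker would do between separately deployed services.

```python
from collections import defaultdict, deque

# Hypothetical sketch: StockService and EventBus are illustrative names only.

class StockService:
    """Synchronous style: the caller waits for the reply before continuing."""

    def __init__(self):
        self.reserved = []

    def reserve(self, order_id):
        self.reserved.append(order_id)
        return True


class EventBus:
    """Asynchronous style: a minimal in-memory publish-and-subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # The publisher returns at once; subscribers are notified later.
        self._queue.append((topic, payload))

    def deliver_pending(self):
        """Deliver queued events to subscribers; returns the number of deliveries made."""
        delivered = 0
        while self._queue:
            topic, payload = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(payload)
                delivered += 1
        return delivered


stock = StockService()
sync_reply = stock.reserve("order-1")      # reply is available immediately

bus = EventBus()
notified = []
bus.subscribe("order_placed", notified.append)
bus.publish("order_placed", "order-2")
before_delivery = list(notified)           # the subscriber has not yet seen the event
deliveries = bus.deliver_pending()
```

Either style may be the right choice. The synchronous caller knows the outcome at once; the asynchronous publisher is decoupled in time from its subscribers, but must tolerate not knowing when, or whether, they have acted.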
No communication is allowed through database sharing, shared
libraries, or other mechanisms.
The choice of tools and development stack is not
constrained, which has two upsides:
• Innovation is not slowed down by
technology standards that are likely to become obsolete over time
• Teams that can decide which tools
they use do better at continuous delivery11
Note: no shared libraries. Does this
mean no common subroutines?
A domain-oriented microservice
can be seen as a subroutine.
The various clients of that subroutine
do not communicate via it directly.
But they may communicate by means of data
maintained in its supposedly private data store.
The filtering of this communication
via a higher level API does not remove the logical coupling.
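The point can be illustrated with a small sketch. All names here (CustomerService, RiskClient, OrderClient) and the credit-limit rule are assumptions invented for the example. The two clients never call each other, and neither touches the data store directly, yet one changes the behaviour of the other.

```python
# Hypothetical sketch of logical coupling through a "private" data store.

class CustomerService:
    """A domain-oriented service that owns its data and exposes it only via an API."""

    def __init__(self):
        self._credit_limits = {}  # private: reachable only through the methods below

    def set_credit_limit(self, customer_id, limit):
        self._credit_limits[customer_id] = limit

    def get_credit_limit(self, customer_id):
        return self._credit_limits.get(customer_id, 0)


class RiskClient:
    """One client: writes data via the API."""

    def __init__(self, customers):
        self._customers = customers

    def downgrade(self, customer_id):
        self._customers.set_credit_limit(customer_id, 0)


class OrderClient:
    """Another client: its decisions depend on what RiskClient last wrote."""

    def __init__(self, customers):
        self._customers = customers

    def can_order(self, customer_id, order_value):
        return order_value <= self._customers.get_credit_limit(customer_id)


customers = CustomerService()
customers.set_credit_limit("c1", 500)
orders = OrderClient(customers)
risk = RiskClient(customers)

allowed_before = orders.can_order("c1", 200)
risk.downgrade("c1")
allowed_after = orders.can_order("c1", 200)
```

The API hides the physical storage, but OrderClient still depends on the meaning and timing of what RiskClient writes. That dependency is the logical coupling.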
Services can be tested and deployed in isolation and are
easily containerized, which helps speed continuous deployment.
OK.
Decomposing a domain requires deep domain knowledge to avoid
designing services that expose APIs prone to abstraction leaks.
The level of granularity of services can vary.
When services are responsible for a single capability, they
are referred to as microservices.
The trouble is that “capabilities” are
just as composable and decomposable as “services” – so the rule is vague to the
point of vacuity.
Choosing the right level of responsibility for each service
– its scope - is one of the most difficult challenges.”12
Yes indeed.
Anecdotes suggest that many teams (2014-2018)
have decomposed applications into microservices that
are too fine-grained.
This slows down processing, and pushes
complexity into the messaging and middleware.
DDD is now a well-adopted approach to help decompose a
system into modular parts.
“Many microservice adopters have
turned to Eric Evans’ “Domain-Driven Design” (DDD) approach for a
well-established set of processes and practices
that
facilitate effective, business-context–friendly modularization of large complex
systems.”13
This discussion of DDD ought to
distinguish its application at higher and lower levels.
The first is of interest to the
enterprise or solution architect.
The second has been deprecated by
Microsoft as too complex for most applications.
For all but complex applications, even
Martin Fowler recommends “Transaction Scripts” instead.
However, even a good method cannot replace domain expertise,
therefore Martin Fowler14
advises to start with a monolithic implementation and refactor
it into microservices when the domain is better
understood.
“Eventually the team merged the services back into one
monolithic system, giving them time to better understand where the boundaries
should exist.
A year later, the team was then able to split the monolithic
system apart into microservices, whose boundaries
proved to be much more stable.”15
Toward Modular and Empowered
Organizations
Let us now look at how enterprises alter their operating
models and organizational structures to become more agile and how it is
congruent with the evolution of software systems toward modularity.
See the notes in the first section
above on the business organisation and on the business change organisation.
The white paper is mostly about the
latter, the organisation that is responsible for changing business systems – by
software development.
The traditional way of steering change relies on programs
and projects staffed from shared pools of resources.
In the new operating model, the focus is shifting toward
stable teams with dedicated resources that are responsible for designing,
building, and running products or services.
The process is managed by strong product owners, often from
the business, who work closely with IT at all stages of the product lifecycle.
All roles are integrated within self-organizing feature
teams or squads sometime regrouped into tribes.16 (Spotify practices)
The project manager role is shifting toward an agile coach
role and line managers focus on capability building.
Remember the Spotify model is based on
an enterprise whose business is its software product.
In other businesses, a difficulty with
the new operating model is balancing supply and demand.
The demand for software change varies
over time in one business unit and between business units.
Where there is little or no demand for
change, how to maintain a dev/ops team?
A Practical Example
The example business is based on a
single monolithic ecommerce application.
OK, but EA practices are designed for
businesses that already have (say) 500 to 5,000 business applications.
We will illustrate this with an ecommerce enterprise whose
business model is based on sales that last for a few days and are announced
only 24 hours in advance.
Brands only appear twice a year … By frustrating demand and
putting on time constraints, it aims to create desire and impulsive buying.
<snip>
The enterprise and the software system are being decomposed
into 50+ products managed by autonomous teams;
for example,
a payments team and a logistic team whose missions are respectively to create
and run the best payments or logistics products.
The current software system which is too inflexible and monolithic
is being re-built in an incremental manner using the microservices
architecture style.
OK. Sounds a reasonable strategy if
the “products” are based on relatively discrete data stores.
See this Microservices paper for discussion of design tradeoffs.
Balancing autonomy with alignment
Much of this section is socio-cultural
systems thinking about business organisations.
The
white paper blurs the distinction between the business organisation and the
business change organisation (EA and software development),
partly because its references are to giant
application-centric businesses (FANGS).
The white paper draws on some questionable
socio-cultural analysis and analogies.
Neither
a free market economy nor an army is about constructing, maintaining and
extending a complex orderly system.
Managing or intervening in a
disorderly situation is one thing; building an orderly system is another.
“Ensure that when the bottom spoke the top listened – was
one of the challenges we would eventually have to overcome …
Order can emerge from the bottom up, as opposed to being
directed, with a plan, from the top down.”17
It can
indeed, but the general did not disband the army’s
management structure or make his own role redundant.
The separation of decision-making from work characterizes
command-and-control thinking.
It keeps managers out of touch of their operations.
A central tenet of this thinking is management by numbers
which helps create a simplified and abstracted view of reality.
Good EA works in cooperation with
solution architects working on projects.
Shortcomings of Command-and-Control
Setting up a straw man?
This section deprecates a strict
top-down command and control style that I have not seen for a decade or more.
<snip>
Command-and-control thinking is not an effective way of
aligning autonomous teams because top-down flawed decisions are likely to clash
with autonomous teams.
For example, the Spotify engineering culture is
waste-repellent: if it works keep it, otherwise dump it.
At Spotify they skip or dump handoffs, useless meetings, and
corporate nonsense.18
In contrast, agile organizations align work with a
meaningful purpose … The few on the top provide clear vision, priorities, and
missions.
Transparency gives a team access to the information and
context it needs to make good decisions.
Well-informed teams are given empowerment and trust.
Access to privileged information is no longer a power source
that middle managers leverage to impose their will upon their teams.
Remember, Spotify is one of the FANGS.
Changing the Organizational Model
Shooting at a straw man?
This section is an academic’s sales
pitch for delegation of authority.
The shift from command-and-control to agility requires a
culture change.
In his book “Reinventing Organizations”19 Frédéric
Laloux develops a taxonomy
of organizational models.
The author observes that most modern global corporations are
the embodiment of, what he calls, the Orange
Organization type where the hierarchical structure dominates.
Virtual teams, cross-functional initiatives, and expert
staff functions foster the innovative responsiveness that is needed to beat the
competition.
The next stage of evolution, the Green Organization, retains
the meritocratic hierarchical structure of Orange but pushes most of decisions
down to frontline workers.
In Green Organizations “a strong, shared culture is the
glue that keeps empowered organizations from falling apart.
Frontline employees are trusted to make the right decisions
because they are guided by a number of shared values, rather than by a thick
book of rules and policies”.
Too many enterprises that deploy agile-at-scale lack the
empowerment, strong culture, and shared values that are pre-conditions to agile
transformation.
Enterprise Architects who are used to operating in Orange
Organizations are often ill-prepared to drive change toward agility.
Architecture governance models that were developed in Orange
Organizations get into the way of agile transformation.
Note the fervour of the revolutionary:
we must sweep away the old to reach a brave new world.
Dual organisation model
This section
repeats another author's suggestion (rather than presenting evidence of its success).
John Kotter has developed a model
that describes how traditional organizational hierarchies can shift toward this
next stage of organizational evolution.
For most companies, the hierarchy is the singular operating
system at the heart of the enterprise.
But the reality is that this system simply is not built for
an environment where change has become the norm.
Kotter advocates
a new system – a second, more agile, network-like structure that operates in
concert with the hierarchy to create what he calls a “dual operating system” –
one that allows companies to capitalize on rapid-fire strategic challenges and
still make their numbers.
“Accelerate”20 (XLR8) vividly illustrates the five core
principles underlying a new network system, the eight accelerators that drive
it, and how leaders must create urgency in others through role models.
Military analogy
The analogy in this section is weak, for
reasons explained below.
Let us now illustrate the magnitude of change that is
required.
General Stanley McChrystal
confronted a nimble and agile enemy.
To cope with this, he had to change the culture and
operating model of an institution that is used to command-and-control thinking:
“We restructured our force from the ground up on principles
of extremely transparent information sharing (what we call “shared
consciousness”) and decentralized decision-making authority (“empowered
execution”) …
We dissolved the barriers – the walls of our silos and the
floors of our hierarchies – that had once made us efficient …
We looked at the behaviors of our
smallest units and found ways to extend them to an organization of thousands,
spread across three continents.
We became what we called “a team of teams”: a large command
that captured at scale the traits of agility normally limited to small teams …”
Reflecting on military history, McChrystal
attributed the battle of Trafalgar’s victory to the organizational culture that
Nelson had crafted.
This culture rewards individual initiative and critical
thinking, as opposed to simple execution of commands.
Such a cultural change implies that leaders should not make
or approve all important decisions:
“The wait for my approval was not resulting in any better
decisions …
I came to realize that, in normal cases, I did not add
tremendous value, so I changed the process …
The risks of acting too slowly were higher than the risks of
letting competent people make judgment calls …
More important, and more surprising, we found that, even as
speed increased and we pushed authority further down, the quality of decisions
actually went up …”
Systems thinkers speak of top-down command, silos and collaborative
teams.
These ideas apply differently in different organisations and situations.
E.g. The US army strives to destroy distributed terrorist targets; the complexity lies in the disorderliness of
the situation.
McChrystal concluded top-down
command was ineffective in disorder.
He encouraged autonomous teams to collaborate around a shared purpose.
By contrast, a business strives to coordinate distributed silo systems;
the complexity lies in the orderliness of
the wider enterprise system.
To paraphrase Jacquelin
Conway:
“It may be said that silo systems help us develop and deploy each
system, thus making that work more efficient.
But dividing the enterprise into
silo systems has a negative impact on the efficiency and effectiveness of
the enterprise as a system.”
For silo systems, you can read “microservices”;
or in an “agile architecture” context you can read “autonomous systems”.
The enterprise wants its end-to-end
business processes to work efficiently and effectively.
This only happens if the supposedly “autonomous”
subsystems are designed to be orchestrated or choreographed together.
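As a rough illustration of orchestration, consider the sketch below. It borrows the payments and logistics teams from the white paper's ecommerce example, but every class, method and rule in it is a hypothetical assumption made up for the purpose.

```python
# Hypothetical sketch: two "autonomous" subsystems orchestrated into one end-to-end process.

class PaymentsService:
    def __init__(self):
        self.charged = {}

    def charge(self, order_id, amount):
        if amount <= 0:
            return False
        self.charged[order_id] = amount
        return True

    def refund(self, order_id):
        self.charged.pop(order_id, None)


class LogisticsService:
    def __init__(self, in_stock):
        self._in_stock = set(in_stock)
        self.shipped = []

    def ship(self, order_id, item):
        if item not in self._in_stock:
            return False
        self.shipped.append(order_id)
        return True


def fulfil_order(payments, logistics, order_id, item, amount):
    """Orchestrator: owns the end-to-end process, including the compensating step."""
    if not payments.charge(order_id, amount):
        return "payment failed"
    if not logistics.ship(order_id, item):
        # Neither subsystem knows about the other; only the orchestrator
        # knows that a failed shipment must undo the payment.
        payments.refund(order_id)
        return "refunded: item unavailable"
    return "fulfilled"


payments = PaymentsService()
logistics = LogisticsService(in_stock=["book"])
first = fulfil_order(payments, logistics, "o1", "book", 1500)
second = fulfil_order(payments, logistics, "o2", "lamp", 900)
```

In a choreographed design the same knowledge would be spread across event handlers in each subsystem. Either way, somebody has to design the end-to-end process; it does not emerge from autonomy alone.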
If we accept the hypothesis that command-and-control is not
an effective alignment approach to steer an agile organization, what is the
alternative?
Leaders at the top need to provide
guidance, feedback, and support to their teams.
They need to lead with purpose, which requires strategic
clarity.
Business Architecture Patterns
Porter
In a seminal paper,21 Michael E.
Porter writes: “The essence of strategy is in the activities – choosing to
perform activities differently or to perform different activities than rivals.”
It all starts with the definition of strategic positions
that can be based on customer needs, customer experience, and/or some product
or service mix.
Strategy is about creating a unique position that involves a
different set of activities that better meet customer needs while delivering
superior experience.
The way these activities are implemented determines costs
(operating model view).
The difference between the price customers are willing to
pay and costs determines profitability (business model view).
You can read a distillation of
Porter’s points related to business process reengineering here https://ebrary.net/18120/management/michael_porter
His value chain consists of all
the activities necessary to produce and sell a product or service.
He says that “While operational effectiveness
is about achieving excellence in individual activities, or functions, strategy
is about combining activities.”
Architecting a business and its corresponding operating
model can no longer follow a waterfall process steered in a “top-down” manner.
The Lean Startup book22 has
popularized an incremental approach that relies on rapid experimentation and
validated learning.
Autonomous teams that are in direct contact with clients are
best equipped to define MVPs that are market-tested during rapid learning
cycles.
Though autonomous teams are free to experiment, they need
guidance.
It is good to “achieve excellence in
individual activities”; however, Porter says “strategy is about combining
activities”.
He also says: “Positions built on
systems of activities are far more sustainable than those built on individual
activities.”
This implies attention to how
supposedly “autonomous” subsystems are optimally coordinated.
The leadership team needs to define a clear vision which can
be translated into a set of missions that are assigned down the organization.
The missions are operationalized
by agile teams that are empowered to challenge them if needed.
The learning process, that lean refers to as catch ball,23 provides a powerful alignment mechanism if conducted
well.
MIT paper
A recent paper from the MIT Center
for Information System Research (CISR) illustrates how a combination of
alignment mechanisms helped Spotify avoid chaos while protecting teams’
autonomy:
• Provide distinct goals and objectives to autonomous teams and align teams without introducing layers of hierarchy
• Set up formal sharing mechanisms that synchronize activities as the number of teams grows
• Define architectural standards that facilitate autonomy by ensuring that individual components are compatible
A new class of technology-enabled business model is
transforming industries, the platform.
It connects people, organizations, and resources in
interactive ecosystems that disrupt incumbents.
Airbnb™, Uber™, Alibaba, or Amazon
Marketplace epitomize this disruptive power.
OK, but most businesses are not
application-centric FANGS.
Traditional business models were
built around products or services which were
designed on one
end of a pipeline and delivered to clients at the other end.
For sure, many business processes run
from end to end.
E.g. the process in a university
admissions business runs from the start of an academic year to its end.
There will always be end-to-end
processes in applying for a role, in a goods supply chain, in a factory.
When platform-based businesses enter markets dominated by
“pipelines”, they enjoy a competitive advantage.
Why? Because pipelines rely on inefficient
gatekeepers to manage the flow of value when platforms promote self-service and
direct interactions between participants.
For sure, the way of the world is that
older businesses are replaced by newer ones.
E.g. we see e-commerce businesses are
putting retail stores out of business.
The internet has enabled customers to
find suppliers without the need for retailers who hold stocks.
The threat to some businesses is that
the internet can shorten the supply chain, and make some middlemen redundant.
The threat to others is that internet
giants operate at a scale that can make low-volume businesses uncompetitive.
What makes agile software development
relevant and useful here?
It becomes relevant when businesses
compete on the basis of the applications their customers use to transact
business.
But isn’t the real threat the market
dominance of FANGS and other internet giants, more than the quality of their
apps?
Frankly, the quality of some internet
giant apps is low (e.g. LinkedIn sometimes posts
messages in the wrong sequence).
A platform can scale and grow more rapidly and efficiently
because the traditional gatekeeper is replaced
by signals
provided by market participants through a platform that acts as a mediator.
Platforms stimulate growth because they expose new supply
and unlock new demand.
They also use big/fast data and analytics capabilities to
create community feedback loops. 24
OK, though how many businesses aim to
compete with FANGS in this self-serve community game?
And will community management software
become commoditised rather than bespoke?
Platforms need governance which consists of a set of rules
concerning who gets to participate in an ecosystem, how to divide the value,
and how to resolve conflicts.
Good governance distributes wealth among those who add value
in a manner that is perceived as fair.
Governance must pay special attention to externalities.
For example, Airbnb suffered from
new rules issued by public authorities wanting to limit externalities such as
its negative impact on apartment rental markets.
OK.
Because technology is a key enabler, we will now review the
software architecture patterns that make digital business models possible.
First, the
internet?
We will also briefly introduce another type of platform that
Michael A. Cusumano designates as “internal
platform”.25
They allow their owners to achieve economic gains by reusing
or redeploying assets across families of products.
“Ensure that when the bottom spoke the top listened – was
one of the challenges we would eventually have to overcome …
Order can emerge from the bottom up, as opposed to being
directed, with a plan, from the top down.”17
Again yes,
but the general did not disband the army’s
management structure or make his own role redundant.
Surely, collaboration between scores
or hundreds of subsystems/teams requires some oversight?
(Note that biological evolution is a poor analogy.
True, complex biological systems have emerged from evolution by chance
– but this is not a viable strategy to extend and integrate business systems.
Because evolution proceeds very slowly, by tiny unnoticeable changes,
and 99.9% of changes in DNA turn out to be harmful.)
Software architecture patterns
“Software is eating the world”, Marc Andreessen.26
The rapid evolution of software technology has fueled the growth of digital business.
Following Internet giants’ lead, some enterprises from the
old economy are framing themselves as tech companies;
for example, Banco Bilbao Vizcaya Argentaria (BBVA): “If you want to be a leading bank, you
have to be a technology company.”27
Internet giants did succeed at retaining the agility of startups while they grow at a fast pace and operate at a
global scale.
They paid special attention to loose-coupling and team
autonomy and they learned how to master distributed computing at scale.
Let’s illustrate this with Amazon and Google®.
Amazon
In 2002, Amazon was facing a complexity barrier. The size of
its home page reached 800 MB and it took 8 to 12 hours to compile.
Jeff Bezos issued a mandate that
profoundly changed the way software is created and the enterprise is organized.
Steve Yegge has reported this in a
post.28
1. “All teams will henceforth expose their data and functionality through service interfaces.
2. Teams must communicate with each other through these interfaces.
3. There will be no other form of interprocess communication allowed: no direct linking, no direct reads of another team's data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
4. It doesn't matter what technology they use. HTTP, CORBA, Pub/Sub, custom protocols – doesn’t matter. Bezos doesn’t care.
5. All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
6. Anyone who doesn't do this will be fired.
7. Thank you; have a nice day!”
By shifting toward modularity and APIs, Amazon became well
positioned to open its distribution and logistics capabilities to third-party
vendors.
The self-service nature of the platform made it easy for
vendors to sell and distribute their products in a frictionless manner. This
helped Amazon compete against eBay leveraging a business model which is
different.
Note that Bezos
allows inter-component communication to be synchronous.
Pub-sub and event-driven messaging are
not mandatory.
They can increase complexity in the messaging
and the sagas needed to apply compensating transactions.
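To make the mandate concrete, here is a minimal sketch in Python of one team consuming another team's data only through a service interface, using a plain synchronous request-reply call.
All names (Order, OrderService, InMemoryOrderService, invoice_text) are hypothetical, and an in-process call stands in for the network call the mandate requires.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class Order:
    order_id: str
    customer_id: str
    total_pence: int


class OrderService(Protocol):
    """The only thing other teams may depend on (hypothetical interface)."""

    def get_order(self, order_id: str) -> Order: ...


class InMemoryOrderService:
    """Owning team's implementation; the dict stands in for its private data store."""

    def __init__(self) -> None:
        self._orders: Dict[str, Order] = {}

    def place_order(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def get_order(self, order_id: str) -> Order:
        return self._orders[order_id]


def invoice_text(orders: OrderService, order_id: str) -> str:
    """Another team's code: a synchronous call through the interface, no direct data store read."""
    order = orders.get_order(order_id)
    return f"Invoice {order.order_id}: {order.total_pence / 100:.2f} GBP"
```

The consuming function never touches the dictionary behind the service, so the owning team can change its storage without breaking callers.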
Google
In 2004, Jeffrey Dean and Sanjay Ghemawat
from Google published a paper29 that described the MapReduce
programming model.
This innovation deeply influenced distributed computing.
MapReduce is based
on a functional style that makes it easy to automatically parallelize and
execute code on large clusters of commodity machines.
It allows developers without any experience with parallel
and distributed computing to easily utilize the resources of a large
distributed system.
It is also highly scalable and resilient to hardware or
network failures.
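The programming model can be illustrated with the usual word-count example.
This is a single-machine sketch in Python, not Google's implementation; the function names are assumptions, and the distribution of map and reduce tasks across a cluster (the hard part) is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents: List[str]) -> Dict[str, int]:
    # Each map_phase call is independent of the others, and so is each
    # reduce_phase call; that independence is what a framework can parallelize.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```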
Two years later, a team from Google published a paper30 that
describes a distributed storage system for managing structured data designed to
scale to a very large size: petabytes of data across thousands of commodity servers.
This system exploits immutability which simplifies
concurrency control: “we do not need any synchronization of accesses to the
file system when reading from SSTables.
As a result, concurrency control over rows can be
implemented very efficiently”.
99% of businesses do not aim to
operate at the scale of FANGS like Google.
EA is about the whole application
portfolio – perhaps 500 to 5,000 applications.
Most of those apps can be supported by
an ordinary DBMS or document store.
Most if not all UK retailers’ web
sites could be supported by a regular DBMS (with SSDs).
The FANGS
The vast amount of data Internet giants are gathering is best exploited using Artificial
Intelligence (AI).
Analytics algorithms help predict customer behavior and recommend products.
Deep learning technology powers new types of applications
that leverage computer vision and natural language processing.
The model Internet
giants follow is based on developing many custom solutions to support
their own products and services.
They document many of these internal solutions in white papers
that later evolve into open source projects.
The vast majority of leading-edge technology is freely available,
including advanced AI libraries such as TensorFlow™.31
Some of the open source projects even include pre-trained
deep learning algorithms that simplify the creation of new AI applications;
for
example, OpenFace:32 “Free and open source face recognition with deep neural
networks …
Please use responsibly! We do not support the use of this
project in applications that violate privacy and security.”
This stream of continuous technology innovation profoundly
impacts the way software that supports the enterprise is architected.
The rise of automation that ultimately results in the
creation of autonomous systems has and will continue to disrupt business and
operating models.
We will now focus on a few key software design patterns that
should be part of the architect’s body of knowledge.
New Rules of Distributed Computing
The design, development, and operation of distributed
systems has always been a difficult endeavour.
In the past, middleware technology based on transaction
monitors and two-phased commit protocols was good enough.33
Today, it cannot anymore meet the scalability and
availability needs of digital operating models.
OK. But how many businesses aim to
operate at the scale and availability of FANGS?
Design for extreme scale is the exception
rather than the norm.
And note that IoT
is still considered important by only a small minority.
Architects need to understand and take advantage of the
paradigm shift toward new distributed computing models that:
• Decompose systems into distributable parts that run concurrently on commodity hardware
• Are horizontally scalable and elastic to varying workloads
• Take full advantage of modern multi-core processors
Surely, only where scaling up is
needed?
Architecting distributed systems is about making trade-offs
between operational complexity, performance, availability, and consistency.
The CAP theorem states that any networked shared-data system
can have at most two of three desirable properties:
• Consistency (C)
• High availability (A)
• Tolerance to network partitions (P)
The trouble with the CAP theorem is that
it omits complexity.
Designing for A and P means
sacrificing C for a while - if not forever.
But if C is important then you have to
design to restore consistency – which adds complexity (see Saga pattern below).
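The point about added complexity can be seen even in a toy model.
The sketch below (Python; the Replica class, the reconcile function and the last-write-wins rule are all illustrative assumptions) keeps two replicas available during a partition, lets them diverge, and then restores consistency afterwards.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Replica:
    # key -> (logical timestamp, value)
    data: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def write(self, key: str, value: str, timestamp: int) -> None:
        """Accept the write locally, even if the other replica is unreachable."""
        self.data[key] = (timestamp, value)

    def read(self, key: str) -> str:
        return self.data[key][1]


def reconcile(a: Replica, b: Replica) -> None:
    """After the partition heals, keep the entry with the highest timestamp."""
    for key in set(a.data) | set(b.data):
        candidates = [r.data[key] for r in (a, b) if key in r.data]
        winner = max(candidates)  # compares timestamp first, then value
        a.data[key] = winner
        b.data[key] = winner
```

Note that last-write-wins silently discards the losing update; deciding whether that is acceptable for a given business entity is exactly the design work that CAP does not mention.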
Splitting a system into distributable parts gives the
ability to scale service capacity, using a larger number of shards to serve
more users.
Replication (data or functionality) in more than one
location is required to recover from failures, thus contributing to the high
availability quality.
Sure, but most business systems
operate at a scale that is a small fraction of the scale of FANGS.
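For readers unfamiliar with sharding, here is a minimal sketch of hash-based routing in Python.
The function name shard_for and the key format are assumptions; real systems usually add replication and a way to rebalance when the shard count changes, which this sketch omits.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard number in the range 0..shard_count-1."""
    # A stable hash is used because Python's built-in hash() of a string
    # varies between interpreter runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Every request for the same key goes to the same shard, so adding shards adds capacity; the cost is that any query spanning many keys must now visit many shards.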
Because traditional concurrent programming is error prone,
new “immutable” programming models are important.
Reasoning about the possible states of complex objects is
difficult.
Reasoning about the state of immutable objects is trivial
because they can only be in one state and can be shared safely:
“writing correct concurrent
programs is primarily about managing access to a shared, mutable state ...
If an object’s state cannot be modified, these risks and
complexities simply go away”.34
Still, the business world moves on
(mutates) and the enterprise must know the current state of the entities it
monitors and directs.
So if objects are immutable, new
objects must be created.
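A minimal Python sketch of that consequence follows, using a frozen dataclass.
The Customer class, its fields and the move_house function are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Customer:
    customer_id: str
    address: str
    version: int = 1


def move_house(customer: Customer, new_address: str) -> Customer:
    """Record a change of address by creating a new object; the old one is untouched."""
    return replace(customer, address=new_address, version=customer.version + 1)
```

The old object can be shared safely between threads, but something must still keep track of which version is the current one, so the mutable state has moved rather than disappeared.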
Functional Programming (FP) treats computation as the
evaluation of mathematical functions.
A pure function is a function which given the same inputs,
always returns the same output, and has no side-effects.
Pure functions are completely independent of outside state
and, as such, they are immune to entire classes of bugs that have to do with
shared mutable state.
Their independent nature also makes them great candidates
for parallel processing across many CPUs, and across entire distributed
computing clusters.
Because of this, the FP paradigm is used to build
large-scale distributed systems and is becoming mainstream.
For example, software tools such as Apache Spark™ or Kafka®
are written in Scala which is a functional language
and languages such as JavaScript or Java® have functional extensions.
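The claim about pure functions and parallel processing can be sketched in Python as follows.
The function names and the 20% rate are illustrative assumptions, and a thread pool stands in here for many CPUs or a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def price_with_vat(net_pence: int, vat_rate_percent: int = 20) -> int:
    """Pure: the result depends only on the arguments and nothing else is changed."""
    return net_pence + (net_pence * vat_rate_percent) // 100


def price_all(net_prices: List[int]) -> List[int]:
    # Because price_with_vat shares no state, the calls can run in any order
    # on any worker; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(price_with_vat, net_prices))
```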
Highly distributed computing models create specific
challenges on the data side.
For example, microservices which
can run in parallel on multiple nodes own their data.
This makes ensuring data consistency a challenge. The Saga
pattern solves this problem.
Read this Microservices paper for discussion of design tradeoffs.
New Data Patterns
Sagas
A Saga is a long-lived transaction 35 that can be
written as a sequence of transactions that can be interleaved.
All transactions in the sequence complete successfully or
compensating transactions are executed to amend a partial execution.
Both the concept of Saga and its implementation are
relatively simple, but they have the potential to improve performance significantly.
Sagas are not well described as a data
pattern; they are workflows that orchestrate transactions on different data
stores.
Again, there are downsides to design
for scalability.
Compensating transactions (and the
workflows needed to implement them) are not simple – they do add complexity.
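A minimal sketch of a saga orchestrator in Python shows the shape of the workflow.
The run_saga function and the Step type are assumptions for illustration; a real implementation must also persist its progress and cope with compensations that themselves fail, which is where much of the complexity lies.

```python
from typing import Callable, List, Tuple

# Each step pairs an action with the compensation that undoes it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run the actions in order; on failure, compensate completed steps in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

For example, if a "reserve stock" step succeeds and a "take payment" step then fails, the orchestrator runs "release stock" and reports failure.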
The commentary on the white paper stops here; its further sections
address:
· NoSQL
· Sharding big data
· Real-time analytics
· Infrastructure as Code
· Architecture Framework (AAF).
References
1 See www.mckinsey.com/business-functions/organization/our-insights/how-to-create-an-agile-organization.
2 See www.scaledagileframework.com.
3 See https://less.works/less/framework/introduction.html.
4 Scaling Agile @ Spotify with Tribes, Squads, Chapters & Guilds, Henrik Kniberg, Anders Ivarsson, October 2012; refer to: https://blog.crisp.se/wp-content/uploads/2012/11/SpotifyScaling.pdf.
5 Modular Architectures Make You Agile in the Long Run, Dan Sturtevant, IEEE Software, Vol. 35, Issue 1, January/February 2018; refer to: https://ieeexplore.ieee.org/paper/8239949/.
6 See www.ipexpoeurope.com/content/download/10069/143970/file/2017-state-of-devops-report.pdf.
7 Analyst Watch: Water-Scrum-fall is the reality of Agile, Dave West, December 2011; refer to: https://sdtimes.com/agile/analyst-watch-water-scrum-fall-is-the-reality-of-agile/.
8 Exploring the Duality between Product and Organizational Architectures: A Test of the “Mirroring” Hypothesis, Alan MacCormack, John Rusnak, Carliss Baldwin, Harvard Business School Working Paper; refer to: www.hbs.edu/faculty/Publication%20Files/08-039_1861e507-1dc1-4602-85b8-90d71559d85b.pdf.
9 See www.agilealliance.org/resources/sessions/the-reverse-conway-organizational-hacking-for-techies.
10 Domain-Driven Design: Tackling Complexity in the Heart of Software, Eric Evans, Addison Wesley, August 2003.
11 Source: 2017 State of DevOps Report
12 Microservices in Action, Morgan Bruce, Paulo A. Pereira, Manning Publications, 2018.
13 Microservice Architecture: Aligning Principles, Practices, and Culture, Irakli Nadareishvili, Ronnie Mitra, Matt McLarty, Mike Amundsen, O'Reilly Media, 2016.
14 See https://martinfowler.com/articles/microservices.html.
15 Building Microservices, Sam Newman, O’Reilly Media, 2015.
16 See http://schd.ws/hosted_files/agilecamppacificnorthwest2017/20/AgileCamp2017%20-%20Jeff%20Nicholls.pdf.
17 Team of Teams: New Rules of Engagement for a Complex World, General Stanley McChrystal, David Silverman, Tantum Collins, Chris Fussell, Portfolio Penguin, 2015.
18 See https://vimeo.com/94950270.
19 Reinventing Organizations: A Guide to Creating Organizations Inspired by the Next Stage of Human Consciousness, Frédéric Laloux, Nelson Parker, 2014.
20 Accelerate: Building Strategic Agility for a Faster-Moving World, John P. Kotter, Harvard Business Review Press, 2014.
21 What is Strategy?, Michael E. Porter, Harvard Business School Press, 1996.
22 The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses, Eric Ries, Random House Audio, 2011.
23 See www.lean.org/lexicon/strategy-deployment.
24 Platform Revolution: How Networked Markets Are Transforming the Economy – and How to Make Them Work for You, Geoffrey G. Parker, Marshall W. Van Alstyne, Sangeet Paul Choudary, W. W. Norton & Company, 2016.
25 Industry Platforms and Ecosystem Innovation, A. Gawer, M. Cusumano, Journal of Product Innovation Management, 2013.
26 See www.wsj.com/articles/SB10001424053111903480904576512250915629460.
27 See www.bbva.com/en/want-leading-bank-technology-company.
28 See http://homepages.dcc.ufmg.br/~mtov/pmcc/modularization.pdf and https://plus.google.com/+RipRowan/posts/eVeouesvaVX.
29 See https://static.googleusercontent.com/media/research.google.com/fr//archive/mapreduce-osdi04.pdf.
30 See http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf.
31 See www.tensorflow.org.
32 See https://cmusatyalab.github.io/openface/.
33 Essential Guide to Object Monitors, Karen Boucher, Fima Katz, Wiley, 1999.
35 SAGAS, Hector Garcia-Molina, Kenneth Salem, Department of Computer Science, Princeton University, 1987.
36 Immutable Infrastructure: Considerations for the Cloud and Distributed Systems, Josh Stella, O’Reilly Media, Inc., 2016.
37 Beyond the Twelve-Factor App, Kevin Hoffman, O'Reilly Media, Inc., 2016.