SOA, the Bezos mandate and TOGAF
Copyright 2014-17 Graham Berrisford. One of about 300 papers at http://avancier.website. Last updated 20/01/2019 11:18.
This paper serves as background and footnotes to a companion Microservices paper (which has been downloaded > 10,000 times).
Many will see the history of SOA as peripheral to EA.
But software design fashions do have an impact on the enterprise application portfolio, and sometimes affect business processes.
"Many will see this as peripheral to EA" is very true, and IMO
is one of the biggest reasons that EA is failing.
Like it or not EA is almost always
about utilising advances in technology to redesign the services provided by the
organisation.
So, an EA team should in my opinion
understand all the fundamentals you have outlined in the paper.”
Comment in a LinkedIn discussion, 31/08/2017, by an Enterprise Architect with 20+ years’ experience in software development as a programmer and an architect.
Contents
RPC: Distributed programming (1980s)
DO: Distributed Objects (c1990)
CBD: Component-Based Design (late 1990s)
SOAP and SOA: Service-Oriented Architecture (c2000)
REST: Representational State Transfer (2003)
The Bezos mandate (2002, but not widely known until later)
Microservices architecture (2010s)
Conclusions and remarks wrt EA
"The beginning of wisdom for a computer programmer is to recognise the difference between getting a program to work and getting it right" M.A. Jackson (1975).
What makes a software architecture good or right?
Ideally, an architecture is simple: it meets requirements more elegantly and economically than alternative designs.
It is also agile: it facilitates change to meet future requirements.
But those two objectives often conflict.
Most agree on one principle: a software component should be encapsulated.
It should be defined primarily by its input/output interface, by the discrete events it can process and the services it can offer.
But that doesn’t get us very far, because we have to make a series of modular design decisions.
· What is the right size and scope of a component?
· How to avoid or minimise duplication between components?
· When and how to separate or distribute components?
· How to integrate components?
Enterprise application design is often discussed in terms of technologies.
But abstract away from the technologies and you see a story of modular design.
In the 1960s, Larry Constantine introduced the concepts of:
· Module cohesion: the degree to which a module’s internal contents are related.
· Module coupling: the degree to which a module is related to other modules.
Strong cohesion within a module and low coupling between modules were considered good things.
In 1972, Parnas wrote that a module should encapsulate a data structure and operations on it.
By the mid 1970s, several modular design principles were widely recognised.
Design principle | Meaning
Cohesion | Components encapsulate cohesive data structures and algorithms
Loose coupling | Components are encapsulated by and communicate via interfaces
Composability | Components can be invoked and orchestrated by higher processes
Reusability | Components are designed for use by different clients
Maintainability | Components can be maintained in several versions to enable incremental change
Back then, still in the age of the mainframe computer, software design gurus added further advice on modularisation.
In 1975, Michael A Jackson showed how to divide a program into modules that handle different data structures.
In 1979, Keith Robinson proposed dividing an enterprise application into three client-server layers (external, conceptual, internal).
Since the 1970s, the IT industry has continually revisited modular design and integration concepts and principles.
Many architectural styles or patterns have been promoted.
Each is defined by some core ideas, presumptions and constraints.
Local procedure calls were fine as long as the code remained on one computer, but were inadequate for the distributed client-server computing that followed.
Local Procedure Calls (LPC)
In the days when people wrote code for execution on a single computer, communication was via simple local procedure calls.
Many business systems were computerised using COBOL.
COBOL subroutines were contained inside a COBOL program, and could be “performed” from any point within that one program.
COBOL modules were deployed separately, and could be “called” from any point in any other COBOL program.
A COBOL module was encapsulated behind a kind of API.
(I always imagined communication between separately deployed COBOL modules was little more than a GO TO and a GO BACK.
A reader tells me that LPC calls between executing programs might be via undocumented backdoor interfaces of Windows, or inter-process pipes in UNIX.)
Remote Procedure Calls (RPC)
In a distributed client-server application, client components connect to server components over a network.
In the 1980s, they did this using some kind of RPC technology.
RPC was more complex than LPC, since it raised new issues of availability and security.
RPC was fine for many, but was deprecated in this Vinoski paper.
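To make the style concrete, here is a minimal sketch of an RPC exchange using Python’s standard xmlrpc module (a modern stand-in for the RPC technologies of the 1980s; the add operation and port number are invented for illustration). The client calls what looks like a local procedure, and the library marshals the call across the network:

```python
# server side - exposes a procedure for remote invocation
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_function(add, "add")
# server.serve_forever()  # uncomment to run the server

# client side - invokes the remote procedure as if it were local
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
# print(proxy.add(2, 3))  # marshals the call over HTTP; prints 5
```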
The “OOP revolution” started with some presumptions inherited from traditional modular design.
Like Parnas (above), Bertrand Meyer wrote that classes (abstract data types) should encapsulate data structures.
And he proposed that data-centric objects should be coordinated by higher level processes (this was later called orchestration).
Later OO gurus proposed that domain objects should interact to perform a process, with no overarching control procedure (this was later called choreography).
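The distinction can be sketched in a few lines of Python (the class and event names here are invented for illustration). In orchestration, one controller procedure owns the sequence; in choreography, objects subscribe to events and no single module sees the whole process:

```python
# Orchestration: a controller coordinates passive domain objects.
class OrderController:
    def place_order(self, order, stock, billing):
        stock.reserve(order)    # the controller decides the sequence
        billing.charge(order)

# Choreography: domain objects react to published events;
# there is no overarching control procedure.
class EventBus:
    def __init__(self):
        self.handlers = {}
    def subscribe(self, event, handler):
        self.handlers.setdefault(event, []).append(handler)
    def publish(self, event, payload):
        for handler in self.handlers.get(event, []):
            handler(payload)

bus = EventBus()
bus.subscribe("order_placed", lambda order: print("stock reserved for", order))
bus.subscribe("order_placed", lambda order: print("payment charged for", order))
bus.publish("order_placed", "order-42")
```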
The early OOP case studies were small in-memory applications, often real-time process control applications.
To build larger client-server applications people needed to distribute the objects across a network.
So, RPC evolved into the form of Object Request Brokers (ORBs).
The idea was to:
· code client and server objects as though they communicate by LPC on a single computer.
· tell the ORB where client and server objects are deployed on different computers.
· let the ORB manage all cross-network communication at run time.
Using an ORB means that client and server objects are still tightly coupled in some ways.
During the 1990s, architects concluded:
· objects are too small to be distributed.
· inheritance is a limited and fragile reuse mechanism, unusable in distributed systems.
· we need other ways to modularise enterprise applications that maintain large databases.
You may know Martin Fowler’s first law of distributed objects: “Don’t distribute your objects.”
When the object-oriented design craze faded, the talk was of the need to compose applications from larger components.
Out of this emerged component-based design (CBD).
CBD modularised an application into “business components” - a direct ancestor of microservices.
Middleware vendors had been frustrated by the impracticality of distributing fine-grained objects (and by implication, classes).
They leapt on the CBD bandwagon and promoted the idea that somewhat larger components should communicate via middleware.
For some, loose-coupling via messaging has become a mantra, but there are always trade offs, as this table shows.
Couple via synchronous request/reply for | Decouple via asynchronous messages/events for
Speed, simplicity, consistency | Scale (very high volume and/or throughput), adaptability/manageability (of smaller distributed components), availability (of smaller units of work) and partition tolerance
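The right-hand column can be illustrated with a minimal sketch, using Python’s standard queue and threading modules as a stand-in for messaging middleware (the message text is invented). The sender enqueues a message and carries on without blocking for a reply; a consumer processes it later:

```python
import queue
import threading

message_queue = queue.Queue()  # stands in for a message broker

def consumer():
    while True:
        message = message_queue.get()  # blocks until a message arrives
        if message is None:            # sentinel value: shut down
            break
        print("processed:", message)

worker = threading.Thread(target=consumer)
worker.start()

# The sender is decoupled in time: it is not blocked waiting for a reply,
# but it gets no immediate confirmation and no immediate consistency.
message_queue.put("update customer 42")
message_queue.put(None)
worker.join()
```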
By the end of the 1990s, loose-coupling had become the mantra of software architects.
Microsoft deprecated the constraints of connecting distributed objects using object request brokers like DCOM.
Instead, they advocated service-oriented architecture (SOA) as a more loosely-coupled kind of modular design and integration style.
Their core ideas might be distilled as:
· Clients send data in XML documents
· Clients invoke operations they find in WSDL-defined interfaces
· Clients use the protocol called SOAP over standard internet protocols, usually HTTP.
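For illustration, here is a minimal sketch of such an invocation using only Python’s standard library (the endpoint, XML namespace and GetOrderStatus operation are invented; a real client would generate the envelope from a WSDL definition):

```python
import urllib.request

# A minimal SOAP 1.1 envelope wrapping one operation call as XML.
envelope = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetOrderStatus xmlns="http://example.com/orders">
      <OrderId>42</OrderId>
    </GetOrderStatus>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    "http://example.com/orderService",  # hypothetical endpoint
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "http://example.com/orders/GetOrderStatus"},
)
# response = urllib.request.urlopen(request)  # returns a SOAP XML response
```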
People observed that SOAP (originally the Simple Object Access Protocol) was neither simple nor object-oriented.
The SOAP standard is now maintained by the XML Protocol Working Group, and SOAP is no longer an acronym; it is merely a name.
Other gurus generalised SOA in terms of more technology-independent design principles.
· A service is a component that can be called remotely, across a network, at an endpoint.
· A component may act as a client/consumer and/or server/provider of data.
· Components typically exchange request and response data in the form of self-describing documents.
· Components make few if any assumptions about the technological features of other components.
SOA overturned some presumptions of distributed objects, as this table indicates.
Feature | Early distributed objects presumptions | Later SOA design presumptions
Naming | Clients use object identifiers; clients and servers in one name space | Clients use domain names; clients and servers in different name spaces
Paradigm | Stateful server objects/modules; reuse by OO inheritance; intelligent domain objects (choreography) | Stateless server objects/modules; reuse by delegation; intelligent process controllers (orchestration)
Time | Request-reply invocations; blocking servers | Asynchronous message/event queues; non-blocking servers
Location | Remember remote addresses | Use brokers/directories/facades
More generally, SOA principles can be seen as extending those of 1970s modular design and 1990s CBD.
1970s module design principles | 2000s SOA principles | Meaning
Cohesion | Abstraction | Components encapsulate cohesive data structures and algorithms
Loose coupling | Loose coupling | Components are encapsulated by and communicate via interfaces
Composability | Composition | Components can be invoked and orchestrated by higher processes
Reusability | Reuse | Components are designed for use by different clients
Maintainability | Maintainability | Components can be maintained in several versions to enable incremental change
(none) | Statelessness | Server-side components do not retain data in memory between invocations
(none) | Autonomy | Components are deployed and operate on their own
Three common presumptions in a SOA are:
· Client components are decoupled from server component locations and technologies.
· A client component makes asynchronous invocations using an IP network and/or messaging technologies.
· A server component can handle many clients at once (this “non-blocking” behaviour is desirable for performance reasons).
Messaging tools
Middleware vendors stopped talking about CBD and presented their products as SOA tools.
They made some effort to comply with “web service standards”.
Then, each vendor defined SOA in terms of the features provided by their tool!
Messaging technologies can:
· consume service requests and direct them to service providers
· orchestrate services in workflows
· act as a broker between choreographed services
· authenticate messages, log messages, etc.
But there are many who deprecate using messaging tools where RPC or REST fits the bill.
Roy T. Fielding, in his Ph.D. thesis, formalised the ideas behind web protocols and invented the term REST.
Representational State Transfer (REST) means the state of a server component is represented in messages - using internet-friendly text data formats, which can include hyperlinks.
REST supports the general principles of SOA with more specific guidance.
It is a set of principles for using web standards to invoke operations acting on remote resources.
It suggests ways to modularise and integrate application components using web standards.
It encourages you to rethink how you structure server-side components/resources.
It takes advantage of hyperlinks and the web protocols used over ubiquitous TCP/IP networks.
A RESTful architecture contains RESTful client components.
Every resource of a software application (Web Service, web site, HTML page, XML document, printer, other physical device, etc.) is named as a distinct web resource.
A client component can only call a server component/resource using the operations available in a standard protocol - usually HTTP.
This decouples distributed components; it means a client component needs minimal information about the server resource.
A REST-compliant architecture contains REST-compliant server components/resources.
A REST-compliant server component/resource can offer only the services named in an internet protocol.
Given there are fewer operations (verbs) per component, there must be more components (nouns).
One may have to divide a large data resource into many smallish elements (many nouns).
Clients must orchestrate or integrate those smallish resources.
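A minimal sketch of the client side (Python standard library; api.example.com and the customer resource are invented for illustration). Note that the client manipulates a named resource using only standard HTTP verbs:

```python
import json
import urllib.request

BASE = "https://api.example.com"  # hypothetical RESTful server

def get_resource(path):
    # GET: fetch a representation of the resource's current state
    with urllib.request.urlopen(BASE + path) as response:
        return json.load(response)

def put_resource(path, state):
    # PUT: replace the resource's state with the supplied representation
    request = urllib.request.Request(
        BASE + path,
        data=json.dumps(state).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return urllib.request.urlopen(request)

# customer = get_resource("/customers/42")
# put_resource("/customers/42", {"name": "A. Customer", "status": "active"})
```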
By now, the term web service no longer implied a WSDL-defined interface.
Any resource accessible over the web (perhaps using a RESTful invocation) was called a web service.
More recently, REST was extended by the Open Data Protocol (OData), created by Microsoft in 2007 and standardised by OASIS in 2014.
This standard helps clients access any remote data server wrapped up behind an OData interface.
It defines best practices for building and consuming RESTful APIs in that context.
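For example (a hypothetical endpoint and entity set; $filter, $orderby and $top are standard OData system query options), a client can query a wrapped data store through URL conventions alone:

```python
from urllib.parse import quote

# Build an OData query URL against a hypothetical Products entity set.
url = ("https://example.com/odata/Products"
       "?$filter=" + quote("Price lt 10") +   # rows where Price < 10
       "&$orderby=Name&$top=5")               # first five, sorted by name
print(url)  # a plain HTTP GET of this URL returns the matching entities
```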
SOAP is a technology standard for client components making remote procedure calls to server components.
REST might better be seen as a theory that happens to be associated with some technology standards.
Read this paper for more details of SOAP and REST, and the choice between them.
FAANG is an acronym for five large IT-centric businesses, namely Facebook, Apple, Amazon, Netflix and Alphabet’s Google.
To which you might add eBay and Spotify.
“Hard cases make bad law” is the legal maxim that a general law is better drafted for average or common circumstances.
Few businesses are like the FAANG companies.
Whatever those IT giants do to handle extreme business transaction volumes is not necessarily optimal for your business.
But one thing done at Amazon seems widely accepted as good practice.
Jeff Bezos famously issued an executive order to Amazon’s software teams; his Big Mandate went something along these lines:
· All teams will henceforth expose their data and functionality through interfaces.
· Teams must communicate with each other through these interfaces.
· There will be no other form of inter-process communication: no direct linking, no direct reads of another team's data store, no shared-memory model, no back-doors whatsoever.
· The only inter-team communication allowed is via interface calls over the network.
· It doesn't matter what technology teams use: HTTP, CORBA, Pubsub, custom protocols. It really doesn't matter; Bezos doesn't care.
· All interfaces, without exception, must be designed from the ground up to be externalizable, exposable to developers in the outside world. No exceptions.
· Anyone who doesn't do this will be fired.
The Bezos mandate refers to interfaces between teams; it does not indicate the size of a team or what it maintains.
How big and complex is the system behind a team’s API?
Does a team maintain one large monolithic application? or several? or divide each application into “microservices”?
Does it deploy those microservices on one server, or many servers?
A few lessons learned
For more on the Bezos mandate
https://gigaom.com/2011/10/12/419-the-biggest-thing-amazon-got-right-the-platform/
https://plus.google.com/u/0/+RipRowan/posts/eVeouesvaVX?hl=en
“Amazon learned a tremendous amount while effecting this transformation.
Teams learnt not to trust each other in most of the same ways they’re not supposed to trust external developers.
Assigning an incident is harder, because a ticket might bounce through 20 service calls before the real owner is identified.
If each bounce goes through a team with a 15-minute response time, it can be hours before the right team finally finds out, unless you build a lot of scaffolding and metrics and reporting.
Every team suddenly becomes a potential Denial of Service attacker, so quotas and throttling must be put in place on every service.
Monitoring = automated QA, since to tell whether a service is responding to all invocations, you have to make individual calls.
The problem continues recursively until your monitoring is doing comprehensive semantics checking of your entire range of services and data.
With hundreds of services, you need a service-discovery mechanism, which implies also service registration mechanism, itself another service.
So Amazon has a universal service registry where you can find out reflectively (programmatically) about every service, what its APIs are, and also whether it is currently up, and where.
Debugging problems with someone else’s code gets a LOT harder, and is basically impossible unless there is a universal standard way to run every service in a debuggable sandbox.”
A microservices architecture divides an application into modules or components - confusingly called micro services.
It might be seen as an application of modular design, OO design, CBD or SOA principles.
Microservices might be distributed and integrated over a network, but this is not essential.
Martin Fowler (2014) defined microservices in terms of nine characteristics or principles.
The first six are about how best to scope a module, separate modules and integrate modules.
The last three principles are drawn from the wider agile development movement.
For discussion, read this Microservices paper.
Of course, it is quicker to design and build a smaller system than a larger system.
But generally speaking, the smaller and simpler the system, the less it can do.
So, the trade-off is that smaller systems mean more systems and more system integration.
Pattern | Component size | Process length | Message quantity
Macro pattern | larger components perform | longer processes | with less inter-component messaging
Micro pattern | smaller components perform | shorter processes | with more inter-component messaging
The granularity of components makes a difference to how they are best deployed and integrated.
Smaller and simpler modules mean larger and more complex messaging.
Separating modules leads to slower processing and disintegrity, and therefore to the extra complexity of compensating transactions to restore data integrity (the CAP theorem ignores this simple/complex trade-off).
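A minimal sketch of a compensating transaction (in-memory stubs stand in for two microservices with separate data stores; all names are invented). When the second step fails, the first must be explicitly undone, work that a single ACID transaction would do for free:

```python
class PaymentsService:
    def __init__(self):
        self.charges = {}
    def charge(self, order_id, amount):
        self.charges[order_id] = amount
        return order_id
    def refund(self, charge_id):
        del self.charges[charge_id]           # the compensating action

class StockService:
    def reserve(self, order_id):
        raise RuntimeError("out of stock")    # simulate a downstream failure

payments, stock = PaymentsService(), StockService()

def place_order(order_id, amount):
    charge_id = payments.charge(order_id, amount)
    try:
        stock.reserve(order_id)
    except RuntimeError:
        payments.refund(charge_id)  # undo step one to restore integrity
        raise

# place_order("order-42", 9.99)  # raises, leaving payments.charges empty again
```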
Other trade offs include simplicity versus flexibility, and scalability versus integrity.
It is important to recognise that design principles are not mandatory goals, only means to some ends.
And to consider what trade offs have to be made between requirements.
Loose-coupling is not a singular or simple concept, and is not always desirable.
Decoupling is an important tool of design; but it means many things, and can be overdone.
The architectural problem is not so much coupling, as coupling between modules that are unstable in some way (as Craig Larman said).
Must a client component make asynchronous invocations?
In practice, clients often need to make request-reply invocations.
So while an invocation may be asynchronous at the level of the application protocol, it is synchronous to the component designer and user.
Must a client component make invocations across a network?
The Bezos mandate insists inter-team communication is via APIs exposed across the network.
Mandating the same for all inter-microservice communication may hinder performance and increase complexity.
What are the implications for network traffic, network costs and network monitoring?
Further, one is forced to use defensive design techniques (since microservices don’t trust each other).
An architect told me agile development of distributed microservices in his enterprise had led to wasteful duplication of data and code.
Must client components be decoupled from server component technologies and locations?
In a client-server technology stack, components are usually de-coupled in those two ways.
But finer-grained components within one tier are commonly co-located and coded using the same technology.
Must one application be decomposed into microservices that communicate via messaging?
Suppose you insist on messaging between fine-grained microservices coded using the same technology on the same server?
An architect on a training course told me:
· His enterprise has a "microservices dependency nightmare" featuring c200 microservices on c15 app servers.
· Middleware is hugely overused between microservices, logging a ridiculous number of events that are of no interest to the business.
· Some or many microservices would be better closely-coupled or combined, and deployed on one or very few servers.
· So, they are looking to strip the middleware out of their primary business application as far as possible.
Physical decoupling using network and/or messaging technologies is not logical decoupling.
Often, a microservices architecture involves dividing what could be one coherent data store.
But dividing a coherent database schema into smaller (still related) schemas doesn’t reduce coupling.
It merely moves the coupling from the database to the messaging system, at some cost to performance, if not also integrity and availability.
So, mandating that all microservices communicate asynchronously can increase complexity and disintegrity, and have unwanted side effects.
There are many ways to integrate application components.
OK, message brokers and workflow engines can be useful.
But business rules are not best placed in the middleware.
And the Bezos mandate does not presume middleware use.
Must a server component handle many clients at once?
This “non-blocking” behaviour is desirable for performance reasons.
But it is unnecessary where throughput is low.
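A minimal sketch of non-blocking behaviour, using Python’s standard asyncio (the one-line echo protocol and port are invented for illustration). One thread holds many open connections, because a handler yields control whenever it waits on I/O:

```python
import asyncio

async def handle_client(reader, writer):
    data = await reader.readline()   # yields control while waiting, so
    writer.write(b"ack: " + data)    # other clients are served meanwhile
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle_client, "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()

# asyncio.run(main())  # uncomment to run the server
```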
Scalability
Scalability is sometimes an issue.
But an ordinary RDBMS can handle 20,000 transactions per second.
Module design patterns
Domain-oriented modularity has its place.
But some gurus have encouraged the writing of over-complex OO code on app servers.
Transaction scripts and subroutines thereof can be fine; and some code is better on data servers.
Things to think about include:
· Physical decoupling makes logical coupling more complex.
· Naïve architecture guidance, which can mandate decoupling, asynchronicity and scalability where they are not needed.
· Response time – where one transaction requires microservices that communicate via network and/or messaging.
· Availability – where synchronous access is needed to partitioned/distributed data stores.
· Scope/complexity creep – where microservices are increasingly hooked together.
· Business data and process integrity – where BASE replaces ACID.
· Application maintenance – where multiple design patterns and technologies are used.
· Best practice use of CQRS/Event Sourcing.
Isolating components, making them “autonomous”, can lead to data and process disintegration.
EA was a response to silo system proliferation (read “50 years of Digital Transformation and EA” for some history).
Themes in EA literature include data and process quality, data and process sharing, data and process integration.
This table shows how agile development can lead to difficulties for EA.
Agile developers’ dream | Enterprise architects’ nightmare
Smaller, simpler modules | Larger, more complex module dependency & integration
Loosely-coupled autonomous modules | Duplication between modules and/or difficulties integrating them
Data store isolation | Data disintegrity and difficulties analysing/reporting data
EAs ought to be wary of excessive decoupling of application components across a network and/or via middleware, since it can create needless disintegrities and complexities, not to mention needless costs and performance issues.
Above all, EAs need to be aware that excessive decoupling can hinder how customers or employees perform business processes.
Some use the terms system, component, process and service interchangeably.
Some use them with different (and sometimes contradictory) meanings.
Term | In some standards from The Open Group | In Fowler’s writings for programmers
Component | A structural building block that offers multiple services (a service portfolio) to its clients | A unit of software that is independently replaceable and upgradeable
Process | A behavior composed of process steps in a sequence that leads to a result | An instance of a component executed on a computer; it contains executable code and current state data (some OS allow it to contain multiple concurrent threads of execution)
Service | A discretely requestable behaviour that encapsulates one or more processes | An out-of-process component, typically reached by a web service request or RPC
Remember that Martin Fowler’s process and service correspond to application components in TOGAF and ArchiMate.
And TOGAF’s Function is a logical component that groups behaviors by some cohesion criterion other than in sequence (which is a process).
TOGAF has always been thoroughly service-oriented.
It regards an enterprise as a system, as a collection of encapsulated building blocks that interact.
All architecture domains and the building blocks within them are defined by the services they offer.
A service is a logical representation of a repeatable activity that has a specified outcome.
A service is first named, then lightly described, and eventually defined in a service contract.
A service contract makes no reference to the internal workings of any building block that provides it (or other building blocks it depends on).
So, the Architecture Continuum and Architecture Repository contain Architecture Building Blocks defined by the services they offer.
The ADM is a method for service-oriented architecture development.
In phases B and C, contracts for Business and IS Services are recorded in the Architecture Requirements Specification.
In phase C, the III-RM provides a service-oriented design pattern for the applications architecture domain/layer.
In phase D, the TRM catalogues the foundation Platform Services provided by the technology architecture domain/layer.
The Open Group offers a definition of SOA that is unsatisfactory in a TOGAF context, for the reasons below.
“SOA is an architectural style that supports service-orientation”
In other words, SOA is an A that supports SO; but you cannot define words using the same words.
“Service-orientation is a way of thinking in terms of services”
This sounds like floundering in the face of a concept too vague to be defined.
“and service-based development.”
TOGAF is not about software development; however, the ADM can well be called a method for service-oriented architecture development.
“and the outcomes of services.”
That is a more promisingly meaningful definition element.
“A service is a logical representation of a repeatable business activity that has a specified outcome.”
This is much better.
· “A logical representation” implies a service contract defining inputs, preconditions, outputs and post-conditions.
· “A repeatable business activity” implies a discrete behavior.
· “Has a specified outcome” implies a discretely requestable service delivering a result.
However, the term “repeatable business activity” is misleading, because TOGAF’s services include IS Services and Platform Services.
“It is self-contained. It is a black box for its consumers.”
This is not wrong, but it risks people confusing a service with a component/building block, which is a self-contained black box.
In TOGAF and ArchiMate standards, the “services” are discretely requestable behaviors.
Each is a response to a query or command, and definable declaratively in a service contract.
“It may consist of other underlying services.”
It doesn’t really “consist of” underlying services; services are external views and know nothing of other services.
You could however say a service is implemented by a process, which might in turn invoke other services.
All free-to-read materials at http://avancier.website are paid for out of income from Avancier’s training courses and methods licences.
If you find the web site helpful, please spread the word and link to avancier.website in whichever social media you use.