Microservices (and decentralised data management)

Copyright 2014-17 Graham Berrisford. One of about 300 papers at http://avancier.website. Last updated 27/03/2017 20:19

 

New to the web site above? This paper supports one session in our ESA training courses.

It was initially written in June 2014 with reference to what Martin Fowler (a brilliant writer) then said about microservices on his web site.

It draws from discussions with software architects about difficulties they are having with microservices.

And from discussions with EA teams about the need to govern “agile” enterprise application development.

The paper discusses several varieties of a microservices architecture.

In short, there is no silver bullet, you can’t have everything, there are always trade offs.

Contents

Preface

Modularisation principles and trade offs

Microservices as server-side application components

True (and other) microservices

Issues raised by microservices

ACID, BASE, CAP and complexity

Lessons from CBD history

Particular conclusions and remarks

General conclusions and remarks

Footnotes

 

Preface

 

Why this paper?

First, to make sense of the term “microservice”.

E.g. one architect complained that his CIO was already using the term as meaninglessly as “SOA” has been used for many years.

Second, to steer people away from badly-designed, hard-to-maintain and poorly-performing microservices.

And third, to propose that enterprise and solution architects are responsible for governing software architects.

 

If every component or API is called a microservice, then the term adds no meaning.

Every microservice is an application component, but not every component is a microservice.

Every microservice has an API, but not every API is a microservice.

 

Here, a true microservice:

·         is a micro application, encapsulated behind an API

·         is a discretely deployable subdivision of what could be a monolithic application (a macroservice?)

·         encapsulates one partition of what could be a monolithic data structure, and so “decentralises data management”.

 

Some say microservices should be “isolated” and “autonomous” – but this ranges from questionable to impossible where data integrity is needed.

Some say dividing a monolithic application into microservices should not affect the services offered to clients - but it might if microservices are decoupled too far.

Some apply the term “microservice” to things that are not true microservices – this might be OK (see below) but should be done knowingly.

 

Vocabulary hell

Beware the terminology clashes that are the curse of every writer in this field.

In EA standards from The Open Group:

·         Component: A structural building block that offers multiple services (a service portfolio) to its clients.

·         Process: A behavior composed of process steps in a sequence that leads to a result.

·         Service: A discretely requestable behaviour that encapsulates one or more processes.

In Fowler’s writings for programmers:

·         Component: A unit of software that is independently replaceable and upgradeable.

·         Process: An instance of a component executed on a computer.

·         Service: An out-of-process component, typically reached by a web service request or RPC.

 

So in EA terms, a microservices architecture would be better called a microcomponent architecture.

Ideally, the services offered by the wider monolithic application remain unaffected by its division into microservices; in practice, they might be affected.

 

Why microservices?

Motivations that have been advanced for microservices include:

 

-1- To maintain data stores that are already distributed and separately managed.

This is not interesting here, because it describes the application portfolio most enterprises already have.

And integrating discrete applications is a system integration task as it ever has been.

 

-2- To separate what must be coded using different technologies.

This is not interesting here, because such application components are naturally discrete.

And it seems advisable not to mix technologies unless you have to.

 

-3- To enable very high throughput (by processing transactions in parallel components).

Exceptional requirements require exceptional designs, for which a price must be paid.

Few enterprise applications have a throughput high enough to require division to parallel microservices.

Solid state drives and in-memory data storage can handle remarkably high transaction volumes.

 

-4- To facilitate agile development.

This seems the most common motivation, and the one Fowler clearly had in mind.

The aim is to help one person or a small team develop and deploy a microservice (separately from others) quickly.

 

Relevance to enterprise architecture?

EA was a response to silo system proliferation (read “50 years of Digital Transformation and EA” for some history).

Themes in EA literature include data and process quality, data and process sharing, data and process integration.

Making microservices “isolated” and/or “autonomous” means creating silo systems that disintegrate data and processes.

Publishing "decoupling" as a general principle can accidentally encourage excessive decoupling of software components across a network and/or via middleware.

EAs ought to be concerned about creating needless disintegrities and complexities, not to mention needless costs and performance issues.

Above all, EAs need to be aware that excessive decoupling can hinder how customers or employees perform business processes.

Modularisation principles and trade offs

 

Five modular design principles (1970s)

Enterprise application design is often discussed in terms of technologies.

But abstract away from the technologies and you see a story of modular design.

In the 1960s, Larry Constantine introduced the concepts of:

·         Module cohesion: the degree to which a module’s internal contents are related.

·         Module coupling: the degree to which a module is related to other modules.

 

Strong cohesion within a module and low coupling between modules were considered good things.

By the 1970s, several modular design principles were widely recognised.

 

Design principles and their meaning:

·         Cohesion: a module encapsulates a cohesive data structure or algorithm.

·         Loose coupling: modules are encapsulated by and communicate via interfaces.

·         Composability: modules can be readily invoked and orchestrated by higher processes.

·         Reusability: modules are designed for use by different clients.

·         Maintainability: modules can be maintained in several versions to enable incremental change of module clients.

 

Loose-coupling is often promoted as though it is always a good thing, but it turns out that:

·         Coupling is not a simple concept (our course explains a dozen ways modules may be coupled or decoupled).

·         It is usual to couple finer-grained components more closely than coarser-grained components.

·         The problem is not so much coupling, as coupling between modules that are unstable in some way (after Craig Larman).

 

The discussion ever since the 1970s has been about how best to scope a module, separate modules and integrate modules.

 

Nine microservices characteristics (Fowler)

Fowler defines microservices in terms of nine characteristics, most of which amount to principles.

The first six are about how best to scope a module, separate modules and integrate modules.

·         Componentization (discussed in this paper)

·         Decentralized data management (discussed in this paper)

·         Decentralized governance

·         Organisation around “Business Capabilities”

·         Smart endpoints and dumb pipes

·         Design for failure and "You build, you run it"

 

The other three are drawn from the wider agile development movement.

·         Products not Projects

·         Infrastructure Automation

·         Evolutionary Design

 

Trade offs

Principles are not goals, only a means to an end.

Of course, it is quicker to design and build a small subsystem than a large system.

But the smaller and simpler each subsystem, the less useful it is, and the more complex the integration between subsystems.

This table lists Berrisford’s universal modularisation trade offs.

Agile developers’ dream          Enterprise architects’ nightmare

Smaller, simpler modules         Larger, more complex module dependency & integration

Module isolation/autonomy        Duplication between modules and/or difficulties integrating them

Data store isolation             Data disintegrity and difficulties analysing/reporting data

 

There are other trade offs to be considered, for example, between flexibility and simplicity, or between scalability and integrity.

Microservices as server-side application components

c1980, an EA vision was a central database, supporting one application that served all the enterprise’s business functions.

That vision became unachievable as enterprises deployed ever more applications, which were increasingly distributed.

The vision evolved: EA divided the enterprise into “segments”; and the centralised data vision was recast as “single version of the truth”.

EA now involves “master data management” within each segment of the enterprise, and sometimes across segments.

EA expects each enterprise segment to be supported and enabled by many applications; some on-premises and some in the cloud.

 

Much as an enterprise may be divided into segments, an enterprise application may be divided into application components.

A typical client-server enterprise application processes successive transactions and is divided into three layers.

Microservices divide the server-side of what could be one large application into components that are supposedly autonomous.

By “decentralising data management” Fowler implies dividing what could be one persistent data structure into parts.

The idea is that each microservice encapsulates the data it needs; it does not directly access the data maintained by any other microservice.

Each is an application component that can perform one or many transactions or partial transactions.

 

Size matters!

You might see microservices as a scaling down of the “Bezos mandate” (footnote 2) and/or Service-Oriented Architecture (footnote 3).

In other words, as extending the principle that components are distributed across a network and communicate via APIs.

However, microservices do not have to be distributed across a network.

Other modularisation strategies are no less service-oriented; and SOA does not say how to divide an application into modules.

 

You might better view microservices as a scaling up of OO design principles.

In other words, as a response to complaints that the granularity of “objects” is too small for architecture-level design.

 

Hierarchically layered client-server architecture

An enterprise application enables or supports one or more business roles and processes.

It encapsulates some business rules, business knowledge or business capability.

A few enterprise applications are purely algorithmic, or need very little stored data.

But the majority maintain, or at least need access to, a substantial business data store.

 

Fowler says an enterprise application is typically composed of three horizontal layers, as shown below.

·         Client-side user interface: typically HTML pages and JavaScript running in a browser on the user's machine.

·         Server-side application: controllers handle requests/commands from the client; views populate views/pages sent to the client; models retrieve and update data, applying business rules.

·         Data store: a persistent data structure maintained using a database management system; typically a relational DBMS, but there are many data store varieties.

 

Dividing an application into three layers dates back to the 1970s - the days of COBOL and pre-relational databases.

The layers form a client-server hierarchy, meaning each layer requests services from the layer below.

Despite the OOPers’ mantra that business rules belong in the middle layer, many regard the data layer as the best place for data-centric business rules.

Middleware may be used to connect layers, but Fowler says "smart endpoints, dumb pipes", meaning don't put business rules in middleware.

 

Dividing the server-side application

Fowler says the server-side application is typically a single executable.

One large application serves many requests/commands from clients.

Moreover, it is seen as a single software development unit.

So changes to requests/commands involve building and deploying a new version of the whole server-side application.

 

Microservices are subdivisions of the server-side application layer.

Each microservice receives a subset of the service requests that clients make of the whole application.

The ideal is that each of these application components can be built and tested in parallel.

 

[Figure: a client-side user interface (screens/pages) served by a server-side application divided into two microservices: Customers, Agents, Sales; and Reservations, Rooms, Hotels.]

 

How to divide one application into microservices? Fowler suggests:

·         “Organisation around Business Capabilities” according to Domain-Driven Design (see footnote 1).

·          “Decentralized data management”, which is discussed in this paper.

 

“At a first approximation, we can observe that microservices map to runtime processes, but that is only a first approximation.” Fowler

 

Client components (of any kind) do not have to invoke a microservice via a message broker.

Microservices can be invoked using web service requests, the OData protocol, or whatever suits.

The closer microservices are adjoined in location and time, the more likely synchronous request-reply invocation will be appropriate.

 

Dividing the data store

Microservices imply not only dividing the application-layer processing of what could be one cohesive data structure, but also partitioning that data structure to match the division of the application layer into microservices.

 

“A microservice may consist of multiple processes that will always be developed and deployed together, such as an application process and a database.” Fowler

In other words, an application layer microservice cannot work without its data store.

 

To keep things simple, let us minimise the need for a data abstraction layer between the application and data layers.

In other words, the class diagram of the application layer and the data model of the data layer are variants of one logical domain model, and are divided the same way.

 

[Figure: a client-side user interface (screens/pages), a server-side application divided into two microservices (Customers, Agents, Sales; and Reservations, Rooms, Hotels), and a data store partitioned the same way.]

 

If you decouple the microservices’ data structures, and store them separately, then you need a cross-reference between them.

At least one entity type is duplicated in adjacent microservices' data stores, with a common primary key but different attributes and relationships.

That is an old and generally applicable idea; it can be applied however you choose to divide work between horizontal layers.

 

On division of work between horizontal layers

Fowler seems to presume that, regardless of division into vertical partitions, all business logic sits in the application layer.

However, a data model captures business-specific terms, concepts, data types, data derivation and data integrity rules.

And coding data-centric rules next to the data is a rational design strategy.

So it is common to find data-centric business rules in the data layer.

Especially basic consistency/integrity rules such as: no row can be inserted into the Sale table unless there is a corresponding row in the Customer table.

And perhaps: no row can be removed from the Customer table if that Customer has Sales with future-dated Reservations.
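
To make that concrete, here is a minimal sketch (in Python with SQLite; the Customer and Sale tables are illustrative, not taken from any particular system) of the first rule above, enforced by the data layer itself:

import sqlite3

# A data layer that enforces a data-centric business rule itself:
# no row can be inserted into Sale without a corresponding Customer row.
db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # switch on referential integrity checks
db.execute("CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("""CREATE TABLE Sale (
                  id INTEGER PRIMARY KEY,
                  customer_id INTEGER NOT NULL REFERENCES Customer(id))""")

db.execute("INSERT INTO Customer (id, name) VALUES (1, 'Alice')")
db.execute("INSERT INTO Sale (id, customer_id) VALUES (10, 1)")  # accepted

try:
    db.execute("INSERT INTO Sale (id, customer_id) VALUES (11, 99)")
except sqlite3.IntegrityError as e:
    print("Rejected by the data layer:", e)  # no Customer row 99 exists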

True (and other) microservices

Macro applications might be seen as the enemy here, but a macro application is always modularised one way or another.

As Fowler says, most business applications handle a series of discrete business transactions (event-triggered processes).

And they maintain a cohesive data resource – describable as data entities related to each other in a network structure.

 

This table maps event-triggered business processes against the business data entities they access.

                  Register    Place     Complete   Launch     Recall
Entities          customer    order     sale       product    product

Customer          Create      Read      -          -          Read

Sale              -           Create    Update     -          Read

Sale item         -           Create    -          -          Update

Product type      -           Update    -          Create     Update

 

Event-oriented modularisation

You can modularise an application into one module for each event-triggered process to be completed.

That was the expectation when using structured analysis and design methods in the 1980s.

Duplication is removed by factoring out common sub-routines (done well, this gives the leanest design).

 

Procedural microservice (handles 1 whole transaction, has access to the whole data structure)

This is not a microservice of the kind Fowler discussed.

You can partition the application into one module for each command/transaction script.

A transaction is a process that has an access path through the persistent data structure.

You can both design and deploy each transaction script separately.

So, each is performed by a component that has complete responsibility for it.

 

Some transactions will contain duplicate parts - same access path and same code.

You can and should factor out any substantial common part into a reusable module.

(Aside: reuse can be classified as (a) direct, (b) copy and paste, or (c) copy, tailor and paste.
Two or more transaction scripts may directly reuse modules that they share.

You can analyse transaction scripts looking for common processing, and “factor out” shared modules.

And/or design modules from the bottom up to encapsulate entities or aggregate entities in the persistent data structure.

You’ll probably need a simple hierarchical module dependency diagram to track inter-dependencies.)
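
As a minimal sketch of the above (hypothetical tables and scripts, in Python for illustration only): two transaction scripts, each with complete responsibility for one whole transaction, directly reuse one factored-out module.

db = {"Customer": {}, "Sale": {}}  # stands in for the whole persistent data structure

def find_customer(customer_id):
    # Common sub-routine, factored out of the transaction scripts that share it.
    return db["Customer"][customer_id]

def register_customer(customer_id, name):
    # Transaction script 1: one whole business transaction.
    db["Customer"][customer_id] = {"name": name}

def place_order(customer_id, sale_id):
    # Transaction script 2: one access path through the data structure
    # (reads Customer, creates Sale), reusing the shared module.
    customer = find_customer(customer_id)
    db["Sale"][sale_id] = {"customer": customer["name"], "status": "open"}

register_customer(1, "Alice")
place_order(1, 10)
print(db["Sale"])  # {10: {'customer': 'Alice', 'status': 'open'}}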

Entity-oriented modularisation

You can modularise an application into one module for each data entity (cf. “object-based” design in the 1980s).

The entity modules are coordinated to complete event-triggered processes by:

·         Choreography: one entity module calls another, and so on, as need be, or

·         Orchestration: an event-triggered process invokes entity modules

·         A mix of both the above (using the GRASP pattern).

 

Modules that encapsulate more-or-less normalised entities are too small to be deployed separately as microservices.

Cf. Fowler’s first rule of distributed objects: “Don’t distribute your objects!”

Objects are too small, and coordinating them in higher level processes is horrendously inefficient.

 

However, Fowler proposes the design of microservices should decentralise data management and governance.

This implies a microservice encapsulates a data structure considerably larger than a normalised data entity.

If you are thinking of “aggregate entities” as in Evans’ Domain-Driven Design, then see footnote 1.

Another approach is to apply cluster analysis (the north-west corner method) to the table above to aggregate data entities acted on by the same events.

Either way, you can identify a cohesive data structure that is more substantial than a normalised data entity.

And then design an application component that will apply several transactions to several data entities.

 

Pseudo microservice (handles a cluster of whole transactions, has access to the whole data structure)

This resembles a true microservice in scope, but does not completely encapsulate the data.

A pseudo microservice is a bundle of transaction scripts that are managed as a discrete application layer component.

It processes all transactions that are clustered using some cohesion criterion.

E.g. cluster all transactions that start at the same data entity (or in the same aggregate entity).

 

Pseudo microservices are separately deployable application layer components.

They can be assigned to different individuals or teams for development and testing.

Thus, they can suit agile development, accepting that data management remains centralised.

 

Design options include: allowing pseudo microservices to compete for the same data (which limits scalability), or duplicating data needed by several microservices (which increases disintegrity and complexity).

 

True microservice (handles a cluster of whole and partial transactions, has access to one part of the whole data structure)

A true microservice encapsulates a discrete group of data entities - an aggregate entity or more.

Ideally, the data structure is sufficiently wide that a microservice can complete many (most?) service requests on its own.

 

There are recognised techniques for dividing an OOP domain model and dividing a data model.

·         App server layer: an OOP domain model, divisible into aggregate entities (see footnote 1).

·         Data abstraction layer: sits between the two layers.

·         Data server layer: a data model, divisible into hierarchical structures (linked by many-to-many link entities).

 

Ideally, the data server and app server layers of the application are partitioned in the same way.

Otherwise, the data abstraction layer between them will be complex, and undermine the hoped-for benefits of microservices.

 

What if one business transaction needs to access the data encapsulated by more than one microservice?

The business transaction can be completed by coordinating microservices by (see the sketch after this list):

·         Orchestration: an overarching workflow process invokes partial transactions in several microservices.

·         Choreography: one microservice calls another, and so on, as need be.

·         A mix of both the above (using the GRASP pattern).
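
A minimal sketch of the orchestration option (hypothetical sales and hotel microservices, in Python; not any real system's API): an overarching workflow invokes a partial transaction in each microservice, and undoes the first if the second fails.

class RoomUnavailable(Exception):
    pass

class SalesService:
    # One microservice: encapsulates sales data behind its API.
    def __init__(self):
        self.sales = {}
    def create_sale(self, sale_id):
        self.sales[sale_id] = "open"
    def cancel_sale(self, sale_id):
        self.sales[sale_id] = "cancelled"

class HotelService:
    # Another microservice: encapsulates room data behind its API.
    def __init__(self):
        self.free_rooms = {101}
    def reserve(self, room):
        if room not in self.free_rooms:
            raise RoomUnavailable(room)
        self.free_rooms.remove(room)

def reserve_room_for_sale(sales, hotel, sale_id, room):
    # Orchestration: the workflow calls each microservice in turn,
    # compensating the first partial transaction if a later one fails.
    sales.create_sale(sale_id)
    try:
        hotel.reserve(room)
    except RoomUnavailable:
        sales.cancel_sale(sale_id)
        raise

sales, hotel = SalesService(), HotelService()
reserve_room_for_sale(sales, hotel, sale_id=1, room=101)
print(sales.sales[1])  # "open": both partial transactions succeeded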

 

Or else, if you choose to deploy each microservice with all the data it needs, you have to duplicate some data between data stores.

This leads to the complexity of additional processes to align data stores after a transaction – asynchronously - as best they can.

 

Asides

A naive microservice will tend to cache more data than it needs for some or most of the transactions it processes.

The code on the app server tier might be structured into model, view and controller objects in an MVC pattern.

The many possible ways to map transactions and microservices to the elements of an MVC pattern are not explored here.

Issues raised by microservices

Decoupling is important of course; but it means many things, and can be overdone.

There are a dozen or more ways to decouple application components (discussed in our training).

Physical decoupling (using network and/or messaging technologies) is not logical decoupling.

 

There are also many ways to integrate application components (discussed in our training).

The famous Bezos mandate (footnote 2) directed all Amazon software teams to expose APIs over an IP network.

However, the granularity of components makes a difference to how they are best deployed and integrated.

 

Component granularity?

How big and complex is the system behind an API?

The Bezos mandate refers to interfaces between teams – it does not indicate the size of a team or what it maintains.

Each team might well maintain a large monolithic system.

And divide it into “microservices” – unknown to other teams.

 

Network use?

The Bezos mandate insists inter-team communication is via APIs exposed across the network (surely an IP network).

Mandating the same for all inter-microservice communication may hinder performance and increase complexity.

What are the implications for network traffic, network costs and network monitoring?

One is forced to use defensive design techniques.

An architect told me agile development of distributed microservices in his enterprise had led to wasteful duplication of data and code.

 

Middleware use?

The Bezos mandate does not presume middleware use.

Mandating that all microservices communicate asynchronously can increase complexity, disintegrity and have unwanted side effects.

Must we use messaging between microservices? Even fine-grained components coded using the same technology on the same server?

Another architect told me they have a "microservices dependency nightmare" featuring c200 microservices on c15 app servers.

Middleware is hugely overused between microservices, logging a ridiculous number of events that are of no interest to the business.

Some or many microservices would be better closely-coupled or combined, and deployed on one or very few servers.

So, they are looking to strip the middleware out of their primary business application as far as possible.

 

Things to think about include:

·         Physical decoupling makes logical coupling more complex.

·         Naïve architecture guidance - can mandate decoupling, asynchronicity and scalability where they are not needed.

·         Response time – where one transaction requires microservices that communicate via network and/or messaging.

·         Availability – where synchronous access is needed to partitioned/distributed data stores.

·         Scope/complexity creep – where microservices are increasingly hooked together.

·         Business data and process integrity – where BASE replaces ACID.

·         Application maintenance – where multiple design patterns and technologies are used.

·         Best practice use of CQRS/Event Sourcing.

ACID, BASE, CAP and complexity

The next question is: Must every business transaction be completed before a server-side application replies to the client?

 

We are talking about the server-side of an enterprise application.
Remote clients send a service request to a point of entry on the server-side.
That service request is a transactional command or query; it is definable by a service contract.
The service contract for a business transaction defines I/O data with reference to one or more persistent data entities.

 

Let us presume the intention is to modularise an application into true (entity-oriented) microservices.

There are various ways to implement server-side business transactions, including the options below.

 

ACID transactions for consistency/integrity

Often, data integrity is important to business operations.

A client wants a business transaction to succeed or fail completely before the server-side application replies.

A customer does not want to reserve a hotel room (using the sales agent microservice) that later turns out not to be available (in the hotel administration microservice).

 

ACID means a transaction is atomic, consistent, isolated and durable.

Here, it means microservices are coordinated synchronously on the server-side.
So, clients’ service requests are not affected by how the server-side is modularised.

 

Aside: For scalability, microservices can be replicated on parallel server-side nodes.

Sticky sessions or state replication enable a load balancer to choose which node responds.

 

To keep things simple, it is best if all the required data can be stored in one place.

Then, the database management system can be used to roll back a transaction when a business rule is violated.

It can also save some programming effort by maintaining referential integrity.

And support analysis of data for management information purposes.

If data is distributed, it may be necessary to hand-code the two-phase commit units that a transaction manager could otherwise handle.
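
As a minimal sketch of DBMS-supported rollback (Python with SQLite; the Room table and availability rule are illustrative only):

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Room (id INTEGER PRIMARY KEY, reserved INTEGER NOT NULL)")
db.execute("INSERT INTO Room (id, reserved) VALUES (101, 0)")

def reserve_room(room_id):
    # One ACID transaction: the DBMS commits on success,
    # and rolls the whole transaction back if a rule is violated.
    try:
        with db:
            cur = db.execute(
                "UPDATE Room SET reserved = 1 WHERE id = ? AND reserved = 0",
                (room_id,))
            if cur.rowcount == 0:
                raise ValueError("room not available")  # triggers rollback
    except ValueError:
        return False
    return True

print(reserve_room(101))  # True: committed
print(reserve_room(101))  # False: business rule violated, rolled back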

 

BASE for very high throughput by partitioning

Naturally, it is easier and quicker to develop a module that has minimal interaction with others.

For agility, developers want their microservice to work on its own, and reply to a client immediately, with minimal dependency on other microservices.

For scalability, they want to isolate modules physically as well, deploying them on different server-side nodes.

 

BASE means an application is basically available, scalable and eventually consistent.

Here, it means that one microservice may reply to a client before that client’s desired business transaction is completed.

Other microservices are coordinated later, asynchronously, and compensating transactions are performed if need be.

The result is that clients become aware of, and are affected by, modularisation into microservices.

 

Basically available

The idea is that a microservice can serve its clients well enough without immediate reference to other microservices.

A microservice can complete some business transactions immediately, and set other business transactions in motion.

The presumption is that a client will be happy to see a business transaction started, with no promise it will be completed.

E.g. a customer reserves a hotel room using the sales microservice, without the promise that the room will be available.

 

Scalable

Today, an ordinary relational database can handle many thousands of transactions per second.

Very high transaction volumes can be handled by beefing up the data server and using solid state drives.

To handle extraordinary transaction volumes, the data store can be partitioned in one of two ways (contrasted in the sketch after this list).

·         Sharding: this means partitioning the data store into separate data populations (perhaps by geography or region?).

·         Functional scaling: this means partitioning the data structure by subject matter (e.g. customers and products).
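
A minimal sketch contrasting the two (hypothetical routing functions; the shard keys and store names are illustrative only):

def route_by_shard(record):
    # Sharding: the same data structure everywhere; rows are split
    # into separate data populations, here by region.
    shards = {"EU": "sales-db-eu", "US": "sales-db-us"}
    return shards[record["region"]]

def route_by_subject(entity):
    # Functional scaling: the data structure itself is partitioned
    # by subject matter, here customers versus products.
    stores = {"Customer": "customer-db", "Product": "product-db"}
    return stores[entity]

print(route_by_shard({"region": "EU"}))  # sales-db-eu
print(route_by_subject("Product"))       # product-db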

 

Microservices can enable functional scaling, but this is not a general motivation.

Few business applications have a transaction throughput like that of Google, Amazon, Facebook or Spotify.

So, few need an architecture designed for throughput at the levels those kinds of business handle.

 

Eventual consistency

Doing business using inconsistent data is a business issue before it is a technical issue.

Any business transaction that spans two or more partitions needs special attention.

The BASE strategy is to divide a business transaction into smaller transactions, each performed by a different microservice.

And to allow cases where a whole business transaction cannot be completed in one go.

You must analyse the possibility that data becomes inconsistent, the impact that has on business operations, and when and how to restore data integrity.

E.g. what to do if a customer reserves a hotel room (using the sales agent microservice) that turns out not to be available (in the hotel administration microservice)?

You have to analyse and design additional compensating transactions to correct or undo the unwanted effects of business transactions that start, but cannot be completed.
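
A minimal sketch of the BASE strategy (hypothetical sales and hotel administration microservices, in Python, with queues standing in for middleware): the sales microservice replies before the business transaction completes, and a compensating transaction restores integrity when the room turns out to be taken.

import queue

to_hotel = queue.Queue()   # stands in for middleware between the microservices
to_sales = queue.Queue()
rooms = {101: "taken"}     # hotel administration microservice's own data store
reservations = {}          # sales microservice's own data store

def sales_reserve(res_id, room):
    # Sales microservice: record the reservation and reply to the client
    # immediately, with no promise that the room is available.
    reservations[res_id] = "pending"
    to_hotel.put((res_id, room))
    return "reservation started"

def hotel_confirm():
    # Hotel administration microservice: check its own data, reply asynchronously.
    res_id, room = to_hotel.get()
    available = rooms.get(room) == "free"
    if available:
        rooms[room] = "taken"
    to_sales.put((res_id, available))

def sales_reconcile():
    # Sales microservice: complete the transaction, or compensate (cancel).
    res_id, available = to_sales.get()
    reservations[res_id] = "confirmed" if available else "cancelled"

print(sales_reserve(1, 101))  # client sees "reservation started"
hotel_confirm()               # later: room 101 is already taken
sales_reconcile()             # compensating transaction restores integrity
print(reservations[1])        # "cancelled"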

 

Complexity

The famous CAP theorem says you can’t have all three of Consistency (integrity), Availability and Partition (broken connection) tolerance.

It doesn’t mention a fourth thing you want from a design – Simplicity.

The disintegrity that results from dividing a cohesive data model tends to increase the complexity of a system.

BASE (allowing data to become inconsistent, then restoring integrity by compensating transactions) is more complex both for programmers and for a business.

Lessons from CBD history

Microservices can be seen as a renewal of the Component-Based Design fashion in the 1990s.

CBD partitioned a cohesive data model into chunks, each maintained by a “business component”.

Each business component maintained its own subset of the data model - a group of data entities.

 

How did solution architects then direct developers to maintain data integrity?

How to complete a business transaction that requires more than one business component?

Some of the options below couple components more obviously than others; but all options couple components in some way.

I recall discussing these options, and observed projects using the single data store options.

 

Storage in a single data store (facilitates queries, data analysis and reports):

·         1a Synchronous ACID (data integrity maintained): before replying, communicate using RPC or middleware and complete a two-phase commit unit.

·         1b Synchronous ACID: use the DBMS to maintain cross-component data integrity.

·         2 Asynchronous BASE (data disintegrity allowed temporarily): reply, then communicate using RPC or middleware and complete compensating transactions as need be.

Storage in multiple data stores (hinders queries, data analysis and reports):

·         3a Synchronous ACID: before replying, communicate using RPC or middleware and complete a two-phase commit unit.

·         3b Synchronous ACID: use a distributed/federated transaction manager.

·         4 Asynchronous BASE: reply, then communicate using middleware and complete compensating transactions as need be.

Particular conclusions and remarks

Publishing "decoupling" as a general principle can accidentally encourage excessive decoupling of software components across a network and/or via middleware.

EAs ought to be concerned about creating needless disintegrities and complexities, not to mention needless costs and performance issues.

The issues, principles and tradeoffs are largely vendor and technology neutral.

If EAs can't directly govern Technical and Software Architects, then they need to be socialised with Solution Architects who can do this on their behalf.

 

Assisting agile development is a plus for any modular design strategy.

A clean division into business components/microservices can suit the division of work between small teams.

However, remember Berrisford’s universal trade offs.

Agile developers’ dream          Enterprise architects’ nightmare

Smaller, simpler modules         Larger, more complex module dependency & integration

Module isolation/autonomy        Duplication between modules and/or difficulties integrating them

Data store isolation             Data disintegrity and difficulties analysing/reporting data

 

Be cautious about partitioning what could be one cohesive data structure.

Beware naïve principles, look at realistic requirements and assess trade offs.

·         Flexibility? Beware this usually requires a more complex design.

·         Scalability? Be realistic about the need, and assess the cost of allowing and restoring integrity.

·         Decoupling? Beware that decoupling related components physically does not decouple them logically.

·         Reuse? Beware that dividing an application does not usually create components readily reusable in other contexts.

·         Keep it simple? Beware BASE can be considerably more complex than ACID.

·         Agile development? Grouping transaction scripts into “pseudo microservices” is a rational way to divide work between individuals or teams.

General conclusions and remarks

In short, there is no silver bullet, you can’t have everything, there are always trade offs.

 

Having a programming background does help people on our architecture courses.

It gives them insights that are valuable at the architecture level.

Nevertheless, there is a problem out there; the truth is that:

·         Many programmers are ill-educated in the very wide variety of ways to code software – and trade offs between them.

·         Technology vendors encourage programmers to use technologies they don’t need.

·         Industry analysts repeat what technology vendors tell them.

·         Reading software guru books can encourage programmers to use patterns designed for problems they don’t have.

·         Programmers like to extend their CV with the latest buzz words.

 

The result is that programmers generate solutions that are more complex than is justifiable.

Complexity can appear in excess code, data abstraction layers, message passing, compensating transactions and middleware dependency.

 

Vendors, industry analysts and commentators encourage programmers to decouple application components.

Concerns about this include:

·         there are a dozen ways to be coupled or decoupled, so the meaning of the direction is unclear.

·         physical decoupling is not logical decoupling.

·         decoupling tends to increase complexity, slow down and disintegrate business processes.

·         the optimal degree of coupling varies with the granularity and cohesiveness of the specific components at hand.

·         following the direction has led to overuse of middleware and excessive consumption of server-side resources.

 

As Craig Larman wrote: the problem is not high coupling per se, it is high coupling between components that are unstable in some way.

Rather than directing decoupling throughout, architects should be balancing trade offs and providing more nuanced guidance.

 

How to save the enterprise from the cost, risks and issues of following software design fashions?

EA needs solution architects to shape and steer programming efforts in an EA-friendly direction.

They have a role to play in abstracting and repeating lessons learned from experience.

 

Don’t be over eager to use the latest design pattern/method or technology.

Beware that ungoverned agile development can lead to duplication, disintegrity and complexity.

Keeping things simple, minimising unnecessary code and use of middleware, is a reasonable principle.

This can sometimes mean constraining programmers’ enthusiasm for “new” ways of doing things.

 

As the first slide in our enterprise and solution architect training says:

Think about the business context.

Don't forget the numbers.

There are many ways to design and build something.

You have to balance trade offs:

·         between qualities (e.g. flexibility or simplicity)

·         between what is best for the local system and what is best for a global system.

You are responsible for:

·         knowing there are many possible answers

·         ensuring trade offs are addressed and

·         recommending one or more options.

 

Footnotes

Footnote 1: Design practices that some relate to true microservices

Read “Dividing an application into microservices” and “Domain Driven Design, Transaction Script and Smart UI” for practices related to microservices.

Domain-Driven Design, CQRS and Event Sourcing are sometimes related to each other.

But Transaction Scripts can equally well publish Events (for others to consume) and log Events (for subsequent query and replay).

And you don’t need Domain-Driven Design or CQRS to design microservices, separate database update transactions from queries, publish Events, or use Event Sourcing.

 

Design by Contract (DBC) means a server will fall over if a client (any client) does not guarantee its preconditions.

Defensive Design means a server tests (or likely retests) that its preconditions are met, and responds gracefully if not.

Distribution of microservices (or any software) between nodes tends to increase the need for Defensive Design.
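
A minimal sketch of the contrast (a hypothetical reserve_room operation, in Python):

rooms = {101: "free"}

def reserve_room_dbc(room_id):
    # Design by Contract: the client guarantees the precondition;
    # the server simply falls over if it is violated.
    assert room_id in rooms, "precondition violated: unknown room"
    rooms[room_id] = "reserved"

def reserve_room_defensive(room_id):
    # Defensive design: the server retests the precondition itself,
    # and responds gracefully if it does not hold.
    if room_id not in rooms:
        return "error: unknown room"
    rooms[room_id] = "reserved"
    return "ok"

print(reserve_room_defensive(999))  # "error: unknown room"
print(reserve_room_defensive(101))  # "ok"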

 

You might see microservices as a scaling down of the “Bezos mandate” (footnote 2) and/or Service-Oriented Architecture (footnote 3).

In other words, as extending the principle that components are distributed across a network and communicate via APIs.

However, microservices do not have to be distributed across a network.

Other modularisation strategies are no less service-oriented; and SOA does not say how to divide an application into modules.

Footnote 2: The Bezos mandate and microservices

“Hard cases make bad law” is a legal maxim.

It means that an extreme case is a poor basis for a general law that would cover a wider range of less extreme cases.

In other words, a general law is better drafted for average or common circumstances.

 

Few businesses are like Amazon, Google, Facebook or eBay.

Whatever they do to handle extreme business transaction volumes is not necessarily optimal for others.

But one thing done at Amazon seems widely accepted as good practice.

Jeff Bezos famously issued an executive order to Amazon’s software teams.

His Big Mandate went something along these lines:

 

·         All teams will henceforth expose their data and functionality through interfaces.

·         Teams must communicate with each other through these interfaces.

·         There will be no other form of inter-process communication: no direct linking, no direct reads of another team's data store, no shared-memory model, no back-doors whatsoever.

·         The only inter-team communication allowed is via interface calls over the network.

·         It doesn't matter what technology teams use: HTTP, CORBA, Pubsub, custom protocols. It really doesn't matter; Bezos doesn't care.

·         All interfaces, without exception, must be designed from the ground up to be externalizable, exposable to developers in the outside world. No exceptions.

·         Anyone who doesn't do this will be fired.

·         Thank you; have a nice day!

Does the Bezos mandate scale down to microservices?

The granularity of components makes a difference to how they are best deployed and integrated.

 

Component granularity?

How big and complex is the system behind an API?

The Bezos mandate refers to interfaces between teams – it does not indicate the size of a team or what it maintains.

Each team might well maintain a large monolithic system.

And divide it into “microservices” – unknown to other teams.

And then deploy several microservices on one server - without getting fired!

 

Network use?

The Bezos mandate insists inter-team communication is via APIs exposed across the network (surely an IP network).

Mandating the same for all inter-microservice communication may hinder performance and increase complexity.

What are the implications for network traffic, network costs and network monitoring?

One is forced to use defensive design techniques.

An architect told me agile development of distributed microservices in his enterprise had led to wasteful duplication of data and code.

 

Middleware use?

The Bezos mandate does not presume middleware use.

Mandating that all microservices communicate asynchronously can increase complexity, disintegrity and have unwanted side effects.

Must we use messaging between microservices? Even fine-grained components coded using the same technology on the same server?

Another architect told me they have a "microservices dependency nightmare" featuring c200 microservices on c15 app servers.

Middleware is hugely overused between microservices, logging a ridiculous number of events that are of no interest to the business.

Some or many microservices would be better closely-coupled or combined, and deployed on one or very few servers.

So, they are looking to strip the middleware out of their primary business application as far as possible.

 

A few lessons learned

For more on the Bezos mandate, see:

https://gigaom.com/2011/10/12/419-the-biggest-thing-amazon-got-right-the-platform/

https://plus.google.com/u/0/+RipRowan/posts/eVeouesvaVX?hl=en

 

“Amazon learned a tremendous amount while effecting this transformation.

Teams learnt not to trust each other in most of the same ways they’re not supposed to trust external developers.

 

Assigning an incident is harder, because a ticket might bounce through 20 service calls before the real owner is identified.

If each bounce goes through a team with a 15-minute response time, it can be hours before the right team finally finds out, unless you build a lot of scaffolding and metrics and reporting.

 

Every team suddenly becomes a potential Denial of Service attacker, so quotas and throttling must be put in place on every service.

 

Monitoring = automated QA, since to tell whether a service is responding to all invocations, you have to make individual calls.

The problem continues recursively until your monitoring is doing comprehensive semantics checking of your entire range of services and data.

 

With hundreds of services, you need a service-discovery mechanism, which implies also service registration mechanism, itself another service.

So Amazon has a universal service registry where you can find out reflectively (programmatically) about every service, what its APIs are, and also whether it is currently up, and where.

 

Debugging problems with someone else’s code gets a LOT harder, and is basically impossible unless there is a universal standard way to run every service in a debuggable sandbox.”

 

Footnote 3: A history of Service-Oriented Architecture (SOA)

 

1970s Local Procedure Calls (LPC)

When modular programming began, all the modules were deployed together on one computer.

There were subroutines within one deployable program, and modules shared between separately deployable programs.

Cobol programmers encapsulated local modules behind APIs.

Software design thinkers added advice on modularisation:

·         1972 Parnas wrote that a module should encapsulate a data structure and operations on it.

·         1975 Michael A Jackson showed how to divide a program into modules that handle different data structures.

·         1979 Keith Robinson proposed dividing enterprise application software into three client-server layers (external, conceptual, internal).

LPC was fine as long as the code remained on one mainframe computer.

But it was inadequate for the distributed computing that followed.

 

1980s Remote Procedure Calls (RPC)

Clients in a client-server application connect to server nodes by RPC.

RPC was fine for many, but it was deprecated in this Vinoski paper.

 

1990s Distributed Objects using Object Request Brokers (ORBs)

Meyer wrote that OOP classes should encapsulate data structures, and be coordinated by higher level processes.

The “OOP revolution” led to people distributing objects and to the replacement of RPC by ORBs.

ORBs assumed client and server objects were rather tightly coupled.

By the end of the decade, ORBs were deprecated by Microsoft in favour of their home-grown alternative.

 

1990s Component-Based Design (CBD)

Architects concluded:

·         objects are too small to be distributed.

·         inheritance is a limited and fragile reuse mechanism, unusable in distributed systems.

·         we need other ways to modularise enterprise applications that maintain large databases.

So, CBD modularised applications into larger components that resemble microservices today.

Middleware vendors promoted the idea that components should communicate via middleware.

 

2000s Service-Oriented Architecture (SOA)

Microsoft deprecated RPC and ORBs.

They applied the term SOA to encapsulation of remote services behind a WSDL-defined interface.

A client uses that interface to find how and where to invoke a service.

The client then invokes that service by sending an XML message over SOAP and HTTP.

 

Thomas Erl and other gurus promoted SOA as a technology-independent design paradigm.

Their SOA principles overlapped with those of 1970s modular programming.

1970s design principles    2000s SOA principles    Meaning: modules/components...

Cohesion                   Abstraction             encapsulate cohesive data structures and algorithms

Loose coupling             Loose coupling          are encapsulated by and communicate via interfaces

Composability              Composition             can be invoked and orchestrated by higher processes

Reusability                Reuse                   are designed for use by different clients

Maintainability            -                       are maintained in several versions to enable incremental change

-                          Statelessness           do not retain data in memory between invocations

-                          Autonomy                are developed or deployed or operate on their own

 

Many have equated SOA with loosely-coupled design, but what did that mean exactly?

Server components should be invoked asynchronously?

In practice, clients still commonly make synchronous request-reply invocations.

Microsoft, Thomas Erl and others tended to presume

·         clients should be decoupled from server component locations and technologies.

·         server components should be invokable over an IP network.

Beyond that, loose-coupling was interpreted to mean inserting middleware between clients/senders and servers/receivers.

 

2000s SOA and middleware

Middleware vendors made their tools comply to “web service standards” and promoted them as SOA tools.

So SOA was about using middleware to consume service requests, direct them to service providers and orchestrate them.

Each vendor defined SOA in terms of the features provided by their tool.

So for some of their customers, SOA became about using middleware to:

·         orchestrate services in workflows

·         choreograph services via message passing.

·         log messages, authenticate messages, etc..

 

2000 REST

Roy Fielding defined ways to take advantage of ubiquitous TCP/IP networks, by designing RESTful client components and RESTful server components.

 

By now, the term web service no longer implied a WSDL-defined interface.

Any resource accessible over the web - perhaps using a RESTful invocation – was called a web service.

 

2010 OData (Microsoft 2007, OASIS 2014)

REST was extended by the Open Data Protocol (OData).

OData is a standard that defines best practices for building and consuming RESTful APIs.

It helps clients access any remote data server wrapped up behind an OData interface.

 

SOA and microservices

Microservices are about how to divide an application into components.

Microservices might be seen as an application of OO design principles.

Some say microservices extend SOA principles, and microservices may be distributed and integrated over a network.

However, modularising an application into microservices does not require that they are distributed thus.

And alternative modularisation strategies could equally well be described as extending SOA principles.

 

Footnote 4: The ambiguity of “service”

Remember that Fowler’s “process” and “service” correspond to application components in TOGAF and ArchiMate.

In Fowler’s writings for programmers:

·         Component: A unit of software that is independently replaceable and upgradeable.

·         Process: An instance of a component executed on a computer that contains its executable code and current state data. (Some OSs allow it to contain multiple concurrent threads of execution.)

·         Service:  An out-of-process component, typically reached by a web service request or RPC.

 

In the ArchiMate and TOGAF standards from The Open Group:

·         Component: A structure (or building block) that offers multiple services (or service portfolio) to its clients.

·         Function: A logical component that groups behaviors by some cohesion criterion other than in sequence (see process).

·         Process: A behavior composed of process steps in a sequence that leads to a result.

·         Service: A discretely requestable behaviour that encapsulates one or more processes.

 

The Open Group defines a service as below.

 “SOA is an architectural style that supports service-orientation”

You cannot define a thing in terms of the thing.

“Service-orientation is a way of thinking in terms of services and service-based development and the outcomes of services.”
A “way of thinking” suggests there is no specific meaning; however, the term “outcome” is promising.

“A service is a logical representation of a repeatable business activity that has a specified outcome”.
Aha! This is much clearer.

·         “A logical representation” = a service contract defining inputs and preconditions, outputs and postconditions.

·         “A repeatable business activity” = a discrete behavior as in system theory.

·         “Has a specified outcome" = a discretely requestable service.

 

“It is self-contained. It is a black box for its consumers. It may consist of other underlying services.”

OK, as in modular programming forever.

The Open Group offers these examples:
“check customer credit, provide weather data, consolidate drilling reports”
OK, these are discretely requestable services; not components/microservices that offer multiple services.

 

In TOGAF and ArchiMate standards, the “services” are discretely requestable behaviors.

TOGAF’s application component offers “information system services” to its clients.

ArchiMate’s application component offers “application services” to its clients.

Each service is a response to a query or command, definable declaratively in a service contract.

 

So beware, “microservices” are called “components” in TOGAF and ArchiMate.

Read the slide show here for a deeper explanation.

Footnote 5: LinkedIn comments and discussion

 

Marc Bastien wrote

At last, a post on architecture with some real content.

Thanks Graham for elevating a bit the level of discussion about IT architecture above vendor sales pitches.

And for those that have a hard time understanding your post, that are lost in the terms, and find this subject too "esoteric", here's a friendly suggestion:

Don't rely too much on programmers that are good at doing the "real" thing.

Because most of them just don't have a clue about the impacts of what they are doing besides and above the code they deploy.

Not because they're not bright enough, but simply because they're busy doing their things.

 

Rob Dean wrote:

Watching a presentation on Microservice Architecture I was astounded by the lack of perception [of the points in this paper].

You cannot provide every quality attribute in any architecture.

Trade offs are the nature of engineering, regardless of the discipline.

Each service will contain some redundant code to enable its isolation from the other services, to allow for horizontal scaling.

While removing one type of dependency you introduce others, such as state.

The only way to introduce reuse in these services is composing them from a configuration definition.

Also, while attempting to scale small units of functionality you introduce dependency scaling.

If you decouple Orders from Inventory as in the example, you will always need enough inventory services to support the orders.

The dependency remains intact even though the monolith is gone.

 

Edin Nuhic wrote:

Graham, your writing on the subject is highly relevant.

TOGAF can often confuse people who misunderstand it.

However, you seem to be the one who analyzes things deeply and then tries to summarize what it all comes down to.

Your analysis of loose coupling is poetry that everybody getting into system integration should read several times until it is really understood.

As you point out - there are two definitions of microservices:

·         the one from Martin Fowler, which addresses the organisational and deployment aspects first and foremost

·         the other, which addresses the performance aspect.

 

The second advocates asynchronicity as a means of preventing idle slots waiting for a synchronous service to return.

[This brings something new to the table] and deserves a separate discussion.

It is strange that very serious people can mean so different things and call them the same name.

 

FAQs

 

-1- What makes a service a micro?

It is micro compared with the macro application it is a component of.

 

-2- What is micro about “pseudo-microservices”?

They are similar in size to true microservices.

Both process a subset of service/transaction requests from the client layer.

But pseudo microservices don’t wholly encapsulate the data they access.

 

-3- How does a microservice differ from a usual SOA style service?

What is a usual SOA style service?

 

-4- Is the transaction aspect relevant?

It is relevant, for the reasons mentioned in the paper.

 

-5- How important is independence of deployment - the devops aspect?

That suggests a distinct application, not a subdivision of one.

Unless it means deliberately dismembering a business process.

 

-6- How important are performance gains through asynchronicity and granularity?

Speed – could make things worse.

Throughput – important in some (probably rare) cases.

Availability – it depends what the business wants to be available.

 

All free-to-read materials at http://avancier.website are paid for out of income from Avancier’s training courses and methods licences.

If you find the web site helpful, please spread the word and link to avancier.website in whichever social media you use.