Dividing an application into microservices

One of about 300 papers at http://avancier.website. Copyright 2014-17 Graham Berrisford. Last updated 14/03/2017 20:15

 

This paper is a supplement to Introducing Microservices, a paper you might want to read first.

An EA vision of the 1980s – one enterprise-wide application with wholly centralised data management – is no longer achievable.

The much-touted ‘single version of the truth’ can usually be achieved only within a segment of the enterprise.

But microservices are smaller than applications, let alone enterprise segments.

So what is the scope of a microservice? And how should one application be divided into microservices?

This paper discusses two of Martin Fowler’s nine microservices principles:

·         Organisation around Business Capabilities

·         Decentralized data management

Contents

“Organisation around Business Capabilities”

“Decentralized data management”

Logical and physical decoupling

Other design concerns

On the impossibility of defining an enterprise-wide business data model

 

“Organisation around Business Capabilities”

 

Business capabilities as bounded contexts or domains

Fowler refers to Domain-Driven Design.

"DDD divides up a large system into Bounded Contexts, each of which can have a unified model - essentially a way of structuring MultipleCanonicalModels." Fowler

"Various factors draw boundaries between contexts.

Usually the dominant one is human culture, since models act as Ubiquitous Language, you need a different model when the language changes." Fowler

If your Bounded Contexts are so disparate that they serve different populations using different terms and concepts, they will surely be different applications anyway.

 

Given disparate data structures, you will naturally design distinct applications to handle them.

Given a data structure in which all entities are related directly or indirectly, then however you divide it, you run into difficulties.

The smaller the partitions, the larger the difficulties, so modularising around normalised entities seems a non-starter.

Even dividing the structure into aggregate entities feels like an attempt to force-fit OO design onto transaction processing.

 

“You also find multiple contexts within the same domain context, such as the separation between in-memory and relational database models in a single application. This boundary is set by the different way we represent models.” Fowler

I don’t believe the model representation is important.

I believe the difference is between the structure of code that needs only a transient cache and the structure of persistent data.

The application layer code structure can be restructured while persistent data is sitting safely in the database.

Programmers can devise complex inheritance structures and revise them as need be.

The database structure can’t be restructured without a data migration exercise, which makes inheritance relationships a hostage to fortune.

 

“The canonical source for DDD is Eric Evans's book.

It isn't the easiest read in the software literature, but it's one of those books that amply repays a substantial investment.” Fowler

Perhaps, but to be brutal, a solution architect may see minimising the investment needed to maintain an application as a goal.

And that might mean avoiding DDD until it is demonstrably of benefit.

“Microsoft recommends that it be applied only to complex domains,” said Wikipedia in January 2017.

 

Here is a paper on DDD v Transaction Script:

http://grahamberrisford.com/AM%202%20Methods%20support/06DesignPatternPairs/Domain%20Driven%20Design%20v.%20Transaction%20script.htm

 

Business capabilities as cohesive data processing modules

Back in the 1970s, the springboard for enterprise architecture was not thinking about technologies; it was analysing the business.

Duane Walker (working for IBM) established an enterprise analysis method called Business Systems Planning.

Business Systems Planning featured analyses of Processes, Data and Applications.

At the heart of the analysis was the Data entity/Process matrix.

Data entity / Process | Register customer | Place order | Complete sale | Launch product | Recall product
Customer              | Create            | Read        |               |                | Read
Sale                  |                   | Create      | Update        |                | Read
Sale item             |                   | Create      |               |                | Read
Product type          |                   | Update      |               | Create         | Update

 

This table suggests two ways to modularise a business into discrete capabilities – by data or by process.

A third way is to cluster processes that act on the same data.

You can find these processes by applying cluster analysis (the north-west corner method) to values in the cells of the table above.
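To illustrate, here is a rough sketch in Python of clustering processes around the data entities they create, using the matrix above as input. It is a simplified first cut, not the formal north-west corner algorithm, and the code names are illustrative.

# The Data entity/Process matrix above, as a dict of CRUD values.
crud = {
    "Register customer": {"Customer": "C"},
    "Place order":       {"Customer": "R", "Sale": "C", "Sale item": "C", "Product type": "U"},
    "Complete sale":     {"Sale": "U"},
    "Launch product":    {"Product type": "C"},
    "Recall product":    {"Customer": "R", "Sale": "R", "Sale item": "R", "Product type": "U"},
}

# First cut: each entity belongs with the process that creates it.
owner = {e: p for p, row in crud.items() for e, op in row.items() if op == "C"}

# Group entities under their creating process to form candidate capabilities.
clusters = {}
for entity, process in owner.items():
    clusters.setdefault(process, []).append(entity)

for process, entities in clusters.items():
    print(process, "->", entities)
# Register customer -> ['Customer']
# Place order -> ['Sale', 'Sale item']
# Launch product -> ['Product type']

The Read and Update cells left over (e.g. Complete sale updates Sale) measure the coupling between the candidate clusters; a fuller analysis would merge or re-order clusters to minimise that coupling.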

“Decentralized data management”

Given one application is to be divided, decentralized data management likely means:

·         dividing the persistent data structure to match the division of the application layer into microservices.

·         partitioning the processing of what is or could be one cohesive data structure.

 

A logical data model (LDM) defines terms and concepts in one "business context".

It defines terms and concepts used by software architects as well as business domain experts.

It must be internally consistent, so that there are no contradictions within it.

It also acts as the conceptual foundation for the physical design of application databases and application software.

 

The conversion of an LDM into database structures is a solution-level database designer task.

Similarly, its conversion into a domain model for OO programming is a solution-level OO designer task.

(Some OO nuts don’t like to admit that their domain model is a physical, implementation-specific model, just as a database schema is.)
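For example, one LDM entity and relationship might be rendered in an OO domain model roughly as below; the entity and attribute names are illustrative, not taken from any particular LDM.

from dataclasses import dataclass, field

@dataclass
class Sale:
    sale_id: int
    amount: float

@dataclass
class Customer:
    customer_id: int
    name: str
    sales: list[Sale] = field(default_factory=list)  # a one-to-many relationship

# A database designer would render the same two entities as CUSTOMER and SALE
# tables, implementing the relationship as a foreign key on SALE: a different
# physical model of the same logical data structure.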

 

The EA team can and should identify where LDMs contain duplicated or related data that needs to be made consistent.

Martin Fowler shows a simple example which serves to illustrate the point here.

The two LDMs below each contain some terms and concepts unique to their own context, and some terms and concepts shared with the other.

http://martinfowler.com/bliki/images/boundedContext/sketch.png

Fowler’s example is questionable:

Customer and Product may have different attributes and relationships in the Sales and Support contexts, yet still mean the same thing.

Some may be relaxed about decoupling and duplication of Customer and Product data.

Others may propose that decoupling of Sales from Support is a business issue that needs to be addressed.

Logical and physical decoupling

 

When physical decoupling matches logical decoupling

Ideally, one business capability requires use cases that can be mapped to one and only one microservice.

Business                   | Capability 1 | Capability 2 | Capability 3 | Capability 4 | etc
Client-side user interface | 3 use cases  | 5 use cases  | 2 use cases  | 3 use cases  | etc
Server-side application    | N classes    | N classes    | N classes    | N classes    | etc
Data store                 | 8 entities   | 9 entities   | 3 entities   | 5 entities   | etc

 

Suppose 1 macro app offers 30 use cases and maintains 80 database tables.

You might divide that 1 macro app into:

·         10 microservices that each offer 3 use cases, OR

·         10 microservices that each maintain the data in 8 database tables.

 

Now suppose, for the sake of argument, the two divisions above coincide.

So you have 10 microservices, which each offer 3 use cases and maintain 8 database tables.

 

Apps can be completely decoupled if the requirements for those apps are also decoupled.

Suppose the 3 use cases of 1 microservice above need access only to the 8 database tables maintained by that same app.

Then the microservices are natural silos, each with its own user interface and database.

The microservices are decoupled logically, which means there is no obvious reason not to decouple them physically.

Each can be readily deployed and maintained as a distinct silo system, maintaining its own business data.

 

When physical decoupling conflicts with logical coupling

Suppose the use cases of 1 microservice need access to database tables maintained by other microservices.

Or to put it another way, there are logical transactions that need access to data in several discrete data stores.

The microservices might be physically decoupled, but the fact is, they are logically coupled.

 

Given a macro app, one transaction can act on a database that is wide-ranging enough to hold all the required data.

E.g. a hotel booking transaction might:

·         check the hotel has a room in the “available” state for the required span of days

·         update that room to the “booked” state for the required span of days

·         take payment.

 

After division into microservices, that same transaction might turn into a logical workflow that coordinates two physical transactions:

1.      take payment (in the hope a room will be available)

2.      check the hotel has a room in the “available” state and update that room to the “booked” state for the required span of days.

 

Immediate consistency means that if the second transaction fails, the first transaction will be immediately rolled back, as though it never happened.

Eventual consistency means that if the second transaction fails, then the money must be refunded later, or the customer must be compensated in some other way.

Whether the requirement is for immediate consistency or eventual consistency is a business decision (not an engineering decision).
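As a minimal sketch of the eventual-consistency case, the workflow above might be coordinated as follows. The service calls (take_payment, book_room, refund_payment) are hypothetical stand-ins for remote calls to payment and booking microservices, and the failure is simulated.

class NoRoomAvailable(Exception):
    pass

def take_payment(customer, amount):
    return "payment-123"           # physical transaction 1, in the payment service

def book_room(hotel, dates):
    raise NoRoomAvailable()        # physical transaction 2 fails, in the booking service

def refund_payment(payment_id):
    print("refunded", payment_id)  # compensating transaction

def book_hotel(customer, hotel, dates, amount):
    payment_id = take_payment(customer, amount)
    try:
        book_room(hotel, dates)
    except NoRoomAvailable:
        refund_payment(payment_id)  # undo transaction 1, restoring consistency
        raise

Between the two physical transactions the system is in an intermediate state (money taken, no room booked); the compensating transaction is what makes the data eventually consistent.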

Other design concerns

Modularisation and decoupling tend to create extra work to integrate modules.

 

Extra work to integrate modules

The wider the scope of data in one data store, the easier it is to maintain data consistency - immediately.

A wide-ranging logical transaction can be implemented as one physical transaction on that data store.

 

The more widely data is distributed between data stores, the more work has to be done to achieve “eventual consistency”.

The same wide-ranging logical transaction requires several physical transactions to be performed – in different places at different times.

 

“These days, everything seems to be going mobile but it's not as simple as just providing a smartphone app. 

Modern users' workflows pass through so many systems.

Without integrated, automated business processes the user needs to switch between multiple apps or browser windows. 

Worse still, they may even be unable to accomplish their processes on mobile.” Magic Software

 

There is still the need to implement the logical transaction by coordinating the several physical transactions in discrete microservices.

See Introduction to Microservices for more options.

 

The mistake is to assume that coordination will happen by magic, or organically.

It won’t. The architect still has to understand the logical transaction or workflow at a conceptual level.

There is not only the need to coordinate the local transactions; there is also the need to consider the non-functional requirements.

And to design compensating transactions where required.

 

Extra work to meet non-functional requirements

Tight coupling can prove better for data integrity, speed, availability, security and other non-functional requirements.

Martin Fowler points to the need to:

 

Batch multiple invocations into one

“Remote calls are more expensive than in-process calls, and thus remote APIs need to be coarser-grained, which is often more awkward to use.”
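A sketch of what coarsening the grain means in practice, assuming a hypothetical remote order service client (the class and method names are invented for illustration):

class OrderServiceClient:
    """Stand-in for a client that makes remote calls to an order microservice."""
    def add_item(self, order_id, item):
        pass   # one network round trip per call
    def add_items(self, order_id, items):
        pass   # one network round trip for the whole batch

client = OrderServiceClient()
items = ["room", "breakfast", "parking"]

# Fine-grained API: three remote round trips.
for item in items:
    client.add_item("order-1", item)

# Coarser-grained API: the same work batched into one remote round trip.
client.add_items("order-1", items)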

 

Monitoring of distributed microservices

“teams would expect to see sophisticated monitoring and logging setups for each individual [microservice], such as dashboards showing up/down status and a variety of operational and business relevant metrics.”

 

On the impossibility of defining an enterprise-wide business data model

Every enterprise's applications architecture depends on the data architecture of that enterprise.

Some EA frameworks suggest building and maintaining a single enterprise-wide Business Data Model (BDM).

However, it is usually not possible to draw an enterprise-wide model of entities, attributes and relationships.

 

In discussion of the “Bounded Context” concept (15 January 2014), Martin Fowler said the following.

In [my] younger days we were advised to build a unified model of the entire business.

But we've learned that "total unification of the domain model for a large system will not be feasible or cost-effective".

 

Also:

As you try to model a larger domain, it gets progressively harder to build a single unified model.

Different groups of people will use subtly different vocabularies in different parts of a large organization.

The precision of modeling rapidly runs into this, often leading to a lot of confusion.

Typically this confusion focuses on the central concepts of the domain… with polysemes like "Customer" and "Product".

 

Another example of what Fowler calls a “polyseme” is the term “Policy”, which is used in an insurance company with several meanings.

 

Despite the impossibility of drawing one BDM, enterprise architects (EAs) do have enterprise-wide concerns.

They are responsible for identifying where data in discrete data stores is or should be duplicated or related.

And then, for identifying where, when and how duplicated and related data should be made consistent.

(If the EAs are not responsible for that, then in what sense can they call themselves EAs?)

 

In 1995, speaking at a BCS Data Management Specialist Group meeting, I recommended that we:

·         focus the BDM only on data that is duplicated or related in discrete data stores

·         document this common data BDM in a simple data dictionary (rather than trying to draw it as a data model).
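A common-data dictionary of that kind might look roughly as below, here sketched as a Python structure; the entries are invented for illustration.

data_dictionary = {
    "Customer": {
        "definition":  "A person or organisation that buys our products.",
        "held in":     ["Sales data store", "Support data store"],
        "consistency": "Support copy refreshed nightly from the Sales master.",
    },
    "Product": {
        "definition":  "A type of goods or service offered for sale.",
        "held in":     ["Sales data store", "Support data store"],
        "consistency": "Replicated on change via an event feed.",
    },
}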

 

People thought my recommendation was controversial!

 

All free-to-read materials at http://avancier.website are paid for out of income from Avancier’s training courses and methods licences.

If you find the web site helpful, please spread the word and link to avancier.website in whichever social media you use.