Integrity challenges
This paper is published under the terms and conditions in the footnote.
This paper outlines a number of contrasting principles related to maintaining the integrity of a system’s state.
I've tried to convey what might be called "OO" and/or "Agile" principles with my usual scepticism, and to relate them to the tension between enterprise architecture and agile system development.
Contents
Global integrity v. local agility
Continuous consistency v. eventual consistency
There is a tension between broad-scoped enterprise architecture and narrow-scoped agile system development.
This table contrasts the two in terms of choices faced by people throughout history (after Mark Madsen).
EA tendencies | Agilist tendencies
Top-down | Bottom-up
Authority | Anarchy
Bureaucracy | Autonomy
Control | Creativity
Hierarchy | Network
Consistency | Flexibility
Mark Madsen says of such choices: “in every choice, something is lost and something is gained.”
Usually, a compromise has to be made.
Enterprise architecture looks to standardise, integrate and reuse systems, with a view to minimising:
· issues arising from hand-offs between people and systems
· cases where systems get out of step to the point where dangerous or costly mistakes are made
· elaborate and expensive compensatory processes to undo the effects of inappropriate actions.
Agile system development is fine in many ways.
However, local agile development tends to result in silo systems that are not standardised, not integrated, and do not share common services and resources.
The enterprise architect’s desire for global integrity (data quality, system integration and reuse) guides and constrains local system development.
This overarching governance inhibits agility to some extent.
All kinds of data replication create the possibility that data stored or copied in different places becomes inconsistent.
When and how to detect inconsistency, and when and how to restore consistency, are among the essential challenges of enterprise architecture.
Continuous consistency
One ambition of enterprise architecture is to eliminate issues arising from inconsistent data sources.
Ideally, all related data is updated together, and inconsistencies are not allowed.
In practice, this requires all related data to be closely located, ideally within one database.
Updates to the stored data are then ACID transactions, or atomic state changes.
A transaction moves the stored data from one consistent state to the next consistent state (or else fails).
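This can be sketched with SQLite: a transfer between two accounts either commits as a whole or rolls back, so the total never changes. (The accounts table and transfer rule here are illustrative, not from the paper.)

```python
import sqlite3

# The CHECK constraint expresses a consistency rule the database enforces.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 100)")
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
    except sqlite3.IntegrityError:
        pass  # the failed transfer leaves both balances untouched

transfer(conn, "a", "b", 30)    # succeeds: moves to the next consistent state
transfer(conn, "a", "b", 500)   # violates the CHECK constraint: rolled back
print(dict(conn.execute("SELECT name, balance FROM account")))
# → {'a': 70, 'b': 130}  — the total is still 200
```

Because the second transfer fails atomically, no reader ever sees a state in which money has left one account but not arrived in the other.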
Eventual consistency
This means that inconsistencies are allowed temporarily, and consistency is restored later.
In practice, temporary inconsistency is forced on a business by the enterprise data architecture, if not by human nature.
Architects then have to understand how that inconsistency will affect how people behave in business processes.
The way humans behave is naturally asynchronous.
A person never stops (frozen, unable to think) waiting for a reply from another person.
People work in different places, based on the information they have locally, and send messages to each other.
Even if they do want a reply to a message, they think about or do something else while they are waiting.
If there are discrepancies between items of information held in different places, then things may have to be patched up later.
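One common way to patch things up later is a reconciliation pass. A minimal sketch, assuming each item of information carries a timestamp and that "last writer wins" is an acceptable merge rule (the function and site names are illustrative):

```python
def reconcile(local, remote):
    """Merge two replicas of {key: (value, timestamp)}: the later write wins."""
    merged = dict(local)
    for key, (value, ts) in remote.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

# Two offices have worked independently on local copies of the same data.
office_a = {"price": (10, 1), "stock": (5, 3)}
office_b = {"price": (12, 2), "stock": (4, 1)}

print(reconcile(office_a, office_b))
# → {'price': (12, 2), 'stock': (5, 3)}  — the later write wins for each item
```

The merge is symmetric, so both offices converge on the same state whichever order they reconcile in; choosing the right merge rule for the business (last writer wins is not always acceptable) is exactly the extra work eventual consistency creates.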
However, inconsistency makes work for developers.
Google on eventual consistency
“Designing applications to cope with concurrency anomalies in their data is very error prone, time-consuming, and ultimately not worth the performance gains.”
“Developers spend a significant fraction of their time building extremely complex and error-prone mechanisms to cope with eventual consistency and handle data that may be out of date.
We think this is an unacceptable burden to place on developers and that consistency problems should be solved at the database level.”
“F1: A Distributed SQL Database That Scales”, Proceedings of the VLDB Endowment, Vol. 6, No. 11, 2013
The CAP triangle refers to:
· Consistency = all nodes see the same data at the same time, so data consumers will always get the latest information.
· Availability = one or more node failures will not stop surviving nodes from working, so your system will always be available.
· Partition tolerance = inter-node communication failures will not stop any node from working.
The CAP theorem can be summarised (this is edited from Wikipedia) thus:
· The proof of the CAP theorem by Gilbert and Lynch is limited
· The theorem sets up a scenario in which
o two conflicting requests arrive at
o components replicated in distinct locations,
o at a time when a link between them has failed.
· The obligation to provide Availability despite Partitioning failures means both components must respond.
· At least one response will necessarily be inconsistent with what implementing a true one-copy replication semantic would have done.
· The researchers then go on to show that other forms of Consistency are achievable, including a property they call Eventual Consistency.
The CAP theorem doesn't rule out achieving consistency in a distributed system, and says nothing about cloud computing or scalability.
However, people use the CAP triangle to explain other things, with implications for how to build systems in practice.
These two patterns may be contrasted with reference to the CAP triangle (if not the CAP theorem).
Remember there is a difference between replicating data between:
· different tiers of one client-server application
· differently structured update (OLTP) and reporting (OLAP) data stores
· data copies in the data storage tier of one client-server application
· different databases at the bottom of different client-server applications.
ACID: Atomic, Consistent, Isolated, Durable
If you want data to be consistent, conformant to rules, correct and up to date, then you may use the ACID pattern:
· RDBMS (ideally a normalised data structure)
· Synchronous write to central disk
· Referential integrity checking
· ACID transactions - rolled back on error.
In a client-server system, the database may be:
· Consistent
· Available enough for business needs
· At the expense of Partition-tolerance.
If a failure prevents access to the database, then server-side components cannot guarantee a consistent response, and may reject the query or command.
BASE: Basic Availability, Soft-state, Eventual consistency
If you are happy with stale data, approximate answers and eventual consistency, then you may use the BASE pattern:
· Divide data between nodes of the network.
· Maximise read availability through replication of the data
· Minimise update response time via asynchronous replication
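The three bullets above can be sketched as a master node that acknowledges writes immediately and replicates them to other nodes asynchronously, so reads from a replica may be stale until replication catches up. (The class and node names are illustrative.)

```python
import collections

class Master:
    """A master node that replicates writes asynchronously (BASE sketch)."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.log = collections.deque()   # pending replication messages

    def write(self, key, value):
        self.data[key] = value           # acknowledge immediately (fast update)
        self.log.append((key, value))    # replicate later, asynchronously

    def replicate(self):
        # In a real system this runs in the background; here we call it by hand.
        while self.log:
            key, value = self.log.popleft()
            for replica in self.replicas:
                replica[key] = value

replica = {}                  # a read-only copy served to data consumers
master = Master([replica])

master.write("stock", 41)
print(replica.get("stock"))   # → None — the replica is still stale
master.replicate()            # replication catches up
print(replica.get("stock"))   # → 41 — eventually consistent
```

The write returns before the replica is updated, which is exactly the trade: fast, available updates in exchange for a window of inconsistency.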
In a client-server system, the higher-level components are usually:
· Available - components respond immediately and continually
· Partition-tolerant - components respond even if a failure prevents access to lower-level resources
· At the expense of perfect Consistency.
Though consistency in the underlying database may still be required.
CAP in a conventional client-server system
Remote client nodes send Queries and Commands to server nodes.
Since the network will fail at some point, P is needed.
P between client and server nodes requires clients to call asynchronously.
A of server nodes is achieved by scaling out the upper-most nodes.
But A of the upper-most server nodes doesn’t guarantee the C of the data returned.
If A matters more, use cached data.
The server node returns its version of the required data immediately, a possibly stale copy of the master data.
Caching data increases the chance of inconsistency.
If C matters more, use master data.
The server node consults the database or other “master” node to get the latest data before returning a response to a client.
Checking data for consistency (given the risk of a node or network outage) decreases the level of availability.
How to optimise the balance between Consistency and Availability?
· Invalidate caches based on time, and/or
· update caches from Events published by the master node.
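Both options can be combined in one small cache: entries expire after a time-to-live, and the master can also push an update event that refreshes an entry immediately. (The class and method names here are illustrative, not a standard API.)

```python
class Cache:
    """A cache refreshed by time-based expiry and by events from the master."""

    def __init__(self, ttl, master):
        self.ttl = ttl
        self.master = master            # authoritative data, e.g. the database
        self.entries = {}               # key -> (value, expiry time)

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is None or now >= entry[1]:          # missing or expired
            value = self.master[key]                  # fall back to master data
            self.entries[key] = (value, now + self.ttl)
            return value
        return entry[0]                               # fast, possibly stale

    def on_update_event(self, key, value, now):
        self.entries[key] = (value, now + self.ttl)   # event-driven refresh

master = {"price": 10}
cache = Cache(ttl=60, master=master)
assert cache.get("price", now=0) == 10
master["price"] = 12                        # master changes; cache is now stale
assert cache.get("price", now=30) == 10     # within TTL: stale read (A over C)
cache.on_update_event("price", 12, now=30)  # master publishes an update event
assert cache.get("price", now=31) == 12     # consistent again, without waiting
```

A short TTL pulls the balance towards Consistency (more trips to the master); a long TTL plus update events pulls it towards Availability while shrinking the window of staleness.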
ACID v BASE
Eric Brewer interpreted CAP as precluding consistency for components in the highly scalable first tier of a modern cloud computing system.
So, CAP means we sacrifice consistency to gain faster responses in a more scalable manner.
It’s harder to design in the fault-tolerant BASE world (compared to the ACID world).
But Brewer says you have no choice if you want very high throughput and concurrency.
Remember global integrity v. local agility
Bear in mind the wider EA problem: replication of data in different databases at the bottom of different client-server applications.
Global integrity may favour data consolidation and continuous Consistency, but this is usually impractical.
Local agility tends to result in data replication, and leaves Eventual Consistency as the most practical option.
Footnote: Creative Commons Attribution-No Derivative Works Licence 2.0 (20/04/2015)
Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited: http://avancier.website” before the start and include this footnote at the end.
No Derivative Works: You may copy, distribute and display only complete and verbatim copies of this page, not derivative works based upon it.
For more information about the licence, see http://creativecommons.org