CQRS and Event Sourcing

This paper is published under the terms and conditions in the footnote.

 

This paper outlines a number of contrasting principles and patterns.

I've tried to convey what might be called "OO" and/or "Agile" principles with my usual scepticism, and to relate them to the tension between enterprise architecture and agile system development.

Contents

Command-Query responsibility segregation

Segregating data stores as well as processing components

Coordinating components

Smaller simpler components – more complex coordination

Event sourcing v. database transaction processing

Event sourcing

Database transaction processing

Notes on integrating the patterns above

CQRS and update/reporting data store separation

Domain-Driven Design and CQRS

Domain-Driven Design, CQRS and Event Sourcing

Appendix 1: some (reader supplied) code

Appendix 2: disputable (reader supplied) observations on Command v Event

 

Command-Query responsibility segregation

Greg Young and Udi Dahan (specialists in Event-Driven Architectures and Event-Sourcing) took Meyer’s Command Query Separation pattern to a higher application interface level.

They proposed application component interfaces should contain only update operations, or only Query operations.

The idea is that splitting writes from reads should make systems faster, more stable, more testable and more maintainable.

So there are:

·         Application components that write state data, change something without returning data (bar failure messages).

·         Application components that read data, get something without data updates or other side-effects.
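The split can be sketched in code as two kinds of interface. This is only an illustration under assumptions: the interface names, the RegisterCustomerCommand and GetCustomerNameQuery messages, and the shared in-memory dictionary are all hypothetical, not taken from any particular framework.

```csharp
using System;
using System.Collections.Generic;

// Write side: changes state and returns nothing (failures surface as exceptions).
public interface ICommandHandler<TCommand> {
    void Handle(TCommand command);
}

// Read side: returns data and has no side effects.
public interface IQueryHandler<TQuery, TResult> {
    TResult Handle(TQuery query);
}

public class RegisterCustomerCommand {
    public readonly Guid CustomerId;
    public readonly string Name;

    public RegisterCustomerCommand(Guid customerId, string name) {
        CustomerId = customerId;
        Name = name;
    }
}

public class GetCustomerNameQuery {
    public readonly Guid CustomerId;

    public GetCustomerNameQuery(Guid customerId) {
        CustomerId = customerId;
    }
}

public class RegisterCustomerHandler : ICommandHandler<RegisterCustomerCommand> {
    private readonly IDictionary<Guid, string> _store;

    public RegisterCustomerHandler(IDictionary<Guid, string> store) {
        _store = store;
    }

    public void Handle(RegisterCustomerCommand command) {
        if (_store.ContainsKey(command.CustomerId))
            throw new InvalidOperationException("Customer already registered.");
        _store.Add(command.CustomerId, command.Name);
    }
}

public class GetCustomerNameHandler : IQueryHandler<GetCustomerNameQuery, string> {
    // The read-only interface makes it impossible for the query side to write.
    private readonly IReadOnlyDictionary<Guid, string> _store;

    public GetCustomerNameHandler(IReadOnlyDictionary<Guid, string> store) {
        _store = store;
    }

    public string Handle(GetCustomerNameQuery query) {
        return _store[query.CustomerId];
    }
}
```

Both handlers can be given the same Dictionary<Guid, string> instance, since that type implements both interfaces. The segregation here is in the component interfaces, not yet in the data store.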

 

The CQRS pattern features three kinds of communication: Queries, Commands and Events.

Typically, a Command or Event is handled by a single component, and starts and commits one physical database transaction.

 

A Query is sent, usually by a UI component, to retrieve a snapshot of data.

There is usually a response, a Data Transfer Object (DTO) that maps directly to a UI view.

 

[Diagram: the UI sends a QueryRequest message to a Data Component, which returns a QueryResponse message.]

 

A Command is how a UI component requests action on data, like Place Order and Register Customer.

 

[Diagram: the UI sends a Command message to an Update Component.]

 

The server-side component features:

Command handlers: check Commands and retrieve data for transactions.

Command operations: perform transaction logic, store data and post Events (that an order has been placed or a customer registered).

 

An Event is typically published after a transaction is complete to notify all listeners/subscribers of what has changed.

An Event is typically in the past tense: Order Placed, Customer Registered.
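A minimal sketch of an update component that completes its change and then publishes a past-tense Event follows. The EventPublisher class is a hypothetical in-process stand-in for a message queue or bus, and CustomerUpdateComponent is an assumed name; neither comes from a real library.

```csharp
using System;
using System.Collections.Generic;

// Events are named in the past tense: they report something that has happened.
public class CustomerRegistered {
    public readonly Guid CustomerId;

    public CustomerRegistered(Guid customerId) {
        CustomerId = customerId;
    }
}

// Hypothetical in-process publisher (an assumption for this sketch).
public class EventPublisher {
    private readonly Dictionary<Type, List<Action<object>>> _subscribers =
        new Dictionary<Type, List<Action<object>>>();

    public void Subscribe<TEvent>(Action<TEvent> handler) {
        if (!_subscribers.TryGetValue(typeof(TEvent), out var handlers)) {
            handlers = new List<Action<object>>();
            _subscribers[typeof(TEvent)] = handlers;
        }
        handlers.Add(e => handler((TEvent)e));
    }

    // Notifies every subscriber registered for this event type.
    public void Publish<TEvent>(TEvent e) {
        if (_subscribers.TryGetValue(typeof(TEvent), out var handlers))
            foreach (var handler in handlers) handler(e);
    }
}

public class CustomerUpdateComponent {
    private readonly HashSet<Guid> _customers = new HashSet<Guid>();
    private readonly EventPublisher _publisher;

    public CustomerUpdateComponent(EventPublisher publisher) {
        _publisher = publisher;
    }

    // Handles the Register Customer Command: update state first, then notify.
    public void RegisterCustomer(Guid customerId) {
        if (!_customers.Add(customerId))
            throw new InvalidOperationException("Customer already registered.");
        _publisher.Publish(new CustomerRegistered(customerId));
    }
}
```

The update component does not know who has subscribed; it only knows that it must publish once its own change has succeeded.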

 

[Diagram: the UI sends a Command message to an Update Component, which publishes an Event message to an Event subscriber.]

 

What kind of communication mechanism?

This can be seen as a logical, technology-independent design pattern.

But it is often expected that Commands and Events will be transported by message queues or a message bus.

 

Segregating data stores as well as processing components

Commands post updates in an Event store.

It contains a time-ordered log of events, the state changes made by successful transactions.

The state of an entity or aggregate, at any time, can be constructed from the event log.

Rolling snapshots save you from having to replay from the beginning of time.

 

Queries are directed to a data store called a View Model or Read store.

It stores current entity state data; new states overwrite old states.

It may be de-normalised to match UI views, and cached on the Query server.

It is updated by Events published by Command operations.
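As a rough sketch of a Read store kept up to date by Events, the class below overwrites the current balance each time a credit or debit Event arrives. The event classes, AccountBalanceView and AccountReadStore are hypothetical names chosen for this illustration, and an in-memory dictionary stands in for the real data store.

```csharp
using System;
using System.Collections.Generic;

public class AccountCredited {
    public readonly Guid AccountId;
    public readonly decimal Amount;

    public AccountCredited(Guid accountId, decimal amount) {
        AccountId = accountId;
        Amount = amount;
    }
}

public class AccountDebited {
    public readonly Guid AccountId;
    public readonly decimal Amount;

    public AccountDebited(Guid accountId, decimal amount) {
        AccountId = accountId;
        Amount = amount;
    }
}

// De-normalised row shaped to match a UI view.
public class AccountBalanceView {
    public Guid AccountId;
    public decimal Balance;
}

public class AccountReadStore {
    private readonly Dictionary<Guid, AccountBalanceView> _rows =
        new Dictionary<Guid, AccountBalanceView>();

    // Event handlers: the new state overwrites the old state, no history is kept.
    public void When(AccountCredited e) { Row(e.AccountId).Balance += e.Amount; }
    public void When(AccountDebited e) { Row(e.AccountId).Balance -= e.Amount; }

    // Query side: returns current state only.
    public decimal GetBalance(Guid accountId) {
        return _rows.TryGetValue(accountId, out var row) ? row.Balance : 0m;
    }

    private AccountBalanceView Row(Guid accountId) {
        if (!_rows.TryGetValue(accountId, out var row)) {
            row = new AccountBalanceView { AccountId = accountId };
            _rows[accountId] = row;
        }
        return row;
    }
}
```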

 

What kind of data store?

Again, this can be seen as a logical, technology-independent design pattern.

Read stores and Event stores could be relational or any other kind of database.

However, an Event store could well be a message queue or a NoSQL database.

 

Coordinating components

Events (once posted) can be received and processed by any other component.

A Query component simply consumes Events.

A Command component may post Events that trigger business logic in another update component, which publishes other Events as a result, and so on.

 

[Diagram: UI, Command, Update Component 1, Event 1, Reporting Component; Update Component 2, Event 2, Update Component 3; linked by messages.]

 

Suppose a Command triggers a Customer component to post an Event which is processed by Product and Accounting components.

 

Does the Customer component need to know which other components read and process that Event?

No. But the business cares about this, and system architects should know.

 

Does the Customer component need to know if the Event has been processed by any other component?

Perhaps. The business needs to know, and the system architects have to think about why and how two or more components are coordinated by Events.

 

If the Product component posts an OrderRejected Event, the Customer must process that Event.

So, the customer component does “care” about what happens after it posts the OrderPlaced Event.

It must know how to recognise the undo Event and perform the appropriate compensating transaction.

 

The designer may coordinate the apps by designing inter-component communication, or an overarching control procedure, workflow, or saga.

A saga can be introduced to coordinate applications involved in a long-running process or logical transaction.

In our example, the saga for one logical business transaction links several physical database transactions.
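A saga of this kind might be sketched as a small state machine that reacts to Events and triggers the compensating transaction. Everything here is an assumption made for illustration: the PlaceOrderSaga and CustomerComponent classes, the state names, and the in-memory bookkeeping.

```csharp
using System;
using System.Collections.Generic;

public class OrderPlaced {
    public readonly Guid OrderId;
    public OrderPlaced(Guid orderId) { OrderId = orderId; }
}

public class OrderRejected {
    public readonly Guid OrderId;
    public OrderRejected(Guid orderId) { OrderId = orderId; }
}

public enum OrderSagaState { AwaitingProductCheck, Compensated }

// Stand-in for the Customer component's own data.
public class CustomerComponent {
    public readonly HashSet<Guid> OpenOrders = new HashSet<Guid>();

    public void RecordOrder(Guid orderId) { OpenOrders.Add(orderId); }

    // Compensating transaction: undoes the effect of an earlier, already committed one.
    public void CancelOrder(Guid orderId) { OpenOrders.Remove(orderId); }
}

// Links several physical transactions into one logical business transaction.
public class PlaceOrderSaga {
    private readonly CustomerComponent _customer;
    private readonly Dictionary<Guid, OrderSagaState> _states =
        new Dictionary<Guid, OrderSagaState>();

    public PlaceOrderSaga(CustomerComponent customer) {
        _customer = customer;
    }

    public void Handle(OrderPlaced e) {
        _states[e.OrderId] = OrderSagaState.AwaitingProductCheck;
    }

    public void Handle(OrderRejected e) {
        // Compensate only once, and only for orders this saga is tracking.
        if (_states.TryGetValue(e.OrderId, out var state)
                && state == OrderSagaState.AwaitingProductCheck) {
            _customer.CancelOrder(e.OrderId);
            _states[e.OrderId] = OrderSagaState.Compensated;
        }
    }

    public OrderSagaState StateOf(Guid orderId) {
        return _states[orderId];
    }
}
```

A fuller saga would also handle the success path and timeouts; those are left out to keep the compensating step visible.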

 

Smaller simpler components – more complex coordination

The smaller and simpler the queries and update transactions, the more complex the coordination.

Complexity appears in coordinating physical transactions to implement the logical transaction.

And in adding compensating transactions to achieve eventual consistency at the end of the logical transaction.

 

Inconsistency between systems makes work for developers.

Remember what Google say on eventual consistency:

“Designing applications to cope with concurrency anomalies in their data is very error prone, time-consuming, and ultimately not worth the performance gains.”

“... developers spend a significant fraction of their time building extremely complex and error-prone mechanisms to cope with eventual consistency and handle data that may be out of date.

We think this is an unacceptable burden to place on developers and that consistency problems should be solved at the database level.”

“F1: A Distributed SQL Database That Scales”, Proceedings of the VLDB Endowment, Vol. 6, No. 11, 2013

 

CQRS is a pattern for very high availability, very high throughput - huge concurrency.  Surely, most apps are not like that?

And if one data store can support both Command and Query processing, why not?

Event sourcing v. database transaction processing

Entities are things that persist, with continuity of identity (e.g. a bank account with a current balance).

Events are things that happen, which affect persistent entities (e.g. credit and debit transactions).

 

However, the distinction is blurred, since events can be identified and remembered, just as entities are.

And the current state data of an entity can be seen as a side-effect of an event stream that started when the entity was born.

Event sourcing

The following is partly edited from Martin Fowler’s web site: http://martinfowler.com/eaaDev/EventSourcing.html

The basic idea of event sourcing is that every change to the state of an application is recorded in an event object.

Event objects are stored in the sequence they were applied - for the lifetime of the application state.

 

The current state of entities (things that persist) can be recovered from the event log (rather than from a conventional database).

In the Model-View-Controller pattern, the Views may retrieve data from the Event log using Query messages.

 

The key to event sourcing is that all changes to domain objects are initiated by event objects.

A number of facilities can be built on top of the event log:

·         Complete Rebuild: discard the application state completely and rebuild it by re-running the events from the event log on an empty application.

·         Temporal Query: determine the application state at any point in time. Notionally we do this by starting with a blank state and rerunning the events up to a particular time or event.

·         Event Replay: If a past event was incorrect or missing, reverse to before that event and then replay the new event and later events.
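The first two facilities can be sketched for a bank account whose events are signed amounts. This is a simplified illustration: the AccountEvent and AccountReplay names are assumptions, and the log is an in-memory sequence that is assumed to be in the order the events were applied.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class AccountEvent {
    public readonly DateTime OccurredAt;
    public readonly decimal Amount;   // positive = credit, negative = debit

    public AccountEvent(DateTime occurredAt, decimal amount) {
        OccurredAt = occurredAt;
        Amount = amount;
    }
}

public static class AccountReplay {
    // Complete Rebuild: start from a blank state and re-run every event in log order.
    public static decimal Rebuild(IEnumerable<AccountEvent> log) {
        var balance = 0m;
        foreach (var e in log) balance += e.Amount;
        return balance;
    }

    // Temporal Query: re-run only the events up to a given point in time.
    public static decimal BalanceAt(IEnumerable<AccountEvent> log, DateTime pointInTime) {
        return Rebuild(log.Where(e => e.OccurredAt <= pointInTime));
    }
}
```

Event Replay (correcting a past event) would use the same mechanism: rebuild up to the faulty event, then apply the corrected event and all later ones.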

 

What about the wider business?

Event replay works for one application in isolation, but what if this application sends/receives data to/from other applications?

If you replay old events in this application, you don’t want to update other applications a second time, or collect data that has changed since you first collected it.

So, to replay events may involve disabling the sending of events, and storing all data collected previously, when events happened in the past.

 

Database transaction processing

A conventional database schema can be extended to store the events (e.g. debits and credits) that affect the persistent entities (e.g. bank accounts).

And note that a database management system records a transaction log.

A transaction log is not the same as an event log, but it does usually support the following operations:

·         Recovery of individual transactions.

·         Rolling a restored database forward to a given point.

·         Transaction replication, database mirroring, and log shipping.

Notes on integrating the patterns above

There are relationships between the three patterns above.

But note you don’t need Domain-Driven Design or CQRS to separate database transactions from database queries, to publish Events, or to use Event Sourcing.

CQRS and update/reporting data store separation

These work together because the CQRS pattern separates Command and Query application components.

This suits separation of the update and reporting data stores, if required.

Domain-Driven Design and CQRS

These work together because both separate the processing of update Commands and Queries.

Queries can be processed in the simplest and most efficient way, say by executing stored procedures on the data store through a thin API layer.

Commands trigger update transactions which act on the data of a Domain Model aggregate, retrieved from the same or a different data store.

A Command handler passes a Command to a Command operation on the root entity of an aggregate.

That root entity operation validates the Command and applies it to data within the aggregate.

 

Commands/transactions and aggregates do not align themselves by accident.

Aggregates are scoped with update transactions in mind, so that most transactions access data contained inside a single aggregate.

Domain-Driven Design, CQRS and Event Sourcing

If you combine DDD, CQRS and Event Sourcing, then:

·         the data store for Queries may be called the Read store

·         the data store for Commands is an Event store.

 

Event store is a logical name here – it implies a particular kind of logical data model.

The data storage technology is whatever you choose, be it relational or non-relational.

 

Commands are applied to Domain Model data, which must be retrieved from the Event store.

After a Command (say, Debit Account) has been applied to a Domain Model aggregate, the root entity saves one or more Events (say, Account Debited) in the Event store.

(Although it's usually not the aggregate but a repository that is responsible for serializing and de-serializing the aggregate to and from the underlying data stores that 'saves the events'.)

 

Before a Command can be applied to an aggregate entity in a Domain Model, there are two things to do.

 

First, assemble the data of the aggregate entity on which the Command will act.

How? There are at least three options.

 

1.      Hold current state data in the Event store

This turns the Event store into a conventional database (with tables for Accounts, Debits and Credits).

 

2.      Send a Query to the Read store

This breaks the segregation principle (and there is a risk that Event and Read stores are inconsistent).

 

3.      Build the current system state (e.g. current account balance) by replaying Events (e.g. credits and debits).

This may sound impractical, but in code it's not really difficult to implement.

The basic idea is that a fresh instance of the aggregate is created in memory.

All events of that aggregate are retrieved from the event store, deserialized, and then re-applied to the aggregate to build its current state in memory.

Then, a method that contains the logic of the Command is executed on the aggregate, such that business rules are validated and changes to the aggregate (i.e. by raising new events) are made.

Then, the new events are appended to the event store.
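The four steps of option 3 can be sketched as follows. This is a sketch under stated assumptions, not a definitive implementation: InMemoryEventStore, Account and DebitAccountHandler are hypothetical names, the store is a dictionary in memory, and serialization is left out.

```csharp
using System;
using System.Collections.Generic;

public abstract class AccountEvent { }

public class AccountCredited : AccountEvent {
    public readonly decimal Amount;
    public AccountCredited(decimal amount) { Amount = amount; }
}

public class AccountDebited : AccountEvent {
    public readonly decimal Amount;
    public AccountDebited(decimal amount) { Amount = amount; }
}

public class Account {
    private readonly List<AccountEvent> _newEvents = new List<AccountEvent>();

    public decimal Balance { get; private set; }
    public IReadOnlyList<AccountEvent> NewEvents { get { return _newEvents; } }

    // Re-applies an event to state; business rules are not checked during replay.
    public void Apply(AccountEvent e) {
        if (e is AccountCredited credited) Balance += credited.Amount;
        else if (e is AccountDebited debited) Balance -= debited.Amount;
    }

    // Command logic: validate against current state, then raise a new event.
    public void Debit(decimal amount) {
        if (amount > Balance) throw new InvalidOperationException("Insufficient funds.");
        var e = new AccountDebited(amount);
        Apply(e);
        _newEvents.Add(e);
    }
}

// Hypothetical in-memory event store, one event stream per aggregate ID.
public class InMemoryEventStore {
    private readonly Dictionary<Guid, List<AccountEvent>> _streams =
        new Dictionary<Guid, List<AccountEvent>>();

    public IReadOnlyList<AccountEvent> Load(Guid id) {
        return _streams.TryGetValue(id, out var stream) ? stream : new List<AccountEvent>();
    }

    public void Append(Guid id, IEnumerable<AccountEvent> events) {
        if (!_streams.TryGetValue(id, out var stream)) {
            stream = new List<AccountEvent>();
            _streams[id] = stream;
        }
        stream.AddRange(events);
    }
}

public class DebitAccountHandler {
    private readonly InMemoryEventStore _store;

    public DebitAccountHandler(InMemoryEventStore store) {
        _store = store;
    }

    public void Handle(Guid accountId, decimal amount) {
        var account = new Account();                                  // 1. fresh instance
        foreach (var e in _store.Load(accountId)) account.Apply(e);   // 2. replay history
        account.Debit(amount);                                        // 3. execute the Command
        _store.Append(accountId, account.NewEvents);                  // 4. append new events
    }
}
```

If the Debit is rejected, the exception leaves the handler before anything is appended, so the event stream is unchanged.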

 

To optimize option 3, you could store a snapshot of the aggregate's state, say once every 20 or 50 events.

Then you only have to retrieve and deserialize the latest snapshot of the aggregate and the events recorded since (at most 20 to 50) to rebuild its current state in memory.

This is basically a hybrid of options 1 and 3.
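A minimal sketch of the snapshot idea, again for an account whose events are signed amounts. The Snapshot and SnapshotReplay names and the interval of 20 are assumptions. For brevity the whole stream is passed in and the earlier events are skipped; a real store would load only the events after the snapshot.

```csharp
using System;
using System.Collections.Generic;

public class Snapshot {
    public readonly int Version;       // number of events already folded into Balance
    public readonly decimal Balance;

    public Snapshot(int version, decimal balance) {
        Version = version;
        Balance = balance;
    }
}

public static class SnapshotReplay {
    public const int SnapshotInterval = 20;   // assumption: the text suggests 20 to 50

    // Rebuilds the balance from the latest snapshot (or from zero if there is none)
    // plus only the events recorded after that snapshot.
    public static decimal Rebuild(Snapshot snapshot, IReadOnlyList<decimal> amounts) {
        var balance = snapshot == null ? 0m : snapshot.Balance;
        var start = snapshot == null ? 0 : snapshot.Version;
        for (var i = start; i < amounts.Count; i++) balance += amounts[i];
        return balance;
    }

    // Returns a new snapshot when the stream length reaches a multiple of the
    // interval; otherwise returns the snapshot it was given.
    public static Snapshot MaybeTakeSnapshot(Snapshot latest, IReadOnlyList<decimal> amounts) {
        if (amounts.Count == 0 || amounts.Count % SnapshotInterval != 0) return latest;
        return new Snapshot(amounts.Count, Rebuild(latest, amounts));
    }
}
```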

 

Second, test any preconditions for the Command.

A Command (say Debit) usually has to check the system data is in a valid state for that Command.

Some preconditions test attribute values (e.g. does the Account hold enough money for the Debit?).

Others are known as referential integrity tests (e.g. has the Account been deleted?).

 

Again there are at least three validation options:

1.      No validation (which results in data that is inconsistent with rules, though eventual consistency may be achievable later)

2.      Pre-command validation (which leaves a small, perhaps acceptable, risk of inconsistency)

3.      In-command validation (which minimises if not eliminates inconsistency)

 

Pre-command validation tests could be done before or when a Debit Command is posted by a UI Component.

But then, other Commands could change the Account state before this Debit is processed.

And applying the defensive design principle, the tests should be made again on the server side.

 

In general, at least some in-command validation is needed.

So in general, testing the state of the system is an integral part of processing a Command.

 

Remember

You don’t need Domain-Driven Design or CQRS to separate database transactions from database queries, to publish Events, or to use Event Sourcing.

Transaction scripts can equally well publish Events (for others to consume) and log Events (for subsequent Query and replay).

Appendix 1: some (reader supplied) code

 

To prevent broken or illegal commands from being processed, you typically validate a Command before you process it (that is, on the server side, where the Commands are received).

Validation means that you perform simple checks on the content of a Command, like checking whether or not all required data is present and whether or not this data conforms to simple constraints (similar to basic integrity checks on a database column).

These checks should not require any sort of context or other kind of external data to run - they should be executable everywhere.

This allows clients (UI) to validate a message before it is sent to the server and to tell the user at an early stage that, for example, a form with required fields has not been filled in completely. It's also a performance improvement, since it prevents invalid data from being sent to the server.

This does, however, mean that valid commands will be validated for correctness both in the client and on the server, but since validation-checks are usually very quick and simple, this rarely is an issue.
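A sketch of such context-free checks, for a hypothetical order Command carrying a customer ID, an order ID and a list of product IDs. The PlaceOrderCommandValidator name and the error messages are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PlaceOrderCommandValidator {
    // Context-free checks only: no database and no repository are needed,
    // so the same code can run in the client and again on the server.
    public static IList<string> Validate(Guid customerId, Guid orderId, IList<Guid> productIds) {
        var errors = new List<string>();
        if (customerId == Guid.Empty) errors.Add("CustomerId is required.");
        if (orderId == Guid.Empty) errors.Add("OrderId is required.");
        if (productIds == null || productIds.Count == 0)
            errors.Add("At least one product is required.");
        return errors;
    }
}
```

An empty list means the Command passed this first level of validation; it says nothing yet about whether the system state allows the Command.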

 

The second level of 'validation' concerns the system's state, which is typically maintained and managed by the aggregates we talked about earlier.

This kind of validation can only be executed by simply processing the command and seeing where you end up, asserting data and business rules as you go.

For example, when a new order is placed by the following PlaceOrderCommand:

 

public class PlaceOrderCommand {

    public readonly Guid CustomerId;

    public readonly Guid OrderId;

    public readonly List<Guid> ProductIds;

}

 

then the first thing that is acquired is the Customer-aggregate using the specified ID (constructor omitted for simplicity):

 

public class PlaceOrderCommandHandler {

    private readonly ICustomerRepository _customers;

 

    public void Handle(PlaceOrderCommand command) {

        var customer = _customers.GetCustomerById(command.CustomerId);

    }

}

 

If, for any reason, no customer with the specified ID exists, the repository will throw an Exception (implicitly stopping the Command from being executed, and causing the current transaction to be rolled back).

Otherwise, it will return the requested Customer-object.

This Customer-aggregate can then be used to place (create) a new Order-aggregate with the specified products selected:

 

public class PlaceOrderCommandHandler {

     private readonly ICustomerRepository _customers;

     public void Handle(PlaceOrderCommand command) {

         var customer = _customers.GetCustomerById(command.CustomerId);

         var order = customer.PlaceOrder(command.OrderId, command.ProductIds);

     }

}

 

The PlaceOrder-method could now check, for example, if this Customer is allowed to place new orders.

There could be a Business Rule, for example, that says that Customers can only place orders when their credit has been checked and asserted.

In the case this Customer is allowed to place new orders, the Customer-object will create a new Order-aggregate using the specified ID and Products.

Finally, this order could be saved into its own repository:

 

public class PlaceOrderCommandHandler {

     private readonly ICustomerRepository _customers;

     private readonly IOrderRepository _orders;

     public void Handle(PlaceOrderCommand command) {

         var customer = _customers.GetCustomerById(command.CustomerId);

         var order = customer.PlaceOrder(command.OrderId, command.ProductIds);

         _orders.Add(order);

     }

}

 

After the code completes, the OrderRepository-implementation will flush its changes to the database/event store (containing the new order) and the transaction will commit.

Again, it is not strictly necessary to use an event store for this.

In fact, event stores are rarely a feasible option once all non-functional requirements are considered: they offer great advantages but also introduce extra complexity into the system (code, maintenance, and fewer query capabilities, hence the need for a read model or other queryable data store).

 

Appendix 2: disputable (reader supplied) observations on Command v Event

Every message is either a Query, a Command, or an Event - to every sender and receiver.

A component can distinguish between message-types by name (PlaceOrderCommand, OrderPlacedEvent, GetOrderDataRequest) and then handle them accordingly.

 

On a technical level, messages are nothing more than simple data-structures that are typically serialized and deserialized to get them across a network.

The type of the message (Command, Event, request or reply) is deduced from its name.

In OO-languages such as C# and Java, the name of a message is identical to its runtime type (i.e., message 'PlaceOrderCommand' maps to a PlaceOrderCommand class).

The code inside the server-component will then automatically 'know' how to process a message of type PlaceOrderCommand.

In fact, in .NET and Java many frameworks (WCF, MVC, NServiceBus) use reflection on message types to route the message to the appropriate handlers (a method that accepts an instance of that specific message).
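The idea of routing by runtime type can be illustrated with a small dictionary-based dispatcher. This is a simplified sketch, not how any of the frameworks named above is implemented: the MessageRouter class is hypothetical, and the two message classes are left empty.

```csharp
using System;
using System.Collections.Generic;

public class PlaceOrderCommand { }
public class OrderPlacedEvent { }

public class MessageRouter {
    private readonly Dictionary<Type, Action<object>> _handlers =
        new Dictionary<Type, Action<object>>();

    // Registers the method that accepts an instance of one specific message type.
    public void Register<TMessage>(Action<TMessage> handler) {
        _handlers[typeof(TMessage)] = message => handler((TMessage)message);
    }

    // Looks up the handler by the message's runtime type and invokes it.
    public void Dispatch(object message) {
        if (!_handlers.TryGetValue(message.GetType(), out var handler))
            throw new InvalidOperationException("No handler for " + message.GetType().Name);
        handler(message);
    }
}
```

The router treats every message alike; whether a message is a Command or an Event is visible only in its type name and in what its handler does with it.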

 

Technically speaking, a component only distinguishes between one-way messages (no return value) and two-way messages (with a return value).

Since Commands and Events are both one-way messages, the receiving component doesn't have to be aware which is which.

However, functionally speaking, they are handled differently:

·         Commands are requests that can be denied based on certain criteria (validation, security, business rules), whereas

·         Events can only be discarded or ignored.

 

That makes a big difference in terms of design.

The publisher of an Event does not care if other components react to the Event (it doesn't depend on it).

So basically, no one has to explicitly report whether it successfully handled an Event or not.

 

Footnote: Creative Commons Attribution-No Derivative Works Licence 2.0           24/04/2015 10:24

Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited: http://avancier.website” before the start and include this footnote at the end.

No Derivative Works: You may copy, distribute, display only complete and verbatim copies of this page, not derivative works based upon it.

For more information about the licence, see  http://creativecommons.org