My journey on DDD, CQRS and Event Sourcing

This post is not about what DDD, CQRS and Event Sourcing are, but rather how I’ve been using them.

Over the last year I’ve been developing a collaborative social app (web and mobile) in my spare time, where you can have user groups with activities, polls, discussions, feeds, and more.

As I’m targeting a mostly mobile audience, I wanted to support disconnected clients and let offline users work on cached data. Once they’re back online, I can synchronize their changes. This is easier to accomplish with task-based UIs, where user actions map to commands in CQRS. It’s also clear that one user’s action doesn’t need to be immediately visible to all members of the group, since they’re very likely offline and will only see the changes later. However, I wanted to be able to track and list the individual changes made by other users, not just show the latest version of the data. That gives a better feeling of collaboration even though clients can be disconnected most of the time.
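To make the offline idea concrete, here is a minimal sketch in Python (not the app’s actual code) of a client-side queue that buffers commands while disconnected and replays them in order once a connection is available. The names `Command`, `OfflineCommandQueue` and the `send` callback are hypothetical, chosen only for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Command:
    """A user intent captured by a task-based UI (hypothetical shape)."""
    name: str
    payload: dict
    # A client-generated id lets the backend recognise a resent command.
    command_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class OfflineCommandQueue:
    """Buffers commands while offline and replays them in order on reconnect."""

    def __init__(self, send: Callable[[Command], None]) -> None:
        self._send = send              # assumed transport, e.g. an HTTP call
        self._pending: List[Command] = []
        self.online = False

    def submit(self, command: Command) -> None:
        if self.online:
            self._send(command)
        else:
            self._pending.append(command)

    def go_online(self) -> None:
        # Remove a command only after it was sent, so a failed send
        # leaves it at the head of the queue for the next attempt.
        while self._pending:
            self._send(self._pending[0])
            self._pending.pop(0)
        self.online = True
```

Because each command carries its own id, a command that is sent twice after a flaky reconnect can be detected on the server side, which ties in with the idempotency requirement discussed further down.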

This app could potentially grow to millions of users and I wanted to keep it free of ads, so I needed the backend to be fast, scalable, cloud hosted and as cheap as possible. I’m currently using Azure, but the original plan was to use AWS. On Azure I can implement my messaging infrastructure on top of Azure Queues and use Table Storage for my EventStore. On AWS I could use SQS for messaging and DynamoDB for my EventStore. The key point is that, by using the right set of abstractions, my architecture doesn’t get tied to any particular database or cloud service.
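A minimal sketch of what such an abstraction might look like, written in Python for brevity. The `EventStore` interface, the `expected_version` concurrency check and the `create_activity` function are assumptions made for this example, not the app’s or any cloud SDK’s actual API. The point is only that application code depends on the interface, so a Table Storage or DynamoDB implementation could be swapped in behind it.

```python
from typing import Dict, List, Protocol


class ConcurrencyError(Exception):
    """Raised when a stream changed since the caller last read it."""


class EventStore(Protocol):
    """Hypothetical storage-agnostic interface the application codes against."""

    def append(self, stream_id: str, expected_version: int,
               events: List[dict]) -> None: ...

    def load(self, stream_id: str) -> List[dict]: ...


class InMemoryEventStore:
    """In-memory implementation, handy for end-to-end tests."""

    def __init__(self) -> None:
        self._streams: Dict[str, List[dict]] = {}

    def append(self, stream_id: str, expected_version: int,
               events: List[dict]) -> None:
        stream = self._streams.setdefault(stream_id, [])
        # Optimistic concurrency: reject writes based on a stale read.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected version {expected_version}, "
                f"found {len(stream)}")
        stream.extend(events)

    def load(self, stream_id: str) -> List[dict]:
        return list(self._streams.get(stream_id, []))


def create_activity(store: EventStore, activity_id: str, title: str) -> None:
    """Application code sees only the EventStore interface."""
    store.append(activity_id, 0, [{"type": "ActivityCreated", "title": title}])
```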

Below is an overview of my current architecture. Non-blue boxes are components that can be hosted in separate processes / machines, but there’s really no obligation for that. My current setup is one worker role for the API and another worker role for command and event handlers.

Backend architecture overview

The CQRS pattern fits my architecture very nicely, as it allows me to scale my write and read models independently. Combined with messaging and asynchronous operations, it also lets me temporally decouple updates to the write model from updates to the read model. Yes, this introduces eventual consistency, but users don’t even notice, and I can send them push notifications whenever the read model has been updated so they can get a newer version of the data.

On my write model I’m following Domain Driven Design. Instead of having one global data model that supports all my business rules and would need to scale as a whole, I can decompose it into separate aggregates, each containing only the state required to enforce consistency and transactional behavior for a small set of business requirements. For instance, managing activities within a group has nothing to do with managing polls or discussions. So I can define one aggregate to enforce the business rules for managing activities and a different one to manage polls, each with its own responsibility. Although I’m using Event Sourcing on both of them, I could potentially store them in different databases and scale them differently. They integrate with the read model by publishing state change events.

This not only gives me a lot of freedom of choice and flexibility, but also a lot of robustness, because DDD and CQRS create so much decoupling that I’m very unlikely to break any functionality when I refactor things. And I did refactor a lot, especially when I realized that with reliable messaging (where a message can be delivered more than once) all my command and event handlers needed to be idempotent, otherwise I’d get very inconsistent results. I had to completely redesign my aggregates to support this. Unit testing is a must and very easy to do with DDD and CQRS, but having integration tests really helps here, so you can test the behavior from the moment commands are processed until the read models are finally updated.
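As an illustration of idempotent command handling on an event-sourced aggregate, here is a minimal Python sketch. It shows one common approach, recording the id of each handled command in the emitted events so that a redelivered command produces no new events. It is an assumption for this example, not necessarily how my aggregates ended up, and the names `Poll`, `OptionVoted` and `handle_vote` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class OptionVoted:
    command_id: str   # id of the command that caused this event
    user_id: str
    option: str


@dataclass
class Poll:
    """Aggregate whose state is rebuilt purely from its past events."""
    votes: Dict[str, str] = field(default_factory=dict)
    handled: Set[str] = field(default_factory=set)

    def apply(self, event: OptionVoted) -> None:
        self.votes[event.user_id] = event.option
        self.handled.add(event.command_id)

    def vote(self, command_id: str, user_id: str, option: str) -> List[OptionVoted]:
        # A redelivered command has already left its mark in the history,
        # so it yields no new events.
        if command_id in self.handled:
            return []
        return [OptionVoted(command_id, user_id, option)]


def handle_vote(history: List[OptionVoted], command_id: str,
                user_id: str, option: str) -> List[OptionVoted]:
    """Command handler: replay the history, decide, append the new events."""
    poll = Poll()
    for event in history:
        poll.apply(event)
    new_events = poll.vote(command_id, user_id, option)
    history.extend(new_events)   # stands in for appending to the EventStore
    return new_events
```

Handling the same command twice leaves the event history unchanged the second time, which is exactly the property needed when the queue delivers a message more than once.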

Besides the advantages of using an EventStore for the write model, it also has advantages for the read model. For instance, in my app it’s very unlikely that past activities (months old) within a user group are accessed, so I can actually dispose of their read models and rebuild them from the stored events if they’re accessed again. This means that, instead of keeping potentially TBs of rarely accessed data, I might only need to keep a few GBs. In fact, I don’t even need to persist my read models at all, and I could keep them in memory using Redis for great performance.
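The dispose-and-rebuild idea can be sketched in a few lines of Python. This is a simplified illustration under assumed names: `ActivityReadModels`, the event dictionaries and the plain dict used as a cache are all hypothetical, and in a real setup the cache could be Redis and the events would come from the EventStore.

```python
from typing import Dict, List


class ActivityReadModels:
    """Read models that can be thrown away and rebuilt from events on demand."""

    def __init__(self, events_by_activity: Dict[str, List[dict]]) -> None:
        self._events = events_by_activity   # stand-in for the EventStore
        self._cache: Dict[str, dict] = {}   # stand-in for Redis
        self.rebuilds = 0                   # counter, only for demonstration

    @staticmethod
    def project(events: List[dict]) -> dict:
        """Fold an activity's events into the view shown to users."""
        view = {"title": None, "participants": []}
        for event in events:
            if event["type"] == "ActivityCreated":
                view["title"] = event["title"]
            elif event["type"] == "ParticipantJoined":
                view["participants"].append(event["user"])
        return view

    def get(self, activity_id: str) -> dict:
        if activity_id not in self._cache:
            # Cache miss: replay the stored events to rebuild the view.
            events = self._events.get(activity_id, [])
            self._cache[activity_id] = self.project(events)
            self.rebuilds += 1
        return self._cache[activity_id]

    def evict(self, activity_id: str) -> None:
        """Dispose of a read model that is no longer being accessed."""
        self._cache.pop(activity_id, None)
```

The trade-off is a slower first access after eviction, since the events have to be replayed, in exchange for not storing views that nobody reads.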

In order to have a common programming model for my architecture, I started working on the infrastructure by creating the set of abstractions that I wanted to use in my code. I started with in-memory implementations, so I could run end-to-end tests sooner, and then moved on to real implementations using Azure Storage (Queues, Tables and Blobs), Redis, and so on. Soon I realized I could reuse these infrastructure components in other projects, so I put a little more work into them and recently published the result on GitHub. It’s called NDomain and it’s a fully functional DDD/CQRS/ES framework.

Building NDomain allowed me to go through the internals and get better insight into these concepts, but reading some excellent books and blogs helped me a lot, so I’m sharing them here:

If you’re starting your DDD, CQRS, ES journey now, I hope my experiences on the subject and NDomain also push you further to create better software. It’s been a lot of fun for me and my journey is far from complete!

